GitHub - cfoos/HApi-hole

High Availability pi-hole

I wanted a pi-hole that could survive nodes dying while keeping statistics and the same IP but could not find any tutorials on how to do that, so I made HApi-hole pronounced Happy Hole.

To set this up you will need 3 Raspberry pi 4's with 8G RAM and 2 high speed drives per pi. 4G RAM may be doable, but you are going to have issues with usage and need to know how to optimize your system to use as little RAM as possible. The OS drive has to be a fast drive that can handle a lot of writes, but the second drive on each pi could be a normal HDD.

I used:

I had to solder a jumper wire between the 5v gpio and usb since using two drives would draw too much current for the usb controller and cause drives to be dropped.

Image the 3 smaller SSD's with Ubuntu 20.10 64bit for raspberry pi and setup your pi's to boot from USB. You may need to use raspberry OS to update the firmware before USB boot works.

Connect the 3 OS drives into the top usb3.0 ports and the 3 blank ones onto the bottom usb3.0 ports on your pi's.

Make sure you have static IPs. Most these commands are ran on pi01

Configure /etc/hosts on the three pi's so they can access eachother via hostname.

192.168.2.101 pi01 pi01.kube.local
192.168.2.102 pi02 pi02.kube.local
192.168.2.103 pi03 pi03.kube.local

Set the hostnames by running the following on the corrosponding pi's

sudo hostnamectl set-hostname pi01.kube.local
sudo hostnamectl set-hostname pi02.kube.local
sudo hostnamectl set-hostname pi03.kube.local

make needed cgroup changes for k3s

sed -i '$ s/$/ cgroup_enable=cpuset cgroup_memory=1 cgroup_enable=memory/' /boot/firmware/cmdline.txt

I'm not using wifi or bt on mine, so I disabled those to reduce power usage and set gpu memory usage to 16m

echo -e "dtoverlay=disable-wifi\ndtoverlay=disable-bt\ngpu_mem=16" >> /boot/firmware/config.txt

I overclocked since this is HA and if one pi dies, it is not a big deal, do this at your own risk

echo -e "over_voltage=6\narm_freq=2000\ngpu_freq=750" >> /boot/firmware/config.txt

Now we are going to update and do some installs.

sudo apt update
sudo apt upgrade -y
# if you are adding worker only nodes install only these packages.
sudo apt install libraspberrypi-bin rng-tools haveged net-tools sysstat lm-sensors smartmontools atop iftop -y
# master nodes get those as well as these packages
sudo apt install etcd docker.io ntp snapd ceph-common lvm2 cephadm build-essential golang git kubetail linux-modules-extra-raspi -y

Now we are going to get etcd working on the system. Not needed on 22.04, might still be needed on 20.04.

mkdir /etc/etcd
touch /etc/etcd/etcd.conf
echo "ETCD_UNSUPPORTED_ARCH=arm64" > /etc/etcd/etcd.conf
sed -i '/\[Service]/a EnvironmentFile=/etc/etcd/etcd.conf' /lib/systemd/system/etcd.service

I am using ceph for the HA storage cluster but not using rook so I can keep it seperate and add nodes for only ceph with as little overhead as possible in the future.

mkdir -p /etc/ceph
# when adding the user, use a password you can remember, you will need it later
adduser cephuser

ceph needs no password sudo

usermod -aG sudo cephuser
touch /etc/sudoers.d/cephuser
echo "cephuser ALL=(ALL) NOPASSWD:ALL" > /etc/sudoers.d/cephuser

Now to restart. I like to do an update and upgrade just to be sure I did not forget so I do not have to restart again.

apt update; apt upgrade -y;reboot

Once the system is back up we are going to disable the etcd service so k3s can run its own.

systemctl disable etcd
systemctl stop etcd

Now to get ceph up and running. Replace --mon-ip with the IP you set for pi01

cephadm bootstrap --skip-monitoring-stack --allow-fqdn-hostname --ssh-user cephuser --mon-ip 192.168.2.101

This may take a while, when it is done you will get a login url and admin user/pass. Log in and update the password.

Make sure a few settings are configured

ceph config set mgr mgr/cephadm/manage_etc_ceph_ceph_conf true

Copy the ceph SSH key to the other nodes. This is where you need to remember that password.

ssh-copy-id -f -i /etc/ceph/ceph.pub cephuser@pi02.kube.local
ssh-copy-id -f -i /etc/ceph/ceph.pub cephuser@pi03.kube.local

Add your other nodes

ceph orch host add pi02.kube.local
ceph orch host add pi03.kube.local

Now to set the manager interface to run on all the nodes so you have access. You could pin this and I may see about running it separately in k3s to reduce resource usage later.

ceph orch apply mgr --placement="pi01.kube.local pi02.kube.local pi03.kube.local"

Tell ceph to run the min of 3 monitors on your 3 master nodes. If you add master nodes in the future you can add them here. This is mainly so I can add OSD only nodes in the future and not have to worry about monitor services being started on it allowing me to use 4G RAM or lower nodes depending on the size of the disks.

ceph orch apply mon --placement="3 pi01.kube.local pi02.kube.local pi03.kube.local"

Wait a while for all the hosts to be added. You can see this in the interface you logged into earlier.

Tell ceph to use any available free drives. You can specify the disk if you do not want to do it this way.

ceph orch apply osd --all-available-devices

If you do not see OSD's added in the interface and your cluster go to a healthy state, check if devices are available.

ceph orch device ls

If none of the storage drives are available for use, they may have a file system on them. Ceph requires them to no have any file systems. You can check with

lsblk

Make sure your OS is sda, and if your sdb has a blank fs you can clear it with

ceph orch device zap pi01.kube.local /dev/sdb --force

Do that on all nodes. I find applying osd to all available again speeds up adding the drives.

Once ceph is running and drives added, we can install k3s. Run the following on the first master node.

curl -sfL http://get.k3s.io | sh -s - server --disable servicelb,traefik --cluster-init

Check if the node is in the Ready state.

sudo kubectl get nodes

Once the first master node is in the ready state, run this and run the output on additional master nodes waiting for each one to get to the ready state before doing the next one.

echo "curl -sfL http://get.k3s.io |K3S_TOKEN=`cat /var/lib/rancher/k3s/server/node-token` sh -s - server --disable servicelb,traefik --server https://`hostname -i`:6443"

If you have extra pi's you want to add as k3s worker nodes run the follwoing on the first master and the output on each worker node.

echo "curl -sfL http://get.k3s.io | K3S_URL=https://`hostname -i`:6443 K3S_TOKEN=`cat /var/lib/rancher/k3s/server/node-token` sh -"

I go the extra step of labeling worker nodes.

kubectl label node pi04.kube.local node-role.kubernetes.io/worker=''

Now I want some monitoring on the cluster, and I want my HA storage used so let's set up ceph and k3s access to it.

This will create the cephfs volume on ceph for you.

ceph fs volume create cephfs

Now we need a config map, replace the IPs with the ones you have set for your ceph mon nodes. This should be pi01,02,03 if you did not change anything to this point. Also add the cluster id in the quotes.

cat <<EOF > csi-config-map.yaml
---
apiVersion: v1
kind: ConfigMap
data:
  config.json: |-
    [
      {
        "clusterID": "ID_HERE",
        "monitors": [
          "192.168.2.101:6789",
          "192.168.2.102:6789",
          "192.168.2.103:6789"
        ]
      }
    ]
metadata:
  name: ceph-csi-config
EOF

Now apply that config

kubectl apply -f csi-config-map.yaml

Recent versions of ceph-csi also require an additional ConfigMap object to define Key Management Service (KMS) provider details.

cat <<EOF > csi-kms-config-map.yaml
---
apiVersion: v1
kind: ConfigMap
data:
  config.json: |-
    {}
metadata:
  name: ceph-csi-encryption-kms-config
EOF

Now apply that config

kubectl apply -f csi-kms-config-map.yaml

Recent versions of ceph-csi also require yet another ConfigMap object to define Ceph configuration to add to ceph.conf file inside CSI containers

cat <<EOF > ceph-config-map.yaml
---
apiVersion: v1
kind: ConfigMap
data:
  ceph.conf: |
    [global]
    auth_cluster_required = cephx
    auth_service_required = cephx
    auth_client_required = cephx
  # keyring is a required key and its value should be empty
  keyring: |
metadata:
  name: ceph-config
EOF

Now apply that config

kubectl apply -f ceph-config-map.yaml

cat <<EOF > kms-config.yaml
---
apiVersion: v1
kind: ConfigMap
data:
  config.json: |-
    {
      "vault-tokens-test": {
          "encryptionKMSType": "vaulttokens",
          "vaultAddress": "http://vault.default.svc.cluster.local:8200",
          "vaultBackendPath": "secret/",
          "vaultTLSServerName": "vault.default.svc.cluster.local",
          "vaultCAVerify": "false",
          "tenantConfigName": "ceph-csi-kms-config",
          "tenantTokenName": "ceph-csi-kms-token",
          "tenants": {
              "my-app": {
                  "vaultAddress": "https://vault.example.com",
                  "vaultCAVerify": "true"
              },
              "an-other-app": {
                  "tenantTokenName": "storage-encryption-token"
              }
          }
       }
    }
metadata:
  name: ceph-csi-encryption-kms-config
EOF

kubectl apply -f kms-config.yaml

Now k3s needs a few things to be able to access ceph

kubectl apply -f https://raw.githubusercontent.com/ceph/ceph-csi/master/deploy/cephfs/kubernetes/csi-cephfsplugin-provisioner.yaml
kubectl apply -f https://raw.githubusercontent.com/ceph/ceph-csi/master/deploy/cephfs/kubernetes/csi-cephfsplugin.yaml
kubectl apply -f https://raw.githubusercontent.com/ceph/ceph-csi/master/deploy/cephfs/kubernetes/csi-nodeplugin-psp.yaml
kubectl apply -f https://raw.githubusercontent.com/ceph/ceph-csi/master/deploy/cephfs/kubernetes/csi-nodeplugin-rbac.yaml
kubectl apply -f https://raw.githubusercontent.com/ceph/ceph-csi/master/deploy/cephfs/kubernetes/csi-provisioner-psp.yaml
kubectl apply -f https://raw.githubusercontent.com/ceph/ceph-csi/master/deploy/cephfs/kubernetes/csi-provisioner-rbac.yaml

We need a secret to access this. You should set up a user with access to cephfs, but I am just going to use admin for both admin and user. You can get the admin key from

cat /etc/ceph/ceph.client.admin.keyring

cat <<EOF > csi-cephfs-secret.yaml
---
apiVersion: v1
kind: Secret
metadata:
  name: csi-cephfs-secret
  namespace: default
stringData:
  userID: admin
  userKey: ADMIN_KEY
  adminID: admin
  adminKey: ADMIN_KEY
EOF

kubectl apply -f csi-cephfs-secret.yaml

Now we need to know the cluster id. The output of the following will include a fsid, this is your cluster id, copy it down to modify future commands.

ceph mon dump

Now we need a storage class. Be sure to update the cluster ID. It is the same as what you got before.

cat <<EOF > cephfs-csi-sc.yml
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: csi-cephfs-sc
provisioner: cephfs.csi.ceph.com
parameters:
  clusterID: CLUSTER_ID
  fsName: cephfs
  csi.storage.k8s.io/provisioner-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/provisioner-secret-namespace: default
  csi.storage.k8s.io/controller-expand-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/controller-expand-secret-namespace: default
  csi.storage.k8s.io/node-stage-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/node-stage-secret-namespace: default
reclaimPolicy: Delete
allowVolumeExpansion: true
mountOptions:
  - debug
EOF

kubectl apply -f cephfs-csi-sc.yml

Check your storageclass was added.

kubectl get storageclass

Watch and wait for all pods to start addressing any issues before proceeding.

watch "kubectl get pods --all-namespaces"

I like to make this the new default as it just makes things easier.

kubectl patch storageclass csi-cephfs-sc -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
kubectl patch storageclass local-path -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"false"}}}'

Check storageclass again and make sure only the cephfs one is marked default.

kubectl get storageclass

Now we are going to add metallb and get it setup.

kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.13.5/config/manifests/metallb-native.yaml
kubectl create secret generic -n metallb-system memberlist --from-literal=secretkey="$(openssl rand -base64 128)"

You will need to make a config for it to work. I'm using the layer2 protocol. For the addresses, set a range that is not in your dhcp range. My dhcp is .150 and higher and I do static IPs below .50 so .50-.99 is safe for metallb to assign as it pleases.

cat <<EOF > metallb-config.yml
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  namespace: metallb-system
  name: first-pool
spec:
  addresses:
  - 192.168.2.50-192.168.2.99
EOF

kubectl apply -f metallb-config.yml

cat <<EOF > metalLbL2Advertise.yaml
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: example
  namespace: metallb-system
spec:
  ipAddressPools:
  - first-pool
EOF

kubectl apply -f metalLbL2Advertise.yaml

Make sure all your pods start up and now we are on to the actual monitoring part. I use: https://github.com/carlosedp/cluster-monitoring

Read the settings in the vars.jsonnet. Here are a few settings I recommend

      name: 'armExporter',
      enabled: true,

      name: 'armExporter',
      enabled: true,

      name: 'traefikExporter',
      enabled: true,

  enablePersistence: {
    // Setting these to false, defaults to emptyDirs.
    prometheus: true,
    grafana: true,

    storageClass: 'csi-cephfs-sc',

Since cephfs is the default you do not technically need to set the storageclass. I also recommend increasing the size of the volumes if you do not know how to resize a pv and pvc as the defaults are a bit low and filled for me in 2 days. I do have 8 nodes though.

Once all the pods are running it is time to change grafana so you can access it via IP.

kubectl edit services -n monitoring grafana

Your default editor should be vim and you can use sed to update it from ClisterIP to LoadBalancer

:%s/type: ClusterIP/type: LoadBalancer/g

Now check what IP your service is running on.

kubectl get services -n monitoring grafana

In my case I get

NAME      TYPE           CLUSTER-IP      EXTERNAL-IP    PORT(S)          AGE
grafana   LoadBalancer   10.43.227.221   192.168.2.50   3000:31370/TCP   9d

and can access grafana at http://192.168.2.50:3000

Now to move on to the pi-hole part of the HA pi-hole

First we need a namespace

kubectl create namespace pihole

Then before we forget, a secret to hold the web interface password. Update "superSecretPass" to the password you want.

cat <<EOF > piholesecret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: pihole-secret
  namespace: pihole
type: Opaque
stringData:
  password: superSecretPass
EOF

kubectl apply -f piholesecret.yaml

We need a persistent volume if we want stats and settings to survive restarts of pods. I set 1G but that is probably more than you need.

cat <<EOF > piholepvc.yaml
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pihole
  namespace: pihole
spec:
  accessModes:
    - ReadWriteMany
  volumeMode: Filesystem
  resources:
    requests:
      storage: 1Gi
  storageClassName: csi-cephfs-sc
EOF

kubectl apply -f piholepvc.yaml

To deply it, we need a deployment. The env section can use variables from teh pihold documentation. I set the dns resolvers used to quad 1 and quad 8 and the them to dark. https://github.com/pi-hole/docker-pi-hole/#running-pi-hole-docker

cat <<EOF > piholedeployment.yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pihole
  namespace: pihole
  labels:
    app: pihole
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: pihole
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: pihole
        name: pihole
    spec:
      restartPolicy: Always
      containers:
      - name: pihole
        image: pihole/pihole:latest
        imagePullPolicy: Always
        env:
        - name: TZ
          value: "America/Chicago"
        - name: WEBPASSWORD
          valueFrom:
            secretKeyRef:
              name: pihole-secret
              key: password
        - name: PIHOLE_DNS_
          value: "1.1.1.1;8.8.8.8"
        - name: WEBTHEME
          value: "default-dark"
        volumeMounts:
        - name: pihole
          mountPath: "/etc/pihole"
          subPath: "pihole"
        - name: pihole
          mountPath: "/etc/dnsmasq.d"
          subPath: "dnsmasq"
      volumes:
      - name: pihole
        persistentVolumeClaim:
          claimName: pihole
EOF

kubectl apply -f piholedeployment.yaml

Give it a few minutes and you should have a running pod in the pihole namespace.

kubectl get pods -n pihole

Now we need access. Metallb does not allow mixing tcp and udp so we will break those apart and since we are already doing that we might as well break apart the web and dns tcp as well.

cat <<EOF > piholeweb.yaml
---
apiVersion: v1
kind: Service
metadata:
  name: pihole-web
  namespace: pihole
  annotations:
    metallb.universe.tf/allow-shared-ip: pihole-svc
spec:
  selector:
    app: pihole
  allocateLoadBalancerNodePorts: true
  externalTrafficPolicy: Local
  internalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  loadBalancerIP: 192.168.2.99
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
    name: pihole-admin
  sessionAffinity: None
  type: LoadBalancer
EOF

kubectl apply -f piholeweb.yaml

cat <<EOF > piholednsudp.yaml
---
apiVersion: v1
kind: Service
metadata:
  name: pihole-dns-udp
  namespace: pihole
  annotations:
    metallb.universe.tf/allow-shared-ip: pihole-svc
spec:
  selector:
    app: pihole
  allocateLoadBalancerNodePorts: true
  externalTrafficPolicy: Local
  internalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  loadBalancerIP: 192.168.2.99
  ports:
  - port: 53
    targetPort: 53
    protocol: UDP
    name: dns-udp
  sessionAffinity: None
  type: LoadBalancer
EOF

kubectl apply -f piholednsudp.yaml

cat <<EOF > piholednstcp.yaml
---
apiVersion: v1
kind: Service
metadata:
  name: pihole-dns-tcp
  namespace: pihole
  annotations:
    metallb.universe.tf/allow-shared-ip: pihole-svc
spec:
  selector:
    app: pihole
  allocateLoadBalancerNodePorts: true
  externalTrafficPolicy: Local
  internalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  loadBalancerIP: 192.168.2.99
  ports:
  - port: 53
    targetPort: 53
    protocol: TCP
    name: dns-tcp
  sessionAffinity: None
  type: LoadBalancer
EOF

kubectl apply -f piholednstcp.yaml

I update the resolv.conf on the nodes to prevent issues getting the pihole image to start it. You will want to remove the symlink /etc/resolv.conf and replace it with a normal file. Since I set my pihole to the .99 ip, I set this for my resolv.conf:

nameserver 192.168.2.99
nameserver 1.1.1.1
options edns0 trust-ad
search localdomain

With this the pods will try your pihole first, then fail over to 1.1.1.1

Now you should have pi-hole at http://192.168.2.99/admin/

You can test the HA aspect by finding which node the services is running on and shutting it down. Your ceph cluster should give you warnings but still work, and dns should work again in 8 minutes. There is a default 5 minute wait, then for me it takes 3 min to start on the new node.

I left ceph on defaults for cephfs in this case since with 3 nodes it will do a 3 copy replication meaning that each of your nodes has a copy of the data and with multi-master on k3s your pi-hole should survive 2 nodes dying.

Name		Name	Last commit message	Last commit date
Latest commit History 18 Commits
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

About

Releases

Packages

Contributors 3

License

cfoos/HApi-hole

Folders and files

Latest commit

History

Repository files navigation

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Contributors 3

Packages