Cook up and scale a k3s Kubernetes cluster on Raspberry Pis with a single command.
All the code is available in my GitHub repo, PiClusterChef.
Prerequisites
- Install Raspberry Pi OS (64-bit) on your Raspberry Pis, connect them to your local network, give each one a hostname, and enable their SSH servers. There is an excellent guide by W3Schools on how to do this here. (I prefer to use the official Raspberry Pi Imager to flash the SD cards instead of Etcher, which the W3Schools guide uses, but either works.)
- Install Ansible on your local machine. (I use the WSL 2 subsystem in Windows to run Ansible.)
- Install kubectl.
- Optional: Install Helm.
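Before moving on, you can quickly confirm the tools are available on your local machine with their standard version commands:
$ ansible --version
$ kubectl version --client
$ helm version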
Ansible setup and components
To provision our cluster with Ansible, we need an inventory file, one or more playbook files, and some initial SSH setup, since Ansible connects to the nodes over SSH.
Inventory file
In Ansible, the inventory file lists the target hosts, or nodes, that Ansible manages. In this case, it will contain all our Raspberry Pis.
workers:
  hosts:
    nerminworker1:
      ansible_host: 192.168.0.66
      ansible_user: nermin
masters:
  hosts:
# This host group is designed to only contain the initial bootstrap master node
bootstrapMaster:
  hosts:
    nerminmaster:
      ansible_host: 192.168.0.67
      ansible_user: nermin
pies:
  children:
    bootstrapMaster:
    masters:
    workers:
  vars:
    k3s_version: v1.24.10+k3s1
We have four groups in our setup:
- bootstrapMaster: This is the first node that will get a k3s server and will create the token used by other nodes to register with the cluster.
- masters: These are the remaining k3s server nodes, which register themselves via the bootstrapMaster.
- workers: These are the k3s worker/agent nodes that will run workloads in the cluster. They also register themselves via the bootstrapMaster and the generated k3s token.
- pies: This group contains all the masters and workers combined.
I have two Raspberry Pis in my setup, even though the picture shows three. The third Raspberry Pi has another purpose not related to the k3s cluster. Both nodes have the user nermin, which is used to execute commands on each of the hosts. A single variable, k3s_version, determines which version of k3s we want to install.
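Scaling the cluster later is just a matter of adding entries to this inventory. For example, adding a second worker could look like this (the hostname and IP below are hypothetical):
workers:
  hosts:
    nerminworker1:
      ansible_host: 192.168.0.66
      ansible_user: nermin
    nerminworker2:            # hypothetical additional worker
      ansible_host: 192.168.0.68
      ansible_user: nermin
Re-running the install playbook will then install k3s on the new node as well.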
SSH connections
Add your public SSH key to each Raspberry Pi to avoid being prompted for passwords when creating sessions.
$ ssh-copy-id <user>@<host>
In this specific case:
$ ssh-copy-id nermin@nerminmaster
$ ssh-copy-id nermin@nerminworker1
You can also use the IP address as the host instead of the hostname.
Note: Make sure you have generated an SSH key pair in your WSL2 system. You can check by running:
$ ls ~/.ssh/id_*.pub
If you don't see any files listed, you'll need to generate an SSH key pair. You can do this by running:
$ ssh-keygen
Follow the prompts to generate a new key pair. Make sure to keep your private key secure and do not share it with others.
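With the keys in place, you can check that Ansible can reach every node with an ad-hoc ping before running any playbooks (this assumes the inventory file shown earlier is saved as inventory.yaml):
$ ansible pies -i inventory.yaml -m ping
Each host should answer with "ping": "pong".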
Install Playbook
The install-k3s-playbook.yaml file contains plays for installing the k3s masters and workers on the Raspberry Pis. This includes enabling memory cgroups, as required by the k3s documentation. The playbook also retrieves the kubeconfig file from the bootstrap node and places it in the current directory under the name k3sconfig.
- name: Enable cgroups
  become: true
  hosts: pies
  tasks:
    - name: Ping hosts
      ansible.builtin.ping:
    - name: Check if cgroups are enabled
      command: cat /boot/cmdline.txt
      register: cmdlineContent
    - name: Enable cgroups
      command: sed -i -e 's/$/ cgroup_memory=1 cgroup_enable=memory/' /boot/cmdline.txt
      when: "'cgroup_memory=1 cgroup_enable=memory' not in cmdlineContent.stdout"
      notify:
        - Restart pi
  handlers:
    - name: Restart pi
      ansible.builtin.reboot:

- name: Install k3s bootstrap server
  become: true
  hosts: bootstrapMaster
  tasks:
    - name: Ping host
      ansible.builtin.ping:
    - name: Install k3s bootstrap server
      shell: curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION={{ k3s_version }} K3S_NODE_NAME={{ inventory_hostname }} K3S_KUBECONFIG_MODE="644" sh -s - server --cluster-init
    - name: Extract K3S_TOKEN from server output
      command: cat /var/lib/rancher/k3s/server/node-token
      register: k3s_token
      failed_when: k3s_token is failed or k3s_token.stdout is undefined
    - name: Set K3S_TOKEN as a fact
      set_fact:
        k3s_token: "{{ k3s_token.stdout }}"

- name: Install k3s servers
  become: true
  hosts: masters
  tasks:
    - name: Ping hosts
      ansible.builtin.ping:
    - name: Install k3s servers
      shell: curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION={{ k3s_version }} K3S_URL=https://{{ hostvars['nerminmaster']['ansible_default_ipv4'].address }}:6443 K3S_TOKEN={{ hostvars['nerminmaster']['k3s_token'] }} K3S_NODE_NAME={{ inventory_hostname }} sh -s - server

- name: Install k3s workers
  become: true
  hosts: workers
  tasks:
    - name: Ping hosts
      ansible.builtin.ping:
    - name: Install k3s workers
      shell: curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION={{ k3s_version }} K3S_URL=https://{{ hostvars['nerminmaster']['ansible_default_ipv4'].address }}:6443 K3S_TOKEN={{ hostvars['nerminmaster']['k3s_token'] }} K3S_NODE_NAME={{ inventory_hostname }} sh -

- name: Fetch k3s kubeconfig
  become: true
  hosts: bootstrapMaster
  tasks:
    - name: Fetch kubeconfig
      fetch:
        src: /etc/rancher/k3s/k3s.yaml
        dest: k3sconfig
        flat: true
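After the playbook finishes, a quick sanity check is to look at the systemd services the k3s install script creates: k3s on server nodes and k3s-agent on workers. Using the hostnames from my inventory:
$ ssh nermin@nerminmaster systemctl is-active k3s
$ ssh nermin@nerminworker1 systemctl is-active k3s-agent
Both commands should print active.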
Uninstall Playbook
The uninstall-k3s-playbook.yaml runs the uninstall scripts that k3s places on each Raspberry Pi, removing the k3s services and binaries. This makes cleanup easy.
- name: Uninstall k3s on workers
  become: true
  hosts: workers
  tasks:
    - name: Ping hosts
      ansible.builtin.ping:
    - name: Uninstall k3s agent
      command: /usr/local/bin/k3s-agent-uninstall.sh

- name: Uninstall k3s on servers
  become: true
  hosts: masters
  tasks:
    - name: Ping hosts
      ansible.builtin.ping:
    - name: Uninstall k3s server
      command: /usr/local/bin/k3s-uninstall.sh

- name: Uninstall k3s on bootstrap servers
  become: true
  hosts: bootstrapMaster
  tasks:
    - name: Ping hosts
      ansible.builtin.ping:
    - name: Uninstall k3s server
      command: /usr/local/bin/k3s-uninstall.sh
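One caveat: the uninstall scripts only exist on nodes where k3s has actually been installed, so running this playbook against a fresh node will fail. An optional guard (a sketch, not part of the original playbook) is to check for the script first, shown here for the worker play:
    - name: Check for the k3s agent uninstall script
      stat:
        path: /usr/local/bin/k3s-agent-uninstall.sh
      register: agent_uninstall
    - name: Uninstall k3s agent
      command: /usr/local/bin/k3s-agent-uninstall.sh
      when: agent_uninstall.stat.exists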
Create the k3s cluster
Now for the fun part. Create the k3s cluster by executing the install playbook:
$ ansible-playbook -i inventory.yaml install-k3s-playbook.yaml
That's it! The output should look like this (my Raspberry Pis already have cgroups enabled, so that task is skipped):
PLAY [Enable cgroups] ******************************************************************************************************************************************************************************************************************************************************************
TASK [Gathering Facts] *****************************************************************************************************************************************************************************************************************************************************************
ok: [nerminworker1]
ok: [nerminmaster]
TASK [Ping hosts] **********************************************************************************************************************************************************************************************************************************************************************
ok: [nerminmaster]
ok: [nerminworker1]
TASK [Check if cgroups are enabled] ****************************************************************************************************************************************************************************************************************************************************
changed: [nerminworker1]
changed: [nerminmaster]
TASK [Enable cgroups] ******************************************************************************************************************************************************************************************************************************************************************
skipping: [nerminmaster]
skipping: [nerminworker1]
PLAY [Install k3s bootstrap server] ****************************************************************************************************************************************************************************************************************************************************
TASK [Gathering Facts] *****************************************************************************************************************************************************************************************************************************************************************
ok: [nerminmaster]
TASK [Ping host] ***********************************************************************************************************************************************************************************************************************************************************************
ok: [nerminmaster]
TASK [Install k3s bootstrap server] ****************************************************************************************************************************************************************************************************************************************************
changed: [nerminmaster]
TASK [Extract K3S_TOKEN from server output] ********************************************************************************************************************************************************************************************************************************************
changed: [nerminmaster]
TASK [Set K3S_TOKEN as a fact] *********************************************************************************************************************************************************************************************************************************************************
ok: [nerminmaster]
PLAY [Install k3s servers] *************************************************************************************************************************************************************************************************************************************************************
skipping: no hosts matched
PLAY [Install k3s workers] *************************************************************************************************************************************************************************************************************************************************************
TASK [Gathering Facts] *****************************************************************************************************************************************************************************************************************************************************************
ok: [nerminworker1]
TASK [Ping hosts] **********************************************************************************************************************************************************************************************************************************************************************
ok: [nerminworker1]
TASK [Install k3s workers] *************************************************************************************************************************************************************************************************************************************************************
changed: [nerminworker1]
PLAY [Fetch k3s kubeconfig] ************************************************************************************************************************************************************************************************************************************************************
TASK [Gathering Facts] *****************************************************************************************************************************************************************************************************************************************************************
ok: [nerminmaster]
TASK [Fetch kubeconfig] ****************************************************************************************************************************************************************************************************************************************************************
changed: [nerminmaster]
PLAY RECAP *****************************************************************************************************************************************************************************************************************************************************************************
nerminmaster : ok=10 changed=4 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0
nerminworker1 : ok=6 changed=2 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0
The playbook downloaded the kubeconfig file from the bootstrap master to the current directory on the local machine with the name k3sconfig. The file looks like this:
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: ...
    server: https://127.0.0.1:6443
  name: default
contexts:
- context:
    cluster: default
    user: default
  name: default
current-context: default
kind: Config
preferences: {}
users:
- name: default
  user:
    client-certificate-data: ...
    client-key-data: ...
Replace the localhost IP in the server value with the IP or hostname of the bootstrap master node:
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: ...
    server: https://nerminmaster:6443 # <----- This value here.
  name: default
contexts:
- context:
    cluster: default
    user: default
  name: default
current-context: default
kind: Config
preferences: {}
users:
- name: default
  user:
    client-certificate-data: ...
    client-key-data: ...
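If you prefer to make the change from the command line instead of an editor, a one-line sed does the trick (assuming the bootstrap master's hostname is nerminmaster, as in my inventory):
$ sed -i 's/127.0.0.1/nerminmaster/' k3sconfig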
Verify that the cluster is up and running:
$ kubectl get nodes --kubeconfig k3sconfig
NAME STATUS ROLES AGE VERSION
nerminmaster Ready control-plane,etcd,master 8m42s v1.24.10+k3s1
nerminworker1 Ready <none> 8m7s v1.24.10+k3s1
Success.
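To avoid passing --kubeconfig to every command, you can point the KUBECONFIG environment variable at the file for the current shell session and, for example, check that the k3s system pods are running:
$ export KUBECONFIG=$PWD/k3sconfig
$ kubectl get pods -A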
Optional: Deploy the Rancher server to k3s
It would be nice to have a UI for viewing and managing the cluster, and the Rancher server provides one. To deploy it, execute the following:
## CERT MANAGER
# Add jetstack helm repo
helm repo add jetstack https://charts.jetstack.io
helm repo update
# Create cert-manager namespace
kubectl create namespace cert-manager --kubeconfig k3sconfig
# Install cert-manager
helm upgrade --install cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --version v1.8.2 \
  --set installCRDs=true \
  --wait \
  --kubeconfig k3sconfig
## RANCHER SERVER
# Create cattle-system namespace
kubectl create ns cattle-system --kubeconfig k3sconfig
# Add rancher-latest helm repo
helm repo add rancher-latest https://releases.rancher.com/server-charts/latest
helm repo update
# Install rancher
helm upgrade --install rancher rancher-latest/rancher \
  --version 2.7.0 \
  --namespace cattle-system \
  --set hostname=nermin.cluster \
  --set replicas=1 \
  --set rancherImageTag=v2.7.2-rc5-linux-arm64 \
  --wait \
  --kubeconfig k3sconfig
Create a record in your hosts file that points to a worker node's IP. On Windows, this file is located at C:/Windows/System32/drivers/etc/hosts. In the file, add the record:
192.168.0.66 nermin.cluster
Notice that this domain is the same as the hostname value in the Rancher chart.
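This works because k3s ships with the Traefik ingress controller, which by default is exposed on every node, and the Rancher chart creates an Ingress for that hostname. You can confirm the Ingress exists (assuming the default chart values used above):
$ kubectl get ingress -n cattle-system --kubeconfig k3sconfig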
Go to the domain nermin.cluster. There is no trusted certificate for it, so the browser will warn about an untrusted site. Press Advanced and then Continue to nermin.cluster (unsafe) to reach the login page.
To acquire the bootstrap password, run:
$ kubectl get secret \
--namespace cattle-system bootstrap-secret \
-o go-template='{{.data.bootstrapPassword|base64decode}}{{"\n"}}' \
--kubeconfig k3sconfig
qstl5v7n4j6tmmr89rnhhb6qcf69krfljnwxm96sds2xpgbm4zp4zf
Copy and paste the password into the input field, and you will be allowed to enter the Rancher server UI.
Clean up
Run the uninstall-k3s-playbook to remove k3s from all the nodes:
$ ansible-playbook -i inventory.yaml uninstall-k3s-playbook.yaml
Remarks
This setup allows easy scaling of the cluster up and down simply by adding or removing master and worker nodes in the inventory file. However, there are some important points to be aware of:
- Dynamic IPs: The Raspberry Pis have dynamic IPs on the local network, meaning that their IPs can change when their current IP lease expires. This can disrupt the execution of the playbook. Consider assigning static IPs to the nodes.
- Local domain: Because the nermin.cluster domain is defined in the hosts file, it can only be resolved on the local machine. This is fine for testing, but if the cluster needs to be available to others on the same network, a real domain should be purchased and used.
- Untrusted certificate: No certificate has been signed by a trusted party for the nermin.cluster domain, so the browser does not trust it. If you plan on giving other users access, you should change the domain and acquire a certificate; the cert-manager chart can help you get free Let's Encrypt certificates.
- Hardcoded values: To read information from the bootstrap node, such as its IP address and the k3s token, the playbook hardcodes its hostname in the hostvars dict: hostvars['nerminmaster']. A change to the hostname in the bootstrapMaster group therefore makes the playbook fail, and any rename must be made in both inventory.yaml and install-k3s-playbook.yaml. One way around this is sketched after this list.
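One way to remove that hardcoding (a sketch of an alternative, not what the repo currently does) is to look up the bootstrap host through Ansible's built-in groups variable, so the plays follow whatever hostname the bootstrapMaster group happens to contain. For the worker install task, that would look like:
    - name: Install k3s workers
      shell: curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION={{ k3s_version }} K3S_URL=https://{{ hostvars[groups['bootstrapMaster'][0]]['ansible_default_ipv4'].address }}:6443 K3S_TOKEN={{ hostvars[groups['bootstrapMaster'][0]]['k3s_token'] }} K3S_NODE_NAME={{ inventory_hostname }} sh -
With this change, renaming the bootstrap node only requires touching inventory.yaml.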