Cloud Lessons: Launching a K3S Cluster

I’m starting a new series in which I share my experiences exploring cloud-native and platform engineering tools and technologies. First up is the foundation: a Kubernetes installation in the cloud.

Why am I doing this?

To keep these skills sharp, as I anticipate using them on client engagements!
To share lessons learned, for your benefit!
Because it’s fun! 😀

Today’s mission: get a simple Kubernetes cluster online in the cloud, using infrastructure-as-code! Since I’m an SRE practitioner, I want to do as little by hand as possible.

Before we begin, here are some caveats:

I’m not planning for true high availability (yet). The point is to explore the technology, not build and operate a production system.
I’m learning as I go, so I will probably approach certain things incorrectly. Best practices come later!

With that out of the way, it’s time to make some decisions:

Kubernetes distribution: I’ll use K3S. It’s lightweight and will be sufficient for my needs.
Cloud provider: Hetzner using ARM64 VMs because I’m cheap 😛. I also want to see whether I run into any issues using ARM64 instead of x86_64. I will be launching a simple 2-node cluster consisting of one K3S server/agent and one K3S agent.
Cloud infrastructure-as-code: Terraform, as there is a Hetzner provider available to instantiate cloud resources.
Server configuration management: Ansible, as it doesn’t require a persistent server to run.

Launching Hetzner VMs Using Terraform

This wasn’t super difficult, as there were some guides online on how to use the Hetzner provider. I simply referenced those when instantiating the network subnet, storage, instances, and the load balancer, which we’ll use in a later post.

One of the challenges I encountered early on was the need for Terraform to create an inventory file, which contains all the newly-launched servers that I will configure later using Ansible. Googling for this revealed multiple options:

terraform-inventory: this doesn’t work for recent versions of Terraform as the state file format has changed.
Apparently, Red Hat wrote an Ansible provider, but I wasn’t sure about building a tight integration between Terraform and Ansible just yet.
Templating out the inventory file directly using the ‘local_file’ resource in Terraform.

I chose the last option and generated an .ini-style inventory file:

resource "local_file" "inventory" {
  filename = "../ansible/inventory/hosts.ini"
  # Render one "name ansible_host=ip" line per server under the [k8s] group.
  content  = <<EOF
[k8s]
%{ for server in hcloud_server.k8s ~}
${server.name} ansible_host=${server.ipv4_address}
%{ endfor ~}
EOF
}
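Before handing the generated file to Ansible, it can be worth checking what the template actually rendered. The following is a minimal sketch, not part of the original setup: the function name inventory_hosts is hypothetical, and it assumes the inventory has the simple "name ansible_host=address" shape produced above.

```shell
#!/bin/sh
# Print "name address" for every host listed under the given group of an
# INI-style Ansible inventory. (Hypothetical helper, for illustration only.)
inventory_hosts() {
  group=$1
  file=$2
  awk -v group="[$group]" '
    /^\[/ { in_group = ($0 == group); next }  # a section header toggles state
    in_group && NF {                          # non-empty line inside the group
      addr = ""
      for (i = 2; i <= NF; i++)
        if ($i ~ /^ansible_host=/) addr = substr($i, 14)
      print $1, addr
    }
  ' "$file"
}
```

Running inventory_hosts k8s ../ansible/inventory/hosts.ini after terraform apply should list one line per launched server. Ansible's own ansible-inventory command with -i and --list gives a fuller view of the same data.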

I then wrote just enough cloud-init instructions to create a user for me, configure SSH, install a firewall and open port 22, and install my public key: the essentials Ansible needs to be able to take over.

I then ran terraform apply and launched the infrastructure in a matter of moments! Progress. Now to get the servers to do something useful!

Installing/Configuring K3S using Ansible

This is where I spent most of my time: iterating many times on my Ansible config until the servers successfully installed and configured K3S and assembled themselves into a cluster.

Configuring The Servers’ Firewalls

In order for K3S agents to successfully join a cluster, they need to be able to contact their peers over several TCP/UDP ports. I started by creating a rule that allowed all access from hosts on the local subnet. It took a couple of hours to realize that wasn’t enough.

What I had missed is that a Kubernetes deployment includes a Container Network Interface (CNI) plugin, which lets containers across the cluster talk to each other over a single flat virtual network. That meant I needed to modify my firewall rules to allow traffic on that network as well! Running ip addr or ifconfig revealed that Flannel, the CNI plugin K3S uses by default, was creating virtual network interfaces addressed in the 10.42.0.0/16 subnet, so I told Ansible to allow all traffic from it:

- name: allow all traffic from 10.42
  community.general.ufw:
    rule: allow
    src: 10.42.0.0/16
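To see why a rule for the hosts' own subnet does not cover pod traffic, it helps to test addresses against each CIDR block. This is a minimal sketch under stated assumptions: the helper names ip_to_int and in_cidr are hypothetical, and 10.0.0.0/16 is only a placeholder for the private host network, whose real range is not given here.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address into a 32-bit integer.
ip_to_int() {
  (
    IFS=.        # split on dots; the subshell keeps the IFS change local
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
  )
}

# Succeed (exit status 0) if address $1 lies inside CIDR block $2.
in_cidr() {
  ip=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")   # network part, before the slash
  bits=${2#*/}                 # prefix length, after the slash
  mask=$(( (4294967295 << (32 - bits)) & 4294967295 ))
  [ $(( ip & mask )) -eq $(( net & mask )) ]
}
```

With these helpers, in_cidr 10.42.1.7 10.42.0.0/16 succeeds while in_cidr 10.42.1.7 10.0.0.0/16 fails, so a packet sourced from a pod address only matches a rule that names the Flannel range explicitly.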

UFW is a pretty nice change of pace from my time handcrafting iptables rules!

Writing an Installation Script for K3S

The de facto way to install K3S is the scary curl-piped-to-shell method:

curl -sfL https://get.k3s.io | sh -

Configuration is done using environment variables. How do we get K3S installed, with the following in mind?

Installing the server and the agent requires different options.
We want the VMs to talk to each other over the internal network.
The server token needs to be provided to the agent at install time so that it can join the cluster.
The agent needs to know the IP of the server so that it can join the cluster.
Ansible playbooks need to be written with idempotency in mind. We don’t want to run the installation script again on a host that’s already configured.

I decided to write a shell script template that Ansible fills in with the pre-defined server token and the IP address of the first VM (k8s-0), which will be the K3S server, and then writes to each VM:

#!/bin/bash

# If k3s is already installed, just exit silently.
[[ -f "/usr/local/bin/k3s" ]] && exit 0

export K3S_TOKEN="{{ k3s_token }}"
# Compare hostnames as strings; -eq would compare them as integers.
if [[ "$(hostname -s)" == "k8s-0" ]] ; then
  export INSTALL_K3S_EXEC="server --flannel-iface enp7s0 --disable=traefik"
else
  export INSTALL_K3S_EXEC="agent --flannel-iface enp7s0"
  export K3S_URL="https://{{ k8s_server_ip.stdout }}:6443"
fi
/usr/bin/curl -sfL https://get.k3s.io | sh -s -

I then wrote the following Ansible tasks to get the internal IP from k8s-0, write the templated script (filled in with that IP and the K3S server token) to the servers, and run the installation so that the server is launched first, followed by the agents.

- name: get ip address of first node
  run_once: true
  ansible.builtin.shell: hostname -I | awk '{ print $2 }'
  register: k8s_server_ip

- name: write out templated k3s install script
  ansible.builtin.template:
    src: tpl/k3s-kickstart.j2
    dest: /root/k3s-kickstart
    mode: "0770"
    owner: root
    group: root

- name: install k3s server on first node
  run_once: true
  delegate_to: "k8s-0"
  ansible.builtin.shell: /root/k3s-kickstart

- name: install k3s agent on remaining nodes
  when: inventory_hostname != "k8s-0"
  ansible.builtin.shell: /root/k3s-kickstart

(BTW: if you ever need a command that just prints your IP addresses, use hostname -I.)
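Picking the second field of hostname -I relies on the internal address always being listed second. A slightly more defensive variant selects the address by prefix instead. This is only a sketch: the function name first_ip_with_prefix is hypothetical, and the 10.0. prefix in the usage line is an assumed private network range, not one taken from this setup.

```shell
#!/bin/sh
# Read whitespace-separated addresses on stdin (as printed by `hostname -I`)
# and print the first one that starts with the given prefix.
# Returns non-zero if no address matches.
first_ip_with_prefix() {
  prefix=$1
  for addr in $(cat); do
    case $addr in
      "$prefix"*)
        echo "$addr"
        return 0
        ;;
    esac
  done
  return 1
}
```

Usage would look like hostname -I | first_ip_with_prefix 10.0. and the non-zero exit status makes it easy to fail the Ansible task when no internal address is found.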

Takeaways

I now have a working two-host k3s cluster in Hetzner (with a load balancer) for the cost of a single cloud VM at other vendors! So far, so good with the ARM64 architecture!

$ ssh k8s-0 "sudo kubectl get nodes"
NAME    STATUS   ROLES                  AGE     VERSION
k8s-0   Ready    control-plane,master   5h47m   v1.27.4+k3s1
k8s-1   Ready    <none>                 5h46m   v1.27.4+k3s1
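For repeated runs, the same check can be scripted rather than eyeballed. The sketch below (the function name not_ready_nodes is hypothetical) reads the table printed by kubectl get nodes and lists any node whose STATUS is not exactly Ready, assuming the default column layout shown above.

```shell
#!/bin/sh
# Read `kubectl get nodes` output on stdin and print the name of every node
# whose STATUS column is anything other than exactly "Ready".
# The first line is skipped because it is the column header.
not_ready_nodes() {
  awk 'NR > 1 && $2 != "Ready" { print $1 }'
}
```

Piping ssh k8s-0 "sudo kubectl get nodes" into not_ready_nodes prints nothing when every node has joined and is healthy, which makes it a convenient last step after an Ansible run.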

In retrospect, I don’t know how I could have navigated all of the roadblocks without some Linux systems knowledge:

Reading logs using systemd’s journalctl
Writing bash scripts and using the CLI
Linux networking basics (ifconfig/ip, ufw/iptables)

Next time, we’ll figure out how to get the K3S cluster to use the Hetzner load balancer so we can provision TLS certificates and spread the load across multiple VMs when we finally deploy applications!

If you need help to modernize your infrastructure using configuration management and infrastructure-as-code, you know where to find me!

(Image credit: Robert Șerban)

CERTO MODO
