When you want to add, remove, or resize a node pool in Kubernetes, being able to quickly bootstrap those nodes and have them rejoin the cluster in a repeatable, time-efficient way is important.
The way Terraform manages state is one reason I love using it to assist in automating cluster operations like deploying nodes as part of a pool. In my Terraform variables, for example, I might have values like kube_version or count that can change, and when you plan and apply the new state, Terraform will attempt to reconcile that change with your provider.
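For instance, the relevant pieces of vars.tf might look roughly like this (the count variable and its default reappear later in this post; the kubernetes_version default here is only illustrative):
variable "kubernetes_version" {
  description = "Kubernetes version to install on the nodes."
  default     = "1.13.1"
}
variable "count" {
  default     = "3"
  description = "Number of nodes."
}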
Take the example of this Terraform project that deploys to DigitalOcean:
https://bitbucket.org/jmarhee/cousteau/src
(you can read more about using this repo in the Further Reading links at the end of this post). In kube-node.tf, we can define a pool like this:
data "template_file" "node" {
template = "${file("${path.module}/node.tpl")}"
vars {
kube_token = "${random_string.kube_init_token_a.result}.${random_string.kube_init_token_b.result}"
primary_node_ip = "${digitalocean_droplet.k8s_primary.ipv4_address}"
kube_version = "${var.kubernetes_version}"
}
}
resource "digitalocean_droplet" "k8s_node" {
name = "${format("pool1-%02d", count.index)}"
image = "ubuntu-16-04-x64"
count = "${var.count}"
size = "${var.primary_size}"
region = "${var.region}"
private_networking = "true"
ssh_keys = "${var.ssh_key_fingerprints}"
user_data = "${data.template_file.node.rendered}"
}
This creates a pool of var.count nodes, each named with the pool1 prefix.
When you run terraform plan, you get a plan that reflects this:
+ digitalocean_droplet.k8s_node
id: <computed>
backups: "false"
disk: <computed>
image: "ubuntu-16-04-x64"
ipv4_address: <computed>
ipv4_address_private: <computed>
ipv6: "false"
ipv6_address: <computed>
ipv6_address_private: <computed>
locked: <computed>
memory: <computed>
monitoring: "false"
name: "pool1-00"
price_hourly: <computed>
price_monthly: <computed>
private_networking: "true"
region: "tor1"
resize_disk: "true"
size: "4gb"
ssh_keys.#: "1"
ssh_keys: ""
status: <computed>
user_data: ""
vcpus: <computed>
volume_ids.#: <computed>
When you plan and apply with the count variable adjusted, Terraform will add or remove resources of that type accordingly.
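For example, growing the pool from 3 to 5 nodes is just a variable change (count can live in terraform.tfvars or be passed on the command line):
# Preview the change: two new droplets, nothing else touched
terraform plan -var "count=5"
# Create them; they join the cluster via their rendered user_data
terraform apply -var "count=5"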
However, one great feature of Kubernetes is the ability to cordon and drain nodes that you wish to remove from rotation, so let's say you'd like to deploy a new node pool using a new kube_version value (i.e. upgrading from v1.12 to v1.13). In node.tf, add a new pool, call it pool2, as a separate resource with its own count variable (count_pool2) so it can be scaled independently of pool1, like this:
resource "digitalocean_droplet" "k8s_node" {
name = "${format("pool2-%02d", count.index)}"
image = "ubuntu-16-04-x64"
count = "${var.count}"
size = "${var.primary_size}"
region = "${var.region}"
private_networking = "true"
ssh_keys = "${var.ssh_key_fingerprints}"
user_data = "${data.template_file.node.rendered}"
}
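One caveat worth noting: both pools render the same data.template_file.node, so changing var.kubernetes_version would also change pool1's user_data, and Terraform would want to replace those droplets to reconcile it. A minimal sketch of one way around that, using a hypothetical second variable, kubernetes_version_pool2, that only the new pool consumes:
# Hypothetical: a second rendered template pinned to the new version,
# so pool1's user_data (and therefore its droplets) stays untouched.
data "template_file" "node_pool2" {
  template = "${file("${path.module}/node.tpl")}"

  vars {
    kube_token      = "${random_string.kube_init_token_a.result}.${random_string.kube_init_token_b.result}"
    primary_node_ip = "${digitalocean_droplet.k8s_primary.ipv4_address}"
    kube_version    = "${var.kubernetes_version_pool2}"
  }
}
The pool2 resource's user_data would then reference data.template_file.node_pool2.rendered instead.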
Plan and apply, and you'll have two sets of nodes: those named pool1 and those named pool2:
# kubectl get nodes
NAME                        STATUS     ROLES    AGE     VERSION
digitalocean-k8s-pool1-00   Ready      <none>   4m45s   v1.12.3
digitalocean-k8s-pool1-01   Ready      <none>   4m42s   v1.12.3
digitalocean-k8s-pool1-02   Ready      <none>   4m48s   v1.12.3
digitalocean-k8s-pool1-03   Ready      <none>   4m54s   v1.12.3
digitalocean-k8s-pool1-04   Ready      <none>   4m51s   v1.12.3
digitalocean-k8s-pool2-00   NotReady   <none>   3m27s   v1.13.1
digitalocean-k8s-primary    Ready      master   7m1s    v1.13.1
On your Kubernetes cluster, at this point, you can run an operation like this:
kubectl cordon digitalocean-k8s-pool1-00
on each of the pool1 nodes, and then, once they are all cordoned, drain them to move the workloads to the new pool, rather than forcing Kubernetes to reschedule resources on its own (though that is a valid pattern as well):
kubectl drain digitalocean-k8s-pool1-00 --ignore-daemonsets
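A minimal sketch for doing this across the whole pool at once, assuming the node names follow the pool1 naming scheme shown above:
# Cordon every pool1 node first so nothing new is scheduled there,
# then drain them one at a time to evict the existing workloads.
for node in $(kubectl get nodes -o name | grep pool1); do
  kubectl cordon "$node"
done
for node in $(kubectl get nodes -o name | grep pool1); do
  kubectl drain "$node" --ignore-daemonsets
done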
When you then run an operation like:
kubectl describe node digitalocean-k8s-pool1-00
you should see Pods terminating, or gone completely, and see them scheduled on the pool2 nodes.
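To confirm the workloads have moved, listing Pods along with the node they were scheduled on is a quick check:
kubectl get pods --all-namespaces -o wide | grep pool2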
At this point, you can set the count variable that sizes pool1 back to 0, and then plan and apply Terraform to terminate this pool.
To make this configuration more robust, let's take the example of blue/green deployments, assume we'll always have at least two pools, and modify node.tf to look like this:
data "template_file" "node" {
template = "${file("${path.module}/node.tpl")}"
vars {
kube_token = "${random_string.kube_init_token_a.result}.${random_string.kube_init_token_b.result}"
primary_node_ip = "${digitalocean_droplet.k8s_primary.ipv4_address}"
kube_version = "${var.kubernetes_version}"
}
}
resource "digitalocean_droplet" "k8s_node_pool_blue" {
name = "${format("${var.cluster_name}-node-blue-%02d", count.index)}"
image = "ubuntu-16-04-x64"
count = "${var.count_blue}"
size = "${var.primary_size}"
region = "${var.region}"
private_networking = "true"
ssh_keys = "${var.ssh_key_fingerprints}"
user_data = "${data.template_file.node.rendered}"
}
resource "digitalocean_droplet" "k8s_node_pool_green" {
name = "${format("${var.cluster_name}-node-green-%02d", count.index)}"
image = "ubuntu-16-04-x64"
count = "${var.count_green}"
size = "${var.primary_size}"
region = "${var.region}"
private_networking = "true"
ssh_keys = "${var.ssh_key_fingerprints}"
user_data = "${data.template_file.node.rendered}"
}
and update vars.tf to handle the two pool sizes separately:
-variable "count" {
- default = "3"
- description = "Number of nodes."
+variable "count_blue" {
+ description = "Number of nodes in pool blue."
+}
+
+variable "count_green" {
+ description = "Number of nodes in pool green."
so we have a count_blue and a count_green to manage these pool sizes, and you can set them in your terraform.tfvars file to scale up and down. Your new process would be, for example: if the blue pool is active at 3 nodes, scale the green pool up to 3 as well, cordon and drain the blue pool's nodes, verify services are up on green, and then scale count_blue to 0 and apply.
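As a sketch, that cutover in terraform.tfvars might look like this (values are illustrative):
# Blue is the active pool; bring green up to match before cutting over:
count_blue  = "3"
count_green = "3"
# After cordoning and draining the blue nodes and verifying services on
# green, set count_blue = "0" and plan/apply again to retire the old pool.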
This would also be a good refactoring candidate: pools could be managed as a Terraform module, letting you run multiple types of pools (using different instance types, such as high CPU or high-performance storage) and move the options that apply to all node pools in this example into a shared definition.
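As a rough sketch of that refactor (the node-pool module and its inputs here are hypothetical; the module itself would wrap the droplet resource and node template shown above):
module "pool_highcpu" {
  source = "./modules/node-pool"

  pool_name            = "highcpu"
  node_count           = "2"
  size                 = "c-4" # e.g. a CPU-optimized droplet slug
  cluster_name         = "${var.cluster_name}"
  region               = "${var.region}"
  kubernetes_version   = "${var.kubernetes_version}"
  primary_node_ip      = "${digitalocean_droplet.k8s_primary.ipv4_address}"
  kube_token           = "${random_string.kube_init_token_a.result}.${random_string.kube_init_token_b.result}"
  ssh_key_fingerprints = "${var.ssh_key_fingerprints}"
}
Each pool then becomes a single module block, and pool-specific choices like size and node_count stay out of the shared defaults.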
Further Reading
Safely Drain a Node while Respecting Application SLOs
Writing Reusable Terraform Modules
Deploying Kubernetes (and DigitalOcean Cloud Controller Manager) on DigitalOcean with Terraform
Cousteau - Manage Kubernetes + DO CCM on DigitalOcean with Terraform
Comments
Is this option available in Google Cloud GKE??
Google Cloud does have a Terraform provider, so presumably, yes, you can adapt this to use the Google provider, or the cloud provider of your choosing, and replicate this behavior.