I am a DevOps engineer at Cloudify.co, and in this post I will share my experience automating Vault backup creation with a Kubernetes CronJob.
This post is a continuation of the previous post: https://igorzhivilo.com/vault/scheduled-backup-vault-secrets
The repository with all the code: https://github.com/warolv/vault-backup
What is HashiCorp's Vault?
Vault is a tool for securely accessing secrets. A secret is anything that you want to tightly control access to, such as API keys, passwords, certificates, and more. Vault provides a unified interface to any secret while providing tight access control and recording a detailed audit log.
My Setup
- EKS Kubernetes cluster
- Vault runs on EKS cluster
What will you learn from this post?
How to create a scheduled backup of Vault secrets with a Kubernetes CronJob.
How to add Prometheus alerts for failed jobs.
You can find all the code presented in my repository: https://github.com/warolv/vault-backup
Let's start.
Building the docker container
First, we need to build a Docker image based on python3 that includes the code of vault_handler.py.
Clone the repo, which contains the Dockerfile, first: 'git clone https://github.com/warolv/vault-backup'
Dockerfile:
FROM python:3
COPY requirements.txt /
RUN pip install -r requirements.txt
COPY vault_handler.py /
CMD [ "python", "./vault_handler.py" ]
Building image
# login to dockerhub
$ docker login -u YOUR_USERNAME -p YOUR_PASSWORD
# Build Docker
$ docker build -t vault-backup .
Validate that the Docker container works properly
$ docker run --name test-vault-backup --rm vault-backup
Specify one of the commands below
print
print-dump
dump
populate
It's working: we got the list of commands from vault-backup :-)
Pushing the vault-backup Docker image to Docker Hub
$ docker tag vault-backup <Your Docker ID>/vault-backup:latest
$ docker push <Your Docker ID>/vault-backup:latest
In my case, it's 'warolv/vault-backup:latest'. You can find an already built image there.
CronJob to run vault-backup on a daily basis
https://github.com/warolv/vault-backup/blob/main/examples/cronjob/cronjob.yaml
# batch/v1 is the stable CronJob API (Kubernetes 1.21+); batch/v1beta1 was removed in 1.25
apiVersion: batch/v1
kind: CronJob
metadata:
  name: vault-backup
spec:
  schedule: "0 1 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          nodeSelector:
            instance-type: spot
          containers:
            - name: awscli
              image: amazon/aws-cli:latest
              command:
                - "aws"
                - "s3"
                - "cp"
                - "/data/vault_secrets.enc"
                - "s3://jenkins-backups/vault_secrets.enc"
              imagePullPolicy: Always
              envFrom:
                - secretRef:
                    name: aws-creds-secret
              volumeMounts:
                - name: backup-dir
                  mountPath: /data
          initContainers:
            - name: vault-backup
              image: warolv/vault-backup
              command:
                - "python3"
                - "vault_handler.py"
                - "dump"
                - "-dp"
                - "/data/vault_secrets.enc"
              imagePullPolicy: Always
              envFrom:
                - secretRef:
                    name: vault-backup-secret
              volumeMounts:
                - name: backup-dir
                  mountPath: /data
          volumes:
            - name: backup-dir
              emptyDir: {}
Explanation
First, the 'vault_handler.py' script runs in the init container. It creates the secrets dump (vault_secrets.enc) and saves it to the /data folder, which is shared by both containers.
Next, the 'awscli' container runs and uses the AWS CLI to copy the secrets dump to a private S3 bucket. Of course, the private S3 bucket must already exist.
Credentials for the AWS CLI (AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY) and for the vault_handler.py script are injected into the containers as environment variables from k8s secrets.
In this example, I am copying the dump to 's3://jenkins-backups/vault_secrets.enc'. For a production use case, I suggest adding a timestamp to the dump name, something like vault_secrets_${timestamp}.enc
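The timestamp suggestion can be sketched in Python as follows. This is only an illustration, not code from the repository: the function name `timestamped_key` and the timestamp format are my assumptions, and the bucket name is the one from the CronJob example.

```python
from datetime import datetime, timezone
from typing import Optional


def timestamped_key(prefix: str = "vault_secrets", ext: str = "enc",
                    now: Optional[datetime] = None) -> str:
    """Build a name such as vault_secrets_<UTC timestamp>.enc (hypothetical helper)."""
    # Use the current UTC time unless the caller passes a fixed one.
    now = now or datetime.now(timezone.utc)
    # A compact, sortable timestamp keeps the backups ordered by name in S3.
    stamp = now.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return f"{prefix}_{stamp}.{ext}"


if __name__ == "__main__":
    # Bucket name taken from the CronJob example; replace it with your own.
    print(f"s3://jenkins-backups/{timestamped_key()}")
```

Because the 'aws s3 cp' arguments in the CronJob are static strings, the name would have to be computed at run time, for example by the backup script itself or by a small shell wrapper around the copy command.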
Creating secrets for CronJob
# create k8s secret for AWS
$ kubectl create secret generic aws-creds-secret \
    --from-literal=AWS_ACCESS_KEY_ID=YOUR_AWS_ACCESS_KEY_ID \
    --from-literal=AWS_SECRET_ACCESS_KEY=YOUR_AWS_SECRET_ACCESS_KEY
# create k8s secret with all needed data for vault-backup
$ kubectl create secret generic vault-backup-secret \
    --from-literal=VAULT_ADDR=http://vault.vault.svc.cluster.local:8200 \
    --from-literal=ROLE_ID=YOUR_ROLE_ID \
    --from-literal=SECRET_ID=YOUR_SECRET_ID \
    --from-literal=ENCRYPTION_KEY=YOUR_ENCRYPTION_KEY \
    --from-literal=VAULT_PREFIX=jenkins
These are only example values; you need to replace them with your real ones.
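Since 'envFrom' exposes every key of the secret as an environment variable, a script can fail fast when one of them is missing. Here is a minimal sketch of such a check. It is not the actual code of vault_handler.py, and the names `REQUIRED_VARS` and `load_settings` are hypothetical; only the variable names come from the secret above.

```python
import os
from typing import Dict, Mapping

# Keys of the vault-backup-secret created above.
REQUIRED_VARS = ("VAULT_ADDR", "ROLE_ID", "SECRET_ID", "ENCRYPTION_KEY", "VAULT_PREFIX")


def load_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return the required settings, or raise if any is missing or empty."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        # Report the variable names only, never their values.
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

A check like this makes a misconfigured secret visible in the job logs right away, instead of surfacing later as an authentication error.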
Deploy vault-backup cronjob
$ kubectl apply -f examples/cronjob/cronjob.yaml
How to trigger a Job from CronJob?
To test that your job works properly, you can trigger it manually:
$ kubectl create job --from=cronjob/vault-backup vault-backup-001
Adding alerts to Prometheus
I am using kube-state-metrics with Prometheus, which makes these CronJob metrics available: https://github.com/kubernetes/kube-state-metrics/blob/master/docs/cronjob-metrics.md
Let's add one alert for a failed job and one for a CronJob that takes too much time. Of course, this is only an example to give you an idea.
groups:
- name: cronjob.rules
  rules:
  - alert: SlowCronJob
    expr: time() - kube_cronjob_next_schedule_time > 1800
    for: 30m
    labels:
      severity: warning
    annotations:
      description: CronJob {{$labels.namespace}}/{{$labels.cronjob}} is taking more than 30m to complete
      summary: CronJob taking more than 30m
  - alert: FailedJob
    expr: kube_job_status_failed > 0
    for: 30m
    labels:
      severity: warning
    annotations:
      description: Job {{$labels.namespace}}/{{$labels.job_name}} failed
      summary: Job failure
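To make the SlowCronJob expression easier to read, here is the same comparison as a small Python sketch. The function name `is_overdue` is hypothetical and this is not how Prometheus evaluates rules internally; it only mirrors the arithmetic of 'time() - kube_cronjob_next_schedule_time > 1800'.

```python
import time
from typing import Optional

THRESHOLD_SECONDS = 1800  # same 30-minute threshold as the SlowCronJob expression


def is_overdue(next_schedule_time: float, now: Optional[float] = None,
               threshold: float = THRESHOLD_SECONDS) -> bool:
    """Return True when the scheduled time lies more than `threshold` seconds in the past."""
    now = time.time() if now is None else now
    # Both values are Unix timestamps in seconds, like the metric itself.
    return now - next_schedule_time > threshold
```

The difference is negative while the next run is still in the future, so the condition only becomes true once the scheduled time is more than 30 minutes behind the current time. The 'for: 30m' clause then requires the condition to hold for another 30 minutes before the alert fires.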
In this post, I described how to automate Vault backup creation using Kubernetes CronJob and a simple python script that I built.
Thank you for reading, I hope you enjoyed it, see you in the next post.
Original story on my blog: https://igorzhivilo.com/vault/scheduled-backup-vault-cronjob/
If you want to be notified when the next post of this tutorial is published, please follow me on Twitter @warolv.
Instagram: @warolv