Ever felt frustrated with the complexity of running PostgreSQL on Kubernetes? Manually setting up data volumes, services, and deployments, managing backups, managing the secrets that hold credentials, not to mention high availability... There is a lot of complexity in running stateful applications on Kubernetes, especially if you're aiming to do it in a declarative, GitOps way.
There is a way to simplify this process: deploy PostgreSQL declaratively and get a production-ready setup with just a few lines of YAML.
That's where the Crunchy Data Postgres Operator comes in.
In this article, let's deploy PostgreSQL declaratively, creating a production-ready setup on Kubernetes.
If you'd rather watch than read:
Installing Crunchy Data Postgres Operator
To follow along, you need a Kubernetes cluster running. On my local machine I will be using Rancher Desktop (see my video on that topic). Or you can run one in the cloud or even in your own homelab, and wouldn’t you know, I have a video on how to easily set up a Kubernetes homelab as well.
Let’s get back to the task at hand. Installing PGO (short for Postgres Operator) is not that complicated; however, it presented a few challenges when I first started out, especially since I was a Kubernetes beginner back then. So I am aiming to simplify this process for you.
To help us out, Crunchy has prepared a GitHub repo with code examples. You can find it linked in the quick start guide in the documentation.
Let’s clone that repo locally and see what’s inside.
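If you want to follow along in a terminal, cloning it looks something like this (the repo is the one linked from the quick start guide; double-check the URL in the docs):

# clone the Crunchy examples repo and look around
git clone https://github.com/CrunchyData/postgres-operator-examples.git
cd postgres-operator-examples
ls kustomize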
There are a couple of ways to install PGO, as with most things in Kubernetes: through Helm or through Kustomize.
I will use Kustomize; it is the closest thing to native Kubernetes manifests, and I believe it's the easiest way to figure out exactly what is being installed.
You'll want to check out the folder conveniently named install. And yet again, there are a couple of ways of installing it. One is the single-namespace install, where PGO will manage Postgres instances in, well, a single namespace, and the other is a cluster-wide installation where PGO can manage Postgres instances in any namespace in the cluster. The main difference is in the permissions you give PGO within your cluster.
This type of cluster-wide operator is important, and we'll come back to it later when we talk about the interesting topic of Platform Development.
The cluster-wide installation manifests can be found in the default folder.
Here we can check the main manifest file. It references other manifests, but we can understand what it's trying to do: install a CRD (custom resource definition), which is how you extend the Kubernetes API; apply some role-based access control (RBAC); and deploy something called the "manager", which is just a deployment, a pod running an image with the code that makes all of this possible.
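If you want to see exactly what those manifests add up to before applying anything, you can render them with Kustomize and summarize the resource kinds (run from the root of the cloned repo):

# render the manifests without applying them and count the resource kinds
kubectl kustomize ./kustomize/install/default | grep "^kind:" | sort | uniq -c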
OK, let's try to apply these manifests:
kubectl apply -k ./kustomize/install/default
And here we get the first error: the annotation is too long (the CRD is big enough that the last-applied-configuration annotation exceeds its size limit). The fix is to use server-side apply. I added the --force-conflicts flag as well, but that's optional. Here is the final command:
kubectl apply -k ./kustomize/install/default --server-side --force-conflicts
And that’s it, we now have the Crunchy Data Postgres Operator installed!
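To verify, you can check that the operator pod is running and that the new CRDs are registered. The default Kustomize install typically puts everything into a postgres-operator namespace, so adjust if yours differs:

# the operator deployment should be up and running
kubectl get pods -n postgres-operator
# the new custom resource definitions should be registered
kubectl get crds | grep postgres-operator.crunchydata.com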
Deploying PostgreSQL using PGO
In the examples repository you can find manifests that showcase the options you have when deploying PostgreSQL. Let's take a look at /kustomize/postgres/postgres.yaml.
With PGO installed we can now create a custom resource of kind PostgresCluster.
In the manifest we can define the Postgres image and version to use, in this case version 16.
We can have multiple instance sets in the same manifest, and each one can run multiple replicas.
For example, here there is a single instance named instance1.
We need to allocate some storage; 1Gi should be enough for testing.
Another field required for a minimal Postgres cluster is the backups configuration. It’s nice that they "force" you to have at least some backup setup.
In a production scenario the backups should be stored at least on a separate drive or even better, offsite or in the cloud.
There are a bunch of options you can configure here, and after we apply the manifest, PGO will take care of backups automatically.
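Putting those pieces together, a minimal PostgresCluster manifest looks roughly like this; the field names follow the example file, but treat it as a sketch and double-check against the copy in the repo:

apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
  name: hippo
spec:
  image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres:ubi8-16.3-1
  postgresVersion: 16
  instances:
    - name: instance1
      dataVolumeClaimSpec:
        accessModes:
          - "ReadWriteOnce"
        resources:
          requests:
            storage: 1Gi
  backups:
    pgbackrest:
      repos:
        - name: repo1
          volume:
            volumeClaimSpec:
              accessModes:
                - "ReadWriteOnce"
              resources:
                requests:
                  storage: 1Gi

Applying it is just kubectl apply -k ./kustomize/postgres (or kubectl apply -f on the single file).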
After applying the manifest, a few pods start up: one for the Postgres instance itself, plus a backup job that ran immediately and has now completed. PGO also created some Kubernetes services so you can connect to this Postgres instance.
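To see what was created (assuming the example cluster name hippo and the postgres-operator namespace):

# pods belonging to the cluster, plus the services PGO created
kubectl get pods -n postgres-operator -l postgres-operator.crunchydata.com/cluster=hippo
kubectl get svc -n postgres-operator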
Speaking of connecting to Postgres, one of the best things about this operator is that it manages credentials in Kubernetes secrets automatically!
You should see a secret for each user configured for the database. Let’s check one of these secrets to see what’s in there: the host, port, database name, user, password, and even ready-made connection URIs, basically everything you need to successfully connect to the database!
No need to manually configure secrets, this is all done for us!
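The naming convention for these secrets is <clusterName>-pguser-<userName>, so for the example hippo cluster you can peek inside like this (the exact keys may vary slightly between versions):

# list the connection details stored for the default user
kubectl get secret hippo-pguser-hippo -n postgres-operator -o jsonpath='{.data}'
# or decode just the ready-made connection URI
kubectl get secret hippo-pguser-hippo -n postgres-operator -o jsonpath='{.data.uri}' | base64 -d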
Actually let me show you how easy it is to just create a user and a new database.
There is a users key in the manifest where we can give the new user a name and specify the databases it has access to. The nice thing is that if a database doesn’t exist, PGO will create it. Let’s call ours testdb. And we can give this new user SUPERUSER access:
users:
  - name: test
    databases:
      - testdb
    options: SUPERUSER
Now it’s as simple as applying the updated manifest, and there we go: a new secret has been created for this new user with all the credentials.
More importantly, behind the scenes all the required operations were done for us: creating the database, creating the user, and so on. And the beauty of it all is that it’s all declarative!
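If you want to actually connect as that new user, one approach (borrowed from the PGO docs, assuming the hippo cluster and the postgres-operator namespace) is to port-forward the primary pod and read the password from the generated secret:

# find the primary pod and forward port 5432 locally (run in a separate terminal)
PRIMARY_POD=$(kubectl get pod -n postgres-operator -o name \
  -l postgres-operator.crunchydata.com/cluster=hippo,postgres-operator.crunchydata.com/role=master)
kubectl -n postgres-operator port-forward "${PRIMARY_POD}" 5432:5432

# connect with psql using the password from the new user's secret
PGPASSWORD=$(kubectl get secret hippo-pguser-test -n postgres-operator \
  -o jsonpath='{.data.password}' | base64 -d) psql -h localhost -U test -d testdb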
Deploying PGAdmin
I’d really love to test all of this in a visual way, if only there was a UI to see what’s going on in a Postgres database. Oh wait, PGO can also manage a pgAdmin installation for us! Let’s check out the manifest for that: ./kustomize/pgadmin/pgadmin.yaml
As you can see, the other custom resource that this operator installed is of kind PGAdmin.
Let’s just apply this example manifest. We can see a pgAdmin pod was created; let’s port-forward to it and open it in a browser.
Credentials for the pgAdmin web interface are also stored in a Kubernetes secret automatically.
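To reach the web UI, you can port-forward the pgAdmin pod; in PGO's deployment it typically listens on port 5050 (check the pod's container port if yours differs). The pod name below is a placeholder, so substitute the one from your cluster:

# find the pgAdmin pod and forward its web port locally
kubectl get pods -n postgres-operator | grep pgadmin
kubectl -n postgres-operator port-forward pod/<pgadmin-pod-name> 5050:5050
# the web login credentials live in a generated secret as well
kubectl get secrets -n postgres-operator | grep pgadmin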
High Availability
Scaling a PostgreSQL instance with PGO is as easy as flipping an integer:
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
  name: hippo-ha
spec:
  image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres:ubi8-16.3-1
  postgresVersion: 16
  instances:
    - name: pgha1
      replicas: 2
Just change the number of replicas to whatever you need, and PGO will take care of it. See ./kustomize/high-availability/ha-postgres.yaml for a more complete example that also includes pgBouncer.
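For reference, pgBouncer is enabled through the proxy section of the same PostgresCluster spec; a minimal sketch (the exact settings used in the example are in that file):

  proxy:
    pgBouncer:
      replicas: 2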
What is a Kubernetes Operator?
Kubernetes is great at managing stateless applications; it has everything you need out of the box. But stateful applications like PostgreSQL, which store persistent data and require special care and procedures to run, are a bit more tricky. This is where Kubernetes Operators come in. They are software extensions that enhance Kubernetes' capabilities, specifically designed to automate the management of complex, stateful applications like PostgreSQL databases.
Like we saw earlier, operators extend the Kubernetes API with new types of resources; in this case we got the PostgresCluster and the PGAdmin resource types.
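You can see these new resource types registered in the cluster's API yourself:

# list all resources added by the operator's API group
kubectl api-resources --api-group=postgres-operator.crunchydata.com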
Operators usually also run a container in the background and can handle certain administrative tasks for you; for example, they can scale Postgres instances, automate backups and restores, handle disaster recovery, and so on.
And most importantly they can do so in a declarative way!
Platform Development
This leads me to the other topic I wanted to cover in the video and that is Platform Development.
This is a concept where you have a platform team that manages a Kubernetes cluster that has all of these operators installed.
This platform then enables teams to self-provision and manage resources without complex scripting or manual actions; it’s all done through declarative configuration!
This is especially important for a few use cases, such as:
- Spinning up new environments like staging or test environments
- Accelerating development cycles and productivity
- Faster iteration on new ideas
- Implementing GitOps principles ensuring consistency, repeatability and auditability.
So there you have it! The Crunchy Data Postgres Operator takes the complexity out of running PostgreSQL on Kubernetes. Head over to their GitHub page and give it a try. I'd love to hear about your experiences in the comments below!
Don't forget to subscribe to my YouTube channel for more content like this!