TL;DR: Use k3d, multi-tenant Azure services, local docker containers and mirrord for an amazing local development experience.
Introduction
Being part of a small team (< 10) of full-stack developers with end-to-end responsibility (from requirements to running systems), I have been focusing in the past couple of years primarily on having stable test and production environments, utilizing Kubernetes (AKS) and a set of microservices, whereby individual microservices (pods) can be deployed on a daily basis in a controlled way.
Our tech stack consists of .NET/F#, Kubernetes (AKS on Azure), MongoDB (Atlas on Azure) as the main OLTP database, Azure Data Explorer as a DWH, and many additional Azure services like Key Vault, Event Hubs, Queues, Blob Storage, Azure SignalR for websockets, etc.
But what about the development environment? Well, we have shared Azure services for it - e.g. Key Vault, Event Hubs, Queues and Blob Storage, as well as the DWH - but we run MongoDB locally as a docker container, and we run the applications themselves locally, usually started via JetBrains Rider's Run/Debug configuration. This was working relatively fine, as a developer usually needs to open one microservice, implement some additional API, write automated tests, and then manually test it via the UI, invoking only those pages which call the microservice in question.
The above under-invested setup had its issues, and we needed something better ...
Disclaimer: Even though this article demonstrates a very specific implementation of a local dev environment based on a perhaps unique combination of technologies which probably nobody else uses, the underlying principles/approaches may also be portable to other tech stacks/setups.
Problem Statement
The problem statement is straightforward: how to set up a development environment for every developer with "complete" isolation from the other team members, and how to do so in a cost-efficient way.
First, as we developed our web UIs and native mobile apps more and more, the need arose ever more frequently to perform manual local testing of end-to-end business processes, which requires firing up a number of microservices, publishing and subscribing to multiple events, and even querying eventually-consistent read models.
Second, we do have some integrated authentication and other Traefik middleware functionality in Kubernetes, which cannot be tested locally unless run in a Kubernetes cluster.
And last but not least, some if not most of the Azure services we use do not have a good developer experience, i.e. there are no easy-to-run docker images or similar.
Solution
The solution consists of two approaches - for every service, either:
- install an emulator on the developer machine as a docker container, or
- run the service in the cloud (Azure in our case), but in a special/custom "multi-tenant mode", meaning that many developers can use the Azure services but in complete isolation from each other.
The sections below contain detailed explanations of the technologies/components and approaches involved.
k3d
k3d (a wrapper around k3s) is an excellent and very lightweight Kubernetes distribution running as a docker container. We are talking about a setup time of 20-30 seconds, and a couple of minutes to compile and deploy all our microservices to it. Cluster start/stop is supported, but full cluster deletion and re-creation is almost as easy ;)
```bash
k3d registry create $registryName -p 5050
k3d cluster create $clusterName --registry-use $registryName:5050 --k3s-arg "--disable=traefik@server:*" --subnet '172.18.0.0/16'
```
The local docker registry is useful for pushing all locally compiled docker images to it, to be pulled by Kubernetes upon deployment/pod creation.
k3d comes with Traefik out-of-the-box, but as you can see above it is disabled in this case, as we need a more customized version of it.
Traefik and all other pods we need get deployed by simply running `kubectl apply -f` on a bunch of YAMLs (we are not using Helm).
There is one nasty problem with k3d whereby, after the developer laptop goes to sleep and wakes up again, the DNS mapping `host.k3d.internal` does not work anymore. There is a workaround though, namely to specify a subnet (e.g. `172.18.0.0/16`) upon cluster creation and use the gateway IP `172.18.0.1` instead.
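Concretely, any configuration pointing from inside the cluster back at the host can then use the stable gateway IP. A minimal, hypothetical F# sketch (assuming MongoDB is published on its default port on the host):

```fsharp
// The gateway of the subnet passed to `k3d cluster create` stays stable
// across sleep/wake, unlike the host.k3d.internal DNS name.
let hostFromCluster = "172.18.0.1"
let mongoUrlFromCluster = $"mongodb://{hostFromCluster}:27017" // assumed local MongoDB port
```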
Stopping the cluster is done with:
```bash
k3d cluster stop cluster-name
```
starting it again with:
```bash
k3d cluster start cluster-name
```
and deleting it (together with the registry) with:
```bash
k3d cluster delete cluster-name
k3d registry delete registry-name.localhost
```
Simple and fast!
Multi-tenant Azure services
As mentioned above, ideally all cloud services used should have emulators which can be run locally in docker containers. Unfortunately, Microsoft does not seem to care consistently about developer UX and does not provide docker-based emulators for all of its services, especially for such central ones as Azure Event Hubs - in our case the messaging backbone.
The other problem with emulators in general is that there is no 100% guarantee that they would behave the same as the "real" cloud service.
And last but not least, some cloud services (in particular Azure ones) are integrated with each other - e.g. Azure Data Explorer ingests data from Azure Event Hubs - an integration which emulators would generally lack.
Per-Developer Azure Event Hubs Partitions
The general idea for making Azure Event Hubs "multi-tenant" is simple: write a small wrapper around the Azure SDK for Event Hubs so that publishing to event hubs (~topics) and subscribing to them via consumer groups is limited to a specific partition. Every Azure Event Hub in the standard SKU can have up to 32 partitions, which means there can be up to 32 isolated clients (1 partition each) - be it local developer environments, shared environments or CI/CD pipelines.
Every environment has an environment variable, e.g. `LOCAL_DEV_EVENTHUBS_ASSIGNEDPARTITIONIDS=0,1`, which specifies which partitions should be used. If the env var is missing, then all available partitions are used, which is the case for the shared test and production environments (using their own dedicated sets of Azure Event Hubs).
How "easy" is it to create a wrapper over the standard SDK? Well, that was relatively easy, based on a nice sample from Microsoft. Unfortunately it required migration away from the WebJobs SDK which was being used so far, and which did not give enough control to the underlying implementation.6
Per-Developer Azure Storage Accounts (incl. Blob Containers and Storage Queues)
When it comes to Azure Storage Accounts, that was easy: every developer got her/his own storage account (1 per developer), where the name carries a suffix, e.g. company1stacc4xy (xy being the developer's initials). All blob containers and storage queues created inside are used exclusively by the developer in question. Storage accounts are very cheap.
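On the application side, the lookup can be as trivial as the following sketch (the env var name and helper are hypothetical):

```fsharp
open System

// Shared environments use the base storage account name; developer machines
// append "4" + the developer's initials, e.g. company1stacc4xy.
let storageAccountName (baseName: string) =
    match Environment.GetEnvironmentVariable "LOCAL_DEV_INITIALS" with
    | null | "" -> baseName
    | initials -> sprintf "%s4%s" baseName initials
```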
Note: There is an open-source emulator for Azure Storage called Azurite, which can be run as a docker container, and even though it might support all our requirements, it was simply easier (and very cheap if not free) to provision storage accounts directly in the cloud.
Per-Developer Azure Data Explorer (Logical) Databases
Azure Data Explorer (ADX) is a column-store database, which means it stores the data in columns instead of rows, making it extremely fast for queries on large datasets. Of course, there are some limitations when it comes to ingestion of data (to be batched) and updates/deletion of data (not easily possible). It powers all Azure logging/monitoring technologies (Azure Monitor, Azure Application Insights) and can be relatively inexpensive for a database of this kind (dev/test SKU around 100 EUR/month, production SKUs starting from 500 EUR/month).
In the system in question the data is ingested into ADX via Event Hubs (no direct ETL from OLTP -> DWH!) in near real-time - by default in batches every 10 minutes, though much shorter intervals can be configured for some tables.
The challenge here was how to isolate developers from each other when it comes to ADX.
The solution consists of 2 parts:
- Every developer gets her/his own logical database (in the same physical database cluster), with a database name carrying a suffix, e.g. db4ab, where "ab" are the initials of the developer. A script can re-create the database in a matter of a few minutes.
- Routing of events to an alternative database is implemented by setting a "Database" property on every event upon publishing - a built-in feature of the ADX Event Hubs data connection (see the sketch below).
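A minimal sketch of the publishing-side routing - the "Database" ingestion property is the built-in mechanism, while the env var name is an assumption:

```fsharp
open System
open Azure.Messaging.EventHubs

// Attach the ADX "Database" ingestion property so the event lands in the
// developer's logical database (multi-database routing must be allowed on
// the ADX data connection).
let withDatabaseRouting (eventData: EventData) =
    match Environment.GetEnvironmentVariable "LOCAL_DEV_ADX_DATABASE" with
    | null | "" -> eventData // shared environments: the connection's default database
    | db ->
        eventData.Properties.["Database"] <- box db
        eventData
```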
Note: Microsoft offers an ADX emulator running as a docker container (https://learn.microsoft.com/en-us/azure/data-explorer/kusto-emulator-install), which is great in general; however, it does not support data ingestion from the Azure Event Hubs running in the cloud.
Per-Developer Azure SignalR Notification Hubs
The Azure SignalR service is a WebSocket implementation used for realtime server <> client communication - in our case for some realtime dashboards and UI flows.
Azure SignalR works with named hubs to which clients connect by first requesting the hub url and access token, and then establishing a connection. So the solution in this case is to create a per-developer hub based on an environment variable, which is set to a different value on every developer machine, but to the same value for the test/production environments. As the hub name/url/accessToken is fully controlled by the server app (k8s pod), there is no change required in the client application (an SPA in our case).
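A sketch of the hub-name derivation (the env var name and suffix scheme are assumptions):

```fsharp
open System

// Test/production environments share the base hub name; every developer
// machine sets a different suffix, yielding a private hub per developer.
let hubName (baseName: string) =
    match Environment.GetEnvironmentVariable "LOCAL_DEV_SIGNALR_HUB_SUFFIX" with
    | null | "" -> baseName
    | suffix -> sprintf "%s4%s" baseName suffix // e.g. "dashboards4xy"
```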
Docker Containers running locally
In case high-quality local emulators are available, these are of course used preferentially, as they provide out-of-the-box isolation from other developers. We are running local docker containers for MongoDB and an SFTP server, for example.
The install/run script can be as simple as a 1-liner:
```bash
docker run --name mongodb --restart unless-stopped -v mongodata:/data/db -d -p 27017:27017 mongo --oplogSize 50 --replSet rs0
```
or
```bash
docker run --name sftp --restart unless-stopped -p 22:22 -d atmoz/sftp foo:pass:::upload,upload/processed
```
Note the `--restart unless-stopped` policy, which makes sure that the docker containers are started automatically when the dev machine restarts.
What is important when running local docker containers are the scripts for setting them up, including all configurations to be applied - e.g., in the case of MongoDB, DDL scripts for setting up all the local databases used by the apps.
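For example, a minimal sketch of such a setup step using the official MongoDB .NET driver (the database/collection names are assumptions):

```fsharp
open MongoDB.Driver

// (Re)create the local collections the apps expect; real setup scripts
// would also create indexes and seed reference data.
let setupLocalMongo () =
    let client = MongoClient "mongodb://localhost:27017/?replicaSet=rs0"
    let db = client.GetDatabase "app-db"
    db.CreateCollection "orders" // throws if the collection already exists
```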
mirrord
mirrord is an amazing and magical piece of technology which allows you to debug a running container in a Kubernetes cluster - any cluster, including the local k3d cluster. More importantly, from a developer-UX point of view, the debugging runs:
- within the context of the container, which means with all the environment variables with which the container was deployed(!), as well as its inbound/outbound network connectivity
- within your own locally-running IDE - e.g. JetBrains Rider(!)
What happens in reality: mirrord installs an agent pod in the Kubernetes cluster which redirects (or duplicates) all calls to the target pod to an additional local process which it starts on the local dev machine.
Invoking mirrord can be done in 2 ways:
- Terminal:
  - find the running pod name, e.g. with `kubectl get pods`
  - start mirrord, targeting the pod and at the same time starting your local app:
```bash
mirrord exec --target pod/test-qryh-test-xyzh-68f79c7c7b-jdp4v \
  dotnet bin/Debug/net7.0/TestService.XyzHandling.dll
```
  - attach your IDE to the process
- Directly from the IDE (e.g. JetBrains Rider) - run the debug configuration + mirrord: once mirrord is enabled by clicking the round button to the left, you get a confirmation and can click the Debug button for the app in question, after which the app is started and any set breakpoints will be hit.
The best thing is: the environment variables and the network connections made from the application being debugged come from the Kubernetes cluster and go through the cluster to any other running pod, or even to outside applications ... pure magic ;)
I also need to mention the great support I got from the mirrord creator for an issue I had with the Rider plugin. Not only did Aviram Hassan jump on a Discord call with me the same evening I reported the issue (wow!!), but he also shared his view that with a bigger team size a shared development environment (in the cloud) becomes a necessity.
Some Non-Functional Considerations
- Scalability: The described setup is limited by the 32 available Azure Event Hubs partitions, which means up to 30-31 developers if every developer gets 1 partition, or 15 if everybody gets 2 partitions. There is however no problem creating additional Azure Event Hubs namespaces (~ Kafka clusters), each of which can support another 15 developers, and so on. All the other cloud services used - Azure Storage Accounts, Azure Data Explorer, Azure SignalR etc. - can scale out/up with the team size. I have never thought through what happens when the team is > 100 developers, but this is a non-issue in the context I am currently in ;)
- Network Connectivity/Offline usage: Offline usage is not supported, as there are cloud services involved. I am yet to experience a case where I have to develop/run/debug something without an Internet connection though - 4G/5G network coverage is pretty good everywhere (incl. in public transport) nowadays.
- Docker containers vs multi-tenant/per-developer cloud services: as mentioned above, this is simply a tradeoff which can go in either direction. Emulators are not the real thing or may not be available, while implementing multi-tenancy for cloud services requires substantial changes in the software layer responsible for infrastructure, which may not be that easy either.
- mirrord on Windows/WSL: Even though everything in the above setup should and does work on Linux, macOS and Windows, it is yet to be tested whether mirrord works well on Windows with WSL, especially the part where JetBrains Rider connects to the process to debug, or spawns it using the mirrord plugin.
Conclusion
A couple of approaches were combined to create a development environment that is pretty easy to set up, convenient to use and cost-efficient, allowing every developer to run/debug applications in full isolation:
- Local Kubernetes cluster with k3d
- Multi-tenant Azure cloud services based on some service sub-entity like partitions, logical databases or hubs
- Locally running emulators in docker containers
- mirrord for pod debugging
Even though it took some effort to set all of this up, and even though the setup sounds a bit too specific and perhaps even fragile, it does what it is required to do, namely allowing developers to run and debug applications in isolation from each other.
Hopefully someone can save some time improving her/his development environment based on the ideas in this article!