DEV Community

James Eastham

Posted on • Originally published at jameseastham.co.uk

On Choosing Your Complexity

Software is complex. I don't think that's a controversial statement. Building, testing, observing, scaling, failing, evolving. And that's before you try to understand what it is your users want the system to do.

At times, it can be a bit demoralizing, can't it? Constantly fighting against this ever-changing, ever-growing challenge of a system that is getting more and more complex. You think back to the days when you built a "hello world" application to learn a new concept and you wonder where all this complexity came from.

Simple can be harder than complex: You have to work hard to get your thinking clean to make it simple. But it's worth it in the end because once you get there, you can move mountains. - Steve Jobs

Reading Steve Jobs' biography (which is an excellent read by the way) one of my biggest takeaways is simplicity. One of the reasons Apple became so successful is the relentless focus on simplicity.

Take the well-told story of Jobs' second coming at Apple. On his return, the product lines had grown massively, to the point that Apple's own teams couldn't explain which Mac to pick. Internally, teams thought they were meeting user needs when actually they were confusing customers with so many choices.

When Jobs came back, he cut the entire product line down to just 4 products. Yep, just 4. 2 desktop computers and 2 portable devices. An obvious example of simplification, but it had the right effect.

When you're trying to change an entire industry, you are already dealing with lots of complexity. Changing human behaviors, introducing a new way of interacting with technology. This is essential complexity. Introducing 10x more product lines than you actually need, that is a form of accidental complexity.

What does this mean for you, dear reader? Many of you will be working on new systems. New systems that will require a change in human behavior. Many of you are likely operating in a business domain where the problem space itself is highly complex.

Every single software system that exists has a certain level of essential complexity.

Take Plant Based Pizza, the backend for a pizza restaurant. Whilst not the most complex of domains, there is still a decent amount of complexity to think about. Taking orders, managing stock, managing staff, taking payments, refunding payments, preparing orders, organizing delivery drivers and getting the order delivered to the correct address.

And that's just considering the happy path. What if a payment fails? Or an order is ready but there aren't any delivery drivers available.
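To make those unhappy paths concrete, here is a minimal sketch of an order lifecycle written as a table of allowed transitions. The state names and the `advance` helper are hypothetical, invented for illustration and not taken from any real Plant Based Pizza code.

```python
from enum import Enum


class OrderState(Enum):
    PLACED = "placed"
    PAID = "paid"
    PAYMENT_FAILED = "payment_failed"
    READY = "ready"
    AWAITING_DRIVER = "awaiting_driver"
    OUT_FOR_DELIVERY = "out_for_delivery"
    DELIVERED = "delivered"


# Hypothetical transition table. Every entry is a rule the business needs,
# which is why this kind of complexity cannot be designed away.
ALLOWED = {
    OrderState.PLACED: {OrderState.PAID, OrderState.PAYMENT_FAILED},
    OrderState.PAYMENT_FAILED: {OrderState.PAID},  # the customer retries
    OrderState.PAID: {OrderState.READY},
    OrderState.READY: {OrderState.OUT_FOR_DELIVERY, OrderState.AWAITING_DRIVER},
    OrderState.AWAITING_DRIVER: {OrderState.OUT_FOR_DELIVERY},
    OrderState.OUT_FOR_DELIVERY: {OrderState.DELIVERED},
    OrderState.DELIVERED: set(),
}


def advance(current: OrderState, target: OrderState) -> OrderState:
    """Return the new state, or raise if the business rules forbid the move."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target
```

Even in this toy version, two of the seven states exist only because something went wrong.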

Think about the systems you are working on, they will inevitably have their own essential complexity.

Unfortunately for us, this complexity can't be removed. Much like Apple trying to change an entire industry, you can't get around this complexity. Trying to "simplify" this complexity can seem like a challenge.

Accidental complexity though, that's something you do have some agency over.

Accidental Complexity

Accidental complexity is the complexity that comes from the choices you make. The complexity that you (or your company) have control over.

Every technology choice you make will change the amount of accidental complexity of your system.

When Plant Based Pizza first started, they focused on delivering value quickly. At that time, there was a team of 5 building the backend. They built a monolithic application with a Postgres database to support all the different business use cases.

A single running process and a simple, flexible and well understood database technology == simplicity!

Of course, once they built the application they needed to run it somewhere. One of the team had been reading about Kubernetes recently and got the rest of the team incredibly excited about building a shiny new Kubernetes cluster.

They created a cluster, deployed some virtual machine instances as the underlying compute and used a managed database service for providing the Postgres functionality. Great, we are live.

This isn't the part of the story where I shit on Kubernetes. It's merely an example of the complexity you've just accidentally introduced into your system. Think about all the things you need to consider when running a Kubernetes cluster:

  • Scaling nodes
  • Operating nodes
  • Securing nodes
  • Dealing with node failures
  • Upgrading the cluster
  • Securing the cluster
  • Dealing with the cluster failing
  • Scaling your application
  • Dealing with application failures

That's before you consider the various 3rd party tools you'll need to deploy inside the cluster and keep up to date.

Let's consider an alternative universe where the team made a different decision.

The team still packaged the monolithic backend application as a container image, and still used a Postgres database. They made that decision for simplicity (containers are well understood) and portability (containers can run anywhere, as can Postgres).

Instead of spinning up a Kubernetes cluster though, they used a managed container orchestrator. Something like Amazon ECS (with Fargate), Azure Container Apps or Google Cloud Run. They kept the Postgres database running in a managed cloud database service.

Let's look at the complexity you've just introduced:

  • Scaling your application
  • Dealing with application failures

A shorter list, right? Using a managed container orchestrator means you shift a whole bunch of that accidental complexity on to the cloud provider. Operating nodes? Operating clusters? Dealing with node failures? That's the cloud provider's responsibility now.

There's a trade-off here with cost of course, and I talked about that recently in this post on the factors of modern compute.

Continuing this tale, we fast forward 12 months. Plant Based Pizza has grown, and there are now 50 developers across 5 different teams. All teams are still contributing to that same monolithic code base, and are getting in each other's way.

This is a form of accidental complexity! The complexity that comes from the decision to grow teams. Complexity you now need to deal with.

Microservices are a reaction to organizational challenges. Microservices are a solution to a socio-technical problem. They aren't a solution to a technology problem.

So the Plant Based Pizza team decides to start to break their application down into microservices. As you might expect, microservices introduce their own complexity. And believe me, microservices introduce a lot of technology complexity.

At this point, there is no good reason to change the complexity of your underlying compute. You are already introducing enough additional complexity by choosing microservices. Introducing another layer of infrastructure complexity probably won't help you too much.

Speeding forward in time again, Plant Based Pizza has now been refactored to microservices. These microservices are communicating with asynchronous events to make them as loosely coupled as possible. And you're still running the applications (packaged as containers) on serverless compute. You still don't have a good reason to introduce the complexity of Kubernetes.
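As a rough illustration of what "communicating with asynchronous events" buys you (and costs you), here is a minimal sketch using an in-memory stand-in for a message broker. The `InMemoryEventBus` class, the `order.confirmed` event name and the kitchen service are all hypothetical. A real system would use a managed broker, but the shape of the trade-off is the same.

```python
from collections import defaultdict, deque
from typing import Callable

Event = dict
Handler = Callable[[Event], None]


class InMemoryEventBus:
    """Stand-in for a real broker: events wait in a queue until delivered."""

    def __init__(self) -> None:
        self._handlers = defaultdict(list)  # event type -> list of handlers
        self._pending = deque()             # (event type, event) pairs

    def subscribe(self, event_type: str, handler: Handler) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event_type: str, event: Event) -> None:
        # The publisher returns immediately and knows nothing about consumers.
        self._pending.append((event_type, event))

    def deliver_pending(self) -> None:
        # Simulates the broker delivering events some time later.
        while self._pending:
            event_type, event = self._pending.popleft()
            for handler in self._handlers[event_type]:
                handler(event)


bus = InMemoryEventBus()
kitchen_queue = []  # state owned by a hypothetical kitchen service

bus.subscribe("order.confirmed", lambda e: kitchen_queue.append(e["order_id"]))

# The hypothetical order service publishes the event and moves on.
bus.publish("order.confirmed", {"order_id": "ORD-1"})
print(kitchen_queue)  # [] because the kitchen has not seen the order yet

bus.deliver_pending()
print(kitchen_queue)  # ['ORD-1'] now the two services agree
```

The order service never references the kitchen service, which is the loose coupling. The window between `publish` and `deliver_pending` is the eventual consistency you now have to design for.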

What accidental complexity does the system have?

  • Scaling the applications
  • Dealing with application failures
  • Eventual consistency across different microservices
  • Service to service communication
  • Observability (it's hard in asynchronous systems)
  • Governance and evolving the system safely

Slightly more complexity, but still not crazy amounts right? And all this complexity is something you can have direct control over as a developer.

If you choose to run your application on Kubernetes you still have this complexity to deal with, as well as all the complexities that come from Kubernetes.

Again, this isn't a post written to bash Kubernetes. It has a place in the world. But you probably don't need it for your applications.

And if you are building with Kubernetes, you can still apply this lens of accidental complexity. Does every single developer inside your company need to understand Kubernetes? Do they need to understand Helm charts and YAML files? Or can you build your Kubernetes cluster in a way that means developers only care about shipping container images? This is, of course, platform engineering.

When I was at AWS I worked with a company (that I can't name) doing this. They were building an abstraction layer on top of Kubernetes so that developers could just ship container images and not worry about anything else. To add complexity, this abstraction layer also spanned multiple cloud providers.

Whilst it was an interesting project to work on, it wasn't a good one for the business. They had put an insane amount of money, developer time and operational responsibility into building a service that wasn't even nearly as feature complete as Amazon ECS or Azure Container Apps. They even had a side-by-side comparison chart, and the custom-built service wasn't anywhere near.

All of that, to deploy business applications... To put it bluntly, what a massive waste of time and effort.

You need Kubernetes if your ability to dynamically scale and manage infrastructure is a core differentiator for your business.

The takeaway from this rambling post is to think about the decisions you make and the complexity they introduce.

Focus on building stateless, portable, containerized applications. Focus on structuring your application code to separate infrastructure concerns from business logic (ports and adapters style). Focus on shipping those applications in as operationally light a way as possible.
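If "ports and adapters" is new to you, here is a minimal sketch of the idea. Everything in it (`Order`, `OrderRepository`, `InMemoryOrderRepository`, `OrderService`) is a hypothetical name chosen for illustration, and the in-memory adapter stands in for whatever database you actually use.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass(frozen=True)
class Order:
    order_id: str
    total_pence: int


class OrderRepository(Protocol):
    """Port: what the business logic needs from storage, and nothing more."""

    def save(self, order: Order) -> None: ...

    def get(self, order_id: str) -> Optional[Order]: ...


class InMemoryOrderRepository:
    """Adapter for tests and local runs. A database-backed adapter would
    expose the same two methods."""

    def __init__(self) -> None:
        self._orders = {}

    def save(self, order: Order) -> None:
        self._orders[order.order_id] = order

    def get(self, order_id: str) -> Optional[Order]:
        return self._orders.get(order_id)


class OrderService:
    """Business logic: depends only on the port, never on a database driver."""

    def __init__(self, repository: OrderRepository) -> None:
        self._repository = repository

    def place_order(self, order_id: str, total_pence: int) -> Order:
        if total_pence <= 0:
            raise ValueError("an order must have a positive total")
        order = Order(order_id, total_pence)
        self._repository.save(order)
        return order


service = OrderService(InMemoryOrderRepository())
print(service.place_order("ORD-1", 1250))
```

Because `OrderService` only knows about the port, moving the application between compute platforms or storage services means writing a new adapter, not touching the business rules.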

Reduce your operational/infrastructure complexity by adopting managed platforms and shifting that responsibility onto the cloud provider. Cost is of course a consideration here. The line item cost of running Fargate is more than a comparable EC2 instance. But that doesn't take into account the human cost required to run EC2.

You can see this line of thinking in action with the fact all the cloud providers now have some form of 'Kubernetes auto'. Where you can run Kubernetes, and then pay your cloud provider extra to take a chunk of the operational overhead.

If you focus on stateless, portable containers this compute conversation becomes largely irrelevant. Run your application in a place that makes sense right now, and you are ready to deal with future changes.

As a thought exercise, think about the accidental complexity you have in your system today. What complexity are you dealing with that isn't related to the business problem you are trying to solve? And importantly, what could you do to reduce it?
