DEV Community

Cover image for Cleaning up your codebase with a clean architecture
Barry O Sullivan
Barry O Sullivan

Posted on

Cleaning up your codebase with a clean architecture

Let's talk software architecture. Most of us know MVC, it's the foundation for pretty much every web framework. As your product grows though, problems with MVC will start to appear. Even a relatively simple product can end up with a bloated and messy codebase. MVC is where we start, but what do you do when you need to evolve past it?

Before we go further, let's examine why we have so much trouble explaining the answer.

Here's a common conversation (for developers anyway)

devA: "Our codebase is really messy, how do we clean it up?"
devB: "We need to refactor it, move code into objects, separate the concerns."
devA: "Ok, great. How do we do that?"
devB: "We'll use design patterns and follow the SOLID principles."

This is where we normally stop, we seem to think that's a good enough answer for someone to get started. It's like you asked a carpenter how to make a table, and he just points at his tools and says "use those". The answer is technically correct, but it doesn't tell the whole story and it definitely isn't useful to someone learning to write software (or make tables).* The tools are important, and you need to know what they are, but they're only a small part of the process.

Learning the patterns isn't enough

This is the part that's the most infuriating, learning the patterns isn't enough. Infact, you're probably in a worse place immediately after learning them, because suddenly you have power tools and no clue where to use then.

Learning the patterns opens up a whole new range of questions you have to answer before you can use them effectively.

  • Where should I use design patterns?
  • How do I decide which one to use?
  • Where do I draw my lines of abstraction?
  • When should I use interfaces?

Without some guiding principles on how to answer these questions, developers end up creating random boundaries between objects, using patterns wherever they can. This leads to inconsistent code that's worse than it was without all this "design".

Here is a joke example of this very real problem

No wonder people complain about "design" and "patterns", we don't do a great job explaining how to use them effectively.

Where to start

For me, the moment it all clicked was when I learned about clean architectures**.

A clean architecture is a guide to splitting up your code, giving direction on where certain concepts belong. There are many flavours but they each share the same core concepts.

  • Separate your code by layers
  • Put abstractions in the inner layers
  • Put implementation details in the outer layers
  • Dependencies can only point inward, never outward.

Here's a fantastic article on the concept that get's it across far better than I ever could.

https://8thlight.com/blog/uncle-bob/2012/08/13/the-clean-architecture.html

Pretty cool right? If you've never seen it before, it's eye opening.

It gives guidelines on how to separate your concerns and where to draw your boundaries. Once the boundaries are drawn, the patterns become obvious. It's great tool for separating the wheat from the chaff.

Let's look at an example to see what I mean.

An example

Here we have a simple usecase for setting a user's profile image. This class is used internally by both a controller and a console command, so the code is pretty condensed.

<?php
namespace App\Usecases;

use Ramsey\Uuid\Uuid;
use SplFileInfo;
use Aws\S3\S3Client;
use League\Flysystem\AwsS3v3\AwsS3Adapter;
use League\Flysystem\Filesystem;
use Domain\User;

class SetProfileImage 
{
    private $filesystem;

    public function __construct()
    {
        $client = S3Client::factory([
                'credentials' => [
                    'key' => getenv('AWS_ACCESS_KEY'),
                    'secret' => getenv('AWS_SECRET'),
                ],
                'region' => getenv('AWS_REGION'),
                'version' => 'latest',
        ]);

        $adapter = new AwsS3Adapter($client, 'bucket-o-images');

        $this->filesystem = new Filesystem($adapter);
    }

    public function handle(Uuid $user_id, SplFileInfo $image): Uuid
    {
        $image_id = Uuid::uuid4();

        $filepath = "profle_image/$image_id.".$file->getExtension()
        $image_contents = $image->fread($image->getSize());
        $this->filesystem->write($filepath, $image_contents);

        $user = User::find($user_id->toString());
        $user->setProfileImage($image_id);
        User::where('id', $user_id->toString())->update($user->toArray());

        return $image_id;
    }
}

What's wrong with the above? Well, despite it's condensed nature, it actually has five concepts intermingled, muddying the water.

  • Our Application
  • Config
  • AWS S3
  • Flysystem
  • Eloquent ORM

You have to understand each of those concepts in order to understand what the class is trying to do internally. That's a surprising numbers of concepts you need to keep in your head in order to understand such a simple class.

To make matters worse, they're not consistent. They each describe their solution in their own language, using details and concepts that do not mix well. For example, ORM language such as find, where and update don't exist in the Flysystem language. Throw application concepts into the mix and you can see why things can get confusing.

It's like trying to read a book where it switches between French, English and Russian at random points, sometimes from word to word. Sure, eventually you could read it, but you'd make a lot of mistakes along the way and you'd be a frustrated mess at the end. So let's try to clean things up.

Dividing the layers

First we need to know what we're removing from the application layer, so let's use language as our guide.

Since we want the language to be application focussed, we need to remove any words unique to the following concepts.

  • Config
  • Flysystem
  • AWS S3
  • Eloquent ORM

Instead, we'll replace them with integration points that use the language of the application, rather than the implementation.

This is where design patterns come in. By looking at what you're trying to do and not how you're trying to do it, you can spot common patterns emerging.

Looking at the code, image and user are core/domain concepts respectively, but some of the code that uses them is pure implementation. These are our integration points, so let's drill down into what we're really doing with "images" and "users".

In our example it's clear that we really want to do two things

  • store and retrieve users
  • store images

Everything else is an implementation detail. Storing and retrieving things is clearly the repository pattern. So let's create two new application concepts, UserRepository and ImageRepository, and we'll implement them as interfaces.

Application level

<?php

namespace App\Usecases;

use Ramsey\Uuid\Uuid;
use SplFileInfo;

class SetProfileImage 
{
    private $image_repo;
    private $user_repo;

    public function __construct(ImageRepo $image_repo, UserRepository $user_repo)
    {
        $this->image_repo = $image_repo;
        $this->user_repo = $user_repo;
    }

    public function handle(Uuid $user_id, SplFileInfo $image): Uuid
    {
        $image_id = Uuid::uuid4();

        $this->image_repo->store($image_id, $image);

        $user = $this->user_repo->get($user_id);

        $user->setProfileImage($image_id);

        $user_repository->store($user);

        return $image_id;
    }
}

namespace App\Services;

use Ramsey\Uuid\Uuid;
use SplFileInfo;

interface ImageRepository 
{
    public function store(Uuid $image_id, SplFileInfo $image);
}

namespace App\Services;

use Uuid;
use Domain\User;

interface UserRepostiory
{
    public function get(Uuid $user_id): User

    public function store($user);
}

That's the application level restructured. We've refined the language and concepts down to the bare minimum that the application needs, expressing our intent. We also created two new concepts that were hidden in the old implementation and we've boiled them down a simple pattern, the repository pattern.

As an aside, we have no problem adding design pattern language to the application layer, because it's a shared language between developers, it aids clarity rather than obscuring it.

Now let's look at the implementations.

Infrastructure

<?php
namespace Infrastructure\App\Services;

use App\Services\ImageRepository;
use Ramsey\Uuid\Uuid;
use Aws\S3\S3Client;
use League\Flysystem\AwsS3v3\AwsS3Adapter;
use League\Flysystem\Filesystem;

class S3ImageRepository implements ImageRepository
{
    private $filesystem;

    public function __construct()
    {
        $client = S3Client::factory([
                'credentials' => [
                    'key' => getenv('AWS_ACCESS_KEY'),
                    'secret' => getenv('AWS_SECRET'),
                ],
                'region' => getenv('AWS_REGION'),
                'version' => 'latest',
        ]);

        $adapter = new AwsS3Adapter($client, 'bucket-o-images');

        $this->filesystem = new Filesystem($adapter);
    }

    public function store(Uuid $image_id, SplFileInfo $image)
    {
        $filepath = "profle_image/$image_id.".$file->getExtension()
        $image_contents = $image->fread($image->getSize());
        $this->filesystem->write($filepath, $image_contents);
    }
}

namespace Infrastructure\App\Services;

use App\Services\UserRepostiory;
use Ramsey\Uuid\Uuid;
use Domain\User;

class EloquentUserRepostiory implements UserRepostiory
{
    public function get(Uuid $user_id):User 
    {
        return User::find($user_id->toString());
    }

    public function store($user)
    {
        User::where('id', $user_id->toString())->update($user->toArray());
    }
}

And that's that, we've separated our usecase's code across two layers.

Why is this better?

Easier to read

From a language perspective, this is much cleaner. It's focussed on the language of what our application wants to do, not how it intends to do it. This cohesive language lowers the barrier to comprehension. In short, there are fewer concepts to parse in order to understand it.

Testing

It's far easier to test. The original has so much going on that testing it is difficult. Our acceptance tests would have to connect to S3 and the DB to prove that the code worked. This would couple our tests to our implementations, making the code brittle and expensive to change.

In the new version, each layer can be tested in isolation, with unit tests for our usecases and integration tests for our implementations. This makes our acceptance tests much easier to verify, we only need to check the output, rather than the side effects.

We're using patterns effectively

In the above, we used the repository pattern in a way that made sense to the application, rather than using whatever pattern we liked at the time. This keeps things focussed and stops us going design pattern mad. After failing with design patterns for so long, it feels really nice to use them effectively.

Loose coupling

The above is actually a SOLID*** example of the dependency inversion principle in action. We've inverted the dependencies of our code, ie. the application is no longer dependant on the database, instead it is the database that depends on the application. Simply put, this means that a change in the database will no longer affects the business logic, so there's less risk in changing it.

Changing implementation

Another benefit, switching implementations is incredibly easy. Imagine we needed to switch storage mechanism, say from MySQL to MongoDB. Instead of gutting our usecase, we just create a new implementation of that interface with the MongoDB lib under the hood. Boom, we're done. No changes required at the application level.

This example highlights how useful language can be as a guiding hand when writing layered code. The newly introduced language of mongodb did not bleed into the language of the application layer, it stayed the same, no change was necessary. This is a sure sign that we've separated our layers correctly.

Why is this worse?

Let's be honest, it's not all sunshine and roses, so let's address the const ELEPHANT = true; in the class Room;
For a start the codebase is larger, there are now more files and there is more code to read. It also takes longer to write. You have to spend time extracting the code into layers, taking the time to think about the design of your code, so there is an element of sacrifice there. This could get in the way if it's throwaway software that you just want to get out the door and never see again****.

I don't think the above are reasons enough not to do it though. Software changes, especially if it's core to a product, so it's worth the extra time now to save a lot of time later.

Plus, you get better at it the more you do it. Eventually you'll notice the same patterns again and again, and you'll be able to create the abstractions at the beginning, intuitively separating what you want to do from the tools you're using to do it.

Can we go further

Looking at the above, we're using a Uuid lib pretty much everywhere. We could abstract it of the code, using an interface instead, so we'd have an ID interface that uses the Uuid lib internally.

The reason we haven't is simple, it's not worth the effort. The pay-off in clarity would be too small. We use Uuids at pretty much every layer of the codebase, and we've found the overhead of interpreting the extra language of the Uuid lib to be pretty small. So we've simply made it a part of our core***** layer, instead of considering it an implementation details. This means we can use it any other layer directly.

When to do it

The above raises an important point, this approach to separation of concerns is a technique to use when needed, sometime you don't need it. Above we made a judgement call on what to extract into it's own layer and what to make part of the core. There are other guidelines, but really it boils down to clarity.

If you find things getting confusing in a layer, consider moving code/concept into another layer, or creating the layer if needs be.

Writing clean code

Writing clean code is hard, and using patterns haphazardly just makes things worse. That's why I recommend applying the clean architecture mindset, it's a simple technique for cleaning up your code. It helped me, and I hope it can help you.

*I think this is because we put all our focus on the tools, and not on the process. Everyone is so busy arguing over which chisel is best, they start thinking that the tools are all there is to designing software.

**Also know as Hexagonal Architecture or Onion Architecture. I was introduced to the concept by Vaughan Vernon in his book, Implement Domain Driven Design and I have never looked back

***Worst pun ever

****The only software I've never revisited is the kind that failed to produce value. If it gets used, it will change.

*****Quick suggestion, "Core" should only include code that's standalone, it shouldn't include any code that connects to outside services or has side effects, like database code.

Top comments (17)

Collapse
 
malinakirn profile image
malina-kirn

In this example, the database does not depend on the application -- and it would be smelly code if it did. Rather, you've abstracted the database into an interface and now inject the appropriate concrete implementation into the application, instead of constructing the concrete implementation in the application. Dependency inversion is almost always more appropriately phrased as dependency injection -- for exactly the reasons in this post (you've injected, but not inverted, which is exactly the right thing to do here).

I note that this example is also a great demonstration of the single responsibility principle, which you explain in the text (what if you want to change to a different database implementation?).

Collapse
 
courier10pt profile image
Bob van Hoove

That's a great example, complex enough to make the point, small enough to keep me engaged :)

If you compare the 2 versions of the SetProfileImage classes the second one is almost self explanatory.

Collapse
 
kjshah profile image
kjshah

Thanks! I don't see where you indicate to use the s3 implementation of the interface, but I imagine that happens typically in laravels service provider? How would you do that if the implementation choice (s3 vs g cloud) had to come from a database value?

Collapse
 
barryosull profile image
Barry O Sullivan

That's exactly how I do it, use Laravel's service provider to bind the interface to the implementation.

As to your question, if you needed to choose the implementation based on a config detail (database value in your example), you can do that in the service provider, binding the appropriate implementation choice based on the value.

You could also make an implementation of the interface that chooses which other implementation to use based on the database value, sort of like the proxy pattern, proxying the calls through to the appropriate concrete implementation.

Collapse
 
kjshah profile image
kjshah

ok, thanks, that makes a lot of sense! I always had trouble thinking through that second part of having configuration come from database values.

Collapse
 
pht profile image
Tuan Pham

Thanks for interesting article!

A minor mistake in class SetProfileImage:

class SetProfileImage 
{
    private $image_store; // Is this $image_repo?

Also, I have another question, in class EloquentUserRepostiory you are using Domain\User and function get return Domain\User, is it an Eloquent model?
If it is an Eloquent model, how do you implement the repository with MongoDB, which isn't supported by Eloquent?
Maybe, we should use Entity instead of Eloquent Model?

Collapse
 
barryosull profile image
Barry O Sullivan

Hi Tuam,

Thanks for pointing that out, I'll correct it shortly.

You are correct, the above get call would return an eloquent model, which isn't ideal. If I were to take this code example further, I would create a domain class for User that is completely independent from Eloquent, and then translate from it to the Eloquent model inside the repository.

Collapse
 
j7mbo profile image
James Mallison

This is a nice example of SoC, yet also yields to an anemic domain model when there exists a service layer operating on entities - the changes to which are not encapsulated within the entity (read: aggregate root in the case of a user changing their profile image here). ADMs are okay for CRUD, just as a rich domain model also has it's trade-offs, but I feel it's important to make the distinction clear. Purporting to use DDD yet only focusing on placing code in different directories is not DDD, it's a directory structure.

Collapse
 
barryosull profile image
Barry O Sullivan

Hi James,

I agree, the above isn't a solid example of DDD, and it should not be viewed as such.
The above code is a contrived example to explain the concept of a clean architecture, so it doesn't feature any of the richness that a real domain model would contain.

Collapse
 
davidszabo97 profile image
Dávid Szabó

There is a typo in one of the snippet: $this->image_reoo = image_repo; -> $this->image_repo = $image_repo;

Thank you for the article!

Collapse
 
barryosull profile image
Barry O Sullivan

Good spot, thanks for pointing that out. I've updated the code snippet.

Collapse
 
joeizang profile image
Joe Izang • Edited

Devs used to say this was microsoft and java concern (ddd) etc. But this is how to do it. Enjoyable piece

Collapse
 
buzzdan profile image
Dan Mordechay

How would you apply clean architecture with functional programming style ?
Is it possible ?

Thanks,
Dan

Collapse
 
barryosull profile image
Barry O Sullivan

Hi Dan,
I think it's entirely possible. Functional programming doesn't have objects, but you can still compose functions in the same way you compose objects. This means you can build high level functions (domain/usecase layers), that internally use lower layer functions that interact with infrastructure, in the same way that you'd inject services in the article above.

Collapse
 
tripper54 profile image
Phil Dodd

Excellent article, very succinct! I will share this with my team.

Collapse
 
elcotu profile image
Daniel Coturel

Good article