Why?
Once your project has grown enough, you will ask: "How do I reuse the code across functionalities, modules, and applications?"
We discussed how to make the code reusable in the previous article. But what are the ways to distribute that code?
Code distribution
There are different approaches to code distribution. They all have pros and cons, which we should evaluate before adopting a particular strategy.
Local code distribution
Let's start from the place where we finished in the previous article.
TL;DR: we have our code organized as a set of small composable code units, and the functionalities are maintained as compositions of those reusable units.
When we spot that we need some already written code to develop a new functionality, it is a perfect time to extract those code units into a shared place in the project, usually a dedicated directory containing all shared code. The critical point is that we can reference that code anywhere in the application. It seems like a simple, straightforward solution for such a task, and I fully agree with that. The only thing we tend to miss is the caveats hidden behind that simplicity. To mitigate those pitfalls, first and foremost, we need to be aware of them:
PROS:
- Straightforward
- Fast to adopt
- Improved code quality
- No double work - reusability itself
- Features sharing
- Bug fixes are done once and then available for all code users
And the pros lead us to the other side of the coin - the cons of that approach.
CONS:
- Single point of failure
- Bug introduced once will get everywhere
- As a result, our code becomes more fragile
- That fragility increases maintenance costs
- Maintenance cost
- Additional abstractions
The fascinating thing here is that you never know whether a given behavior is a bug or a feature someone relies on. As a result, you can fix the bug and break another functionality that relies on that buggy behavior.
Fortunately, we can leverage many patterns, principles, and best practices to mitigate those limitations, e.g., SOLID, GRASP, GoF patterns, etc. Automated testing of the functionality is especially helpful in avoiding such regressions. The neat bonus of the approach described in the previous article is that you have already mitigated those issues by applying the principles and covering the code with unit tests. If initially those tests and principles guarded a single functionality, now they guard several of them, and as the project grows, they will protect even more. Looks good so far, except for one thing. One day we'll reveal the hidden con of that approach: we can't share the code between several applications. So we need to either resort to code duplication or search for other solutions.
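To make the regression risk described above more tangible, here is a minimal sketch of tests that pin the behavior each feature relies on. The shared unit `normalize_username` and the features mentioned in the test names are hypothetical and serve only as an illustration; the tests are written as plain functions that a runner such as pytest could collect.

```python
def normalize_username(raw: str) -> str:
    """Hypothetical shared unit: trims outer whitespace and lowercases."""
    return raw.strip().lower()


# Each test documents which (hypothetical) feature depends on which behavior.
def test_login_relies_on_case_insensitivity() -> None:
    assert normalize_username("Alice") == "alice"


def test_signup_relies_on_trimming() -> None:
    assert normalize_username("  bob ") == "bob"


def test_search_relies_on_inner_spaces_being_kept() -> None:
    # This may look like a bug, but a consumer depends on it,
    # so "fixing" it should fail loudly here instead of in production.
    assert normalize_username("ann  lee") == "ann  lee"
```

If someone later changes the function to collapse inner spaces, the last test fails and starts the conversation about whether that behavior is a bug or a feature before the change reaches the users.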
Repository code distribution
The application we develop has grown, and we must decide how to distribute our code among other applications. To achieve that, we can pack the code into a deliverable unit, usually called a package. We already have all we need to do that: our shared source code placed in a dedicated directory in our application. So we need to pack it and publish it somewhere. Once again, the initial simplicity hides the actual caveats of the approach. So let's take a glance at them and afterward discuss the details.
PROS:
- Code quality
- No double work - reusability itself
- Code is distributed across several apps
- Improved transparency
CONS:
- Single point of failure
- Additional abstractions
- Increased maintenance cost
- Additional complexity
- Harder to change a released public code
As expected, we inherited the pros and cons of the previous solution and added several new ones.
Let's take a look at the pros. The characteristics of the code distribution have slightly evolved: the code can now be used in several applications. But to get other developers to use the library, you need to work on the library's transparency.
While the same developers create and use the library, transparency is not so crucial. Ideally, the development team is familiar with the library codebase and aware of the recent and future changes. And if developers introduce changes to the shared code, they migrate all users of that code along the way.
Once the library is developed separately, such migration is no longer possible or is not the desired approach. We need to concentrate on the library development, not on migrating its users (unless code migration is part of your business as well). That is where the need for improved transparency comes from. Several practices help library developers communicate with library users:
- code encapsulation to avoid exposing the package internals
- versioning
- changelogs
- migration guides
- migration utilities
- roadmaps
- documentation
- etc.
Transparency greatly benefits both library developers and library users. Unfortunately, those benefits come with increased maintenance cost, additional abstractions, and, as a result, increased complexity of the overall solution. On top of that, it gets harder for library developers to change behavior that package users already rely on. As a result, breaking changes become more costly for both library developers and users.
Library transparency requires a defined process:
- CI/CD process
- versioning strategy
- release process
- documentation process
- testing process
- etc.
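As an illustration of what a defined versioning strategy could look like, here is a minimal sketch that assumes the team adopts Semantic Versioning (MAJOR.MINOR.PATCH). That choice is my assumption, not a requirement, and the `Change` categories and the `next_version` helper are hypothetical names.

```python
from enum import Enum


class Change(Enum):
    BREAKING = "breaking"  # users must change their code
    FEATURE = "feature"    # backward-compatible addition
    FIX = "fix"            # backward-compatible bug fix


def next_version(current: str, change: Change) -> str:
    """Return the version to release, following Semantic Versioning rules."""
    major, minor, patch = (int(part) for part in current.split("."))
    if change is Change.BREAKING:
        return f"{major + 1}.0.0"
    if change is Change.FEATURE:
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"
```

With such a rule written down, library users can read the cost of an upgrade from the version number alone, which is one of the cheapest forms of transparency.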
If you already have these processes defined, it could ease the adoption. Transparency is a vital thing for publicly shared libraries, and it helps build trust. Trust is one of the factors that determine the library's success.
Trust is still essential for private libraries used inside an organization, but it is less crucial there, mostly because you provide a unique product that cannot be easily substituted, so you start with some initial trust credit. Try not to overuse it, because internal users of your package could duplicate the desired functionality to avoid the changes your library could bring. So trust is vital in both cases. The difference is the initial trust credit you have.
In summary, I emphasize the increased complexity of the solution compared to the locally distributed code and encourage you to evaluate the benefits of such migration first.
Monorepo code distribution
Let's take a step back and discuss an intermediate solution that applies to some cases. If you only need to share the code between internal applications, there is another option: use a monorepo to organize the applications and the shared code. That will allow other teams to work on their apps with the shared code available and ease the later migration to repository-distributed code and its further development. In addition, it will decrease the upfront development and organizational cost of adoption compared with repository-distributed code.
With a monorepo, you won't need to publish your libraries. Instead, you will be able to compile applications from the locally shared code. As the teams grow, you will need to gradually adopt the approaches that increase the transparency of the code. So eventually, publishing the libraries to a dedicated repository will be a matter of setting up CI/CD and a repository.
Distributed code as a service
We have adopted an approach of distributing code between several applications. But there is one limitation that I intentionally left out of the cons section: technology limitations.
We created the library gradually, evolving it from the sources of our initial application, which resulted in technology lock-in for our reusable code. We are locked into our chosen programming language, paradigms, framework, and libraries. With frameworks and libraries, we can always refactor the code and extract the coupling into another layer of libs. But it is common to need to share functionality between applications written in different programming languages.
Here is the perfect spot for another code distribution technique that can mitigate technology limitations: distributed code as a service.
We won't eradicate the limitations, because we still need to depend on some lower-level abstractions (e.g., transport-level abstractions, communication protocols, etc.). It won't be an issue for us as long as all our clients support them.
Distributed code as a service is an approach in which the shared code is moved to the server and executed there. There are plenty of ways to implement it. For example, it could be a REST API with endpoints that serve dedicated functionalities (reusable composed flows), or lambdas for our small pure code units, or an event-driven service based on continuously open WebSocket connections, etc. You should carefully evaluate the pros and cons of each of those sub-approaches and select the one that serves your application.
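To give the HTTP variant some shape, here is a minimal sketch built only on the Python standard library (WSGI and `wsgiref`). The shared unit `apply_discount`, the `/discount` endpoint, its query parameters, and the port are all hypothetical; a real service would also need validation, authentication, and a production-grade server.

```python
import json
from urllib.parse import parse_qs
from wsgiref.simple_server import make_server


def apply_discount(price_cents: int, percent: int) -> int:
    """Hypothetical shared unit that used to be bundled into every client."""
    return price_cents - price_cents * percent // 100


def app(environ, start_response):
    """WSGI application exposing the shared unit at GET /discount."""
    if environ.get("PATH_INFO") != "/discount":
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"unknown endpoint"]
    query = parse_qs(environ.get("QUERY_STRING", ""))
    try:
        price = int(query["price"][0])
        percent = int(query["percent"][0])
    except (KeyError, ValueError):
        start_response("400 Bad Request", [("Content-Type", "text/plain")])
        return [b"price and percent must be integers"]
    body = json.dumps({"result": apply_discount(price, percent)}).encode()
    headers = [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def serve(port: int = 8000) -> None:
    """Run the service locally; blocks until interrupted."""
    with make_server("127.0.0.1", port, app) as httpd:
        httpd.serve_forever()
```

After calling `serve()`, any client that speaks HTTP and JSON could request `/discount?price=1000&percent=15`, regardless of the language it is written in. That is exactly the limitation this approach is meant to relax.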
CAUTION. Don't try to chase the trends. If your back-end is organized as a monolith, there is a reason for that. And pursuing microservice trends by adding a new service for your shiny shared code could be a considerable overhead for your organization. That overhead comes from either setting up microservice infrastructure to manage everything together or hosting the new service separately, and both spawn additional maintenance costs. On the other hand, if you see the benefits of the migration to microservices, it could be an excellent opportunity to initiate that process.
Let's review the strong and weak sides of that approach. All of them apply to the sub-approaches as well. Each sub-approach only extends that generic list with its own strong and weak sides and with its mitigations for the base approach's issues.
PROS:
- Code quality
- No double work - reusability itself
- Fast updates of the application
- Sometimes, you can update only server implementation without needing an application release to fix/change the functionality.
- Code is distributed across applications written with different technologies
- Smaller bundle size
- Improved transparency
Regarding pros, here we see the same picture as before. We have decreased the technology limitations to the desired level. And as a neat bonus, we get the possibility of changing the application's behavior by changing just the server implementation. That can be precious for applications with a long release review cycle that is out of your control (e.g., mobile and desktop applications reviewed by app stores).
CONS:
- Single point of failure
- Additional abstractions
- Increased maintenance cost
- Additional complexity
- Additional computational costs*
- Increased latency
- Additional context switching*
- Code is split between several places and potentially written using different technologies.
- Technology limitations
- Bound to the selected protocol/technologies
With each evolution of the approach, we are adding new cons faster than pros. However, that is expected, because the complexity of the application and of the overall solution is growing rapidly.
To the maintenance cost, we have now added computational costs. Previously, we delegated those costs to our clients as minimal hardware requirements, so they were spread among all users. We can mitigate that by mixing the current approach with the repository distribution approach, provided the shared code can run both on the server and on the client: some client applications execute it on their side, and others call the server.
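The mixed approach can be sketched as one interface with two interchangeable implementations. Everything here is hypothetical: the `DiscountCalculator` interface, the payload shape, and the injected `transport` callable that stands in for whatever client-server communication the project uses.

```python
from typing import Callable, Protocol


class DiscountCalculator(Protocol):
    def apply(self, price_cents: int, percent: int) -> int: ...


class LocalCalculator:
    """Runs the shared unit on the client, spending the client's CPU."""

    def apply(self, price_cents: int, percent: int) -> int:
        return price_cents - price_cents * percent // 100


class RemoteCalculator:
    """Delegates the computation to the server via an injected transport."""

    def __init__(self, transport: Callable[[dict], dict]) -> None:
        self._transport = transport

    def apply(self, price_cents: int, percent: int) -> int:
        response = self._transport({"price": price_cents, "percent": percent})
        return int(response["result"])


def choose_calculator(
    can_run_locally: bool, transport: Callable[[dict], dict]
) -> DiscountCalculator:
    """Pick the implementation per client; the calling code stays the same."""
    return LocalCalculator() if can_run_locally else RemoteCalculator(transport)
```

Clients that can consume the package pay the computational cost themselves, and the rest call the server. The price is that the same logic now has to stay consistent in two places.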
We will also have increased latency because of the added client-server communication. However, there are ways to mitigate it. For example, we could optimize the hot paths and bottlenecks by adopting code duplication there, with all its pros and cons.
For some client teams, we introduce the additional context switching required to contribute to the distributed server code.
Technology limitations are still with us, and that is ok until we hit their limits again. We will work on solving that problem when we need to because, as you have probably spotted, removing limitations is a costly operation, and you should go for it only if there are clear and prevailing benefits for the project.
CAUTION. A considerable number of projects have drowned in complexity intended to solve issues that were not even on the horizon. That is why I advocate for an evolutionary approach: grow complexity as you need it and apply the techniques that are beneficial for your application now. With project growth, their positive impact will grow too and pay off with great interest.
That is not an exhaustive list of possible approaches to code distribution. I would be happy if you wrote in the comments about other approaches worth highlighting or started a discussion about the described ones.
Let's sum it up and highlight the main points:
- There is a rich palette of available approaches to distributing shared code. They all have pros and cons. First and foremost, you need to know your application and use case to evaluate the approaches and select the one that suits your case well enough, satisfying all requirements and leaving space for further growth.
- Solving more advanced tasks usually results in increased complexity of the overall solution. Be aware of and prepared for that increase. Make sure that the benefits cover the expenses.
- Follow the evolutionary approach. Don't try to solve problems that are not even on the horizon. Acknowledge them and leave space for the solution.