
Ujjwal Raj


Building Resilient Applications: Insights into Scalability and Distributed Systems

Scaling Up vs. Scaling Out

When you build and deploy a web application, you probably do it on a cloud service like AWS or Azure, and you configure the system for scalability. Scalability refers to the capacity of a system to serve its users or clients without degrading performance or quality. I remember checking my JEE (Mains) results and often seeing the site crash at the exact time of the result announcement. The system hadn't been scaled to withstand concurrent requests from lakhs of students.

There are two technical terms here: throughput and response time. Throughput is the number of requests the system processes per second, while response time is the delay between a request being made and the response being received. The following figure shows the typical behavior.

[Figure: typical throughput and response time behavior as client load increases]

Until the JEE results website reached its capacity, it could sustain its throughput. When the results went live, client requests per second exceeded that capacity, which likely led to faults and failure. Perhaps their database couldn't handle so many read requests, or the network of their frontend deployment was saturated, causing significant latency. Either way, their throughput dropped, ultimately reaching zero. Boom! The site crashed.
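To make these two terms concrete, here is a minimal sketch that measures them against an endpoint. The URL and request count are hypothetical, and it assumes the `requests` library; a real load test would also fire requests concurrently rather than one after another.

```python
import time

import requests  # assumed third-party dependency

URL = "https://example.com/results"  # hypothetical results endpoint
N_REQUESTS = 50

latencies = []
start = time.perf_counter()
for _ in range(N_REQUESTS):
    t0 = time.perf_counter()
    requests.get(URL, timeout=5)                 # one client request
    latencies.append(time.perf_counter() - t0)   # response time of this request
elapsed = time.perf_counter() - start

print(f"avg response time: {sum(latencies) / len(latencies):.3f} s")
print(f"throughput: {N_REQUESTS / elapsed:.1f} requests/s")
```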

One way the NTA (the organization that conducts JEE) could have managed this is by scaling their system to handle 10+ lakh requests from students across the nation, e.g., adding more memory (RAM) to the machine or upgrading the database specifications. This is known as scaling up, or vertical scaling, but it has limits: you cannot keep upgrading a single machine until you have a supercomputer serving all your users.

This is where scaling out (or horizontal scaling) comes into the picture. The NTA could have deployed several instances for different regions of the country: North Indian students could fetch their JEE scores from a server deployed in Delhi, students from the West from a server in Mumbai, and so on. Thanks to cloud providers like AWS, Google Cloud, and Azure, it is now easy to scale horizontally.
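As a toy illustration of "different instances for different regions," here is a naive round-robin picker over hypothetical regional deployments. In production, a load balancer or geo/latency-based DNS routing does this for you.

```python
from itertools import cycle

# Hypothetical regional deployments of the same service
REGIONAL_SERVERS = [
    "https://results-delhi.example.com",
    "https://results-mumbai.example.com",
    "https://results-chennai.example.com",
]

server_pool = cycle(REGIONAL_SERVERS)

def pick_server() -> str:
    """Naive round-robin: each new request goes to the next instance."""
    return next(server_pool)

for _ in range(4):
    print(pick_server())
```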

In general, the scalability of a system goes hand in hand with its availability.

[Figure: availability, the fraction of time the system is up and able to serve requests]

Availability is usually expressed as a number of nines.

[Table: nines of availability and the corresponding downtime per year]

My current job involves software serving 40+ million users, with an availability of five nines.
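The arithmetic behind the nines is easy to check yourself: n nines of availability means the system may be down for a fraction 10^-n of the time. A quick sketch:

```python
def downtime_per_year(nines: int) -> float:
    """Allowed downtime, in minutes per year, for a given number of nines."""
    unavailability = 10 ** -nines  # e.g., 5 nines -> 0.00001
    return unavailability * 365.25 * 24 * 60

for n in range(2, 6):
    availability = 1 - 10 ** -n
    print(f"{n} nines ({availability:.5f}): {downtime_per_year(n):.1f} min/year")
```

Five nines works out to roughly five minutes of downtime per year, which is why it is such a demanding target.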

Communication and Coordination in a Distributed System

When you scale out a system, you likely adopt distributed system designs. You may have multiple instances serving different regions, and distributed databases for different chunks of users. Let's assume a successful, non-crashing NTA site. It must ensure proper coordination between its distributed components, and to coordinate, those components communicate via inter-process communication (the most common mechanism being HTTP). There may be a common database containing the scores of all students, which every deployment must access to serve its users. The user connects to the backend via the frontend, the backend queries the shared database (or a service in front of it), and the response is sent back to the user.
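To sketch what that inter-process communication might look like, here is a hypothetical backend instance fetching a student's score from a shared scores service over HTTP. The service URL, route, and `requests` dependency are all assumptions for illustration.

```python
import requests  # assumed third-party dependency

SCORES_SERVICE = "https://scores.internal.example.com"  # hypothetical shared service

def get_score(roll_number: str) -> dict:
    """One backend instance asking the common scores service for a result.

    This HTTP call is the inter-process communication described above.
    """
    resp = requests.get(f"{SCORES_SERVICE}/scores/{roll_number}", timeout=3)
    resp.raise_for_status()
    return resp.json()
```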

What Exactly is a Distributed System?

A distributed system is essentially a collection of loosely coupled services that communicate with one another, or with other systems, via IPC connections. To facilitate these connections and this coordination, the systems in the distributed web expose interfaces. These interfaces typically define the business logic that the system (or server) is designed to execute. An example below will make this concrete.

Adapters

The adapters in a server use or implement these interfaces. There are two major types. Inbound adapters are ports, such as APIs, that other systems invoke to trigger the business logic the current system is built to execute.

A service may also need to invoke business logic running on external servers; this is handled by outbound adapters (e.g., a client for a data store service). These concepts will become clearer in the upcoming example.

I participated in a hackathon in my second year of college, where the problem statement required us to build a food delivery application using a microservices architecture. We divided the system into three microservices: cart, auth, and shop. For the cart service, we created an inbound adapter with a REST API, /addItem, to handle requests for adding items to the cart. This API transformed the incoming data into the appropriate domain model and sent it to the service's core logic. For the auth service, we built an outbound adapter to communicate with an external authentication provider using OAuth 2.0; the adapter made HTTP requests to verify user credentials and tokens. The shop service used another outbound adapter to interact with external restaurant APIs, retrieving menu items and placing orders.

At the time, I didn't realize we were implementing inbound and outbound adapters, technical terms from distributed systems, but now I recognize that we were using these concepts while designing the architecture, even without knowing their names. We were awarded silver in the hackathon.
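We've lost the original code, but here is a minimal sketch of what that /addItem inbound adapter could have looked like in FastAPI. The model and class names are illustrative, not the originals.

```python
from dataclasses import dataclass

from fastapi import FastAPI
from pydantic import BaseModel

@dataclass
class CartItem:  # domain model
    item_id: str
    quantity: int

class CartService:  # stand-in for the core business logic
    def __init__(self):
        self._carts: dict[str, list[CartItem]] = {}

    def add_item(self, user_id: str, item: CartItem) -> None:
        self._carts.setdefault(user_id, []).append(item)

cart_service = CartService()
app = FastAPI()

class AddItemRequest(BaseModel):  # incoming wire format
    user_id: str
    item_id: str
    quantity: int = 1

@app.post("/addItem")
def add_item(req: AddItemRequest):
    # The adapter's only job: translate the HTTP payload into a domain call
    item = CartItem(item_id=req.item_id, quantity=req.quantity)
    cart_service.add_item(req.user_id, item)
    return {"status": "added"}
```

The point is the shape: the route handler only parses the wire format and hands a domain object to the core logic; it contains no business rules itself.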

External Plugins to the Service Business Logic

The following figure illustrates a clean architecture for building a system or service in a distributed environment.

[Figure: clean architecture of a service in a distributed environment, with the core business logic at the center and adapters pointing inward through interfaces]

The core business logic is exposed through interfaces to the HTTP adapters, while other layers, such as the database or cache, point to the business logic via interfaces. Let's revisit the architecture we built during the hackathon, although we weren't very meticulous while constructing it.

[Figure: our Cart service, with HTTP adapters using the CartI interface and a db plugin implementing dbI]

Let's focus only on the Cart service. We built a business-logic interface, CartI, and the Cart class (our core business logic) implements that interface. We exposed the cart functionality to other systems in our distributed web via the HTTP APIs, or adapters. Since we needed the cart data to stay synchronized with user data, we often needed to read from and write to a database. We defined a dbI interface, implemented by the db class; in clean-architecture terms, the db class is a plugin pointing toward the business logic. Following this design maintains isolation: the business logic should not be aware of technical details; only the technical details should be aware of the business logic. For example, the person who wrote the Cart class did not need to know that we were using MongoDB; they simply used dbI. Similarly, they didn't need to know that we were using FastAPI as our API framework; the person who wrote the HTTP adapters just used the business-logic interface CartI and didn't have to worry about the implementations.

The solid arrows in the diagram indicate usage, while the hollow arrows show implementation.

The dotted box represents the core business logic to which all plugins are pointing.
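Here is a minimal Python sketch of that dependency inversion. The names mirror our diagram, and the in-memory db class stands in for the MongoDB-backed implementation we actually used.

```python
from abc import ABC, abstractmethod

class DbI(ABC):
    """The dbI interface from our diagram: the plugin point for storage."""
    @abstractmethod
    def load_cart(self, user_id: str) -> list[str]: ...
    @abstractmethod
    def save_cart(self, user_id: str, items: list[str]) -> None: ...

class CartI(ABC):
    """The business-logic interface the HTTP adapters depend on."""
    @abstractmethod
    def add_item(self, user_id: str, item_id: str) -> None: ...

class Cart(CartI):
    """Core business logic: it knows only DbI, never MongoDB or FastAPI."""
    def __init__(self, db: DbI):
        self._db = db

    def add_item(self, user_id: str, item_id: str) -> None:
        items = self._db.load_cart(user_id)
        items.append(item_id)
        self._db.save_cart(user_id, items)

class InMemoryDb(DbI):
    """Stand-in for our MongoDB-backed db class; it implements the same DbI."""
    def __init__(self):
        self._store: dict[str, list[str]] = {}

    def load_cart(self, user_id: str) -> list[str]:
        return list(self._store.get(user_id, []))

    def save_cart(self, user_id: str, items: list[str]) -> None:
        self._store[user_id] = items

cart = Cart(InMemoryDb())  # the plugin is injected, not hard-coded
cart.add_item("u1", "margherita")
```

Swapping MongoDB for anything else means writing one new DbI implementation; the Cart class never changes.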

Conclusion

In the next piece, I will either start exploring something new—perhaps a deeper dive into distributed systems or a different area of cloud architecture—or I might pick up a particular concept from this discussion, like the nuances of adapters or scaling strategies, and break it down further. I aim to keep things interesting by maintaining a balance between introducing fresh ideas and providing detailed insights into core concepts. Stay tuned!
