
Shredded Mustard


CQRS

Many distributed systems follow the Database per Service principle: each service owns its database, is fully responsible for it, and is the only service allowed to perform CRUD (Create, Read, Update, Delete) operations on it. In most modern applications, a Data Access component acts as the contract between the application and the database. It models the database and encapsulates all querying, inserting, updating, and deleting.

When our application needs data from the database, it asks the Data Access component for it. Most read operations are straightforward, like "find entity by ID" or "find by Username." Write operations are often not as simple: a series of checks and validations may be needed first. For example, if we have a Tickets database and want to purchase a new ticket for User "X," we might first validate that the user meets the age requirement [findUserAgeByID()], then check ticket availability [countTicketsByEventId()], and only then update the ticket status to "Pending." After that, we debit the User's bank account by the ticket's price and finally change the ticket status to "Acquired." None of this chain is needed in a read flow where, for example, the user simply wants to view their purchased tickets; retrieving the data is as simple as [findTicketsByUserId()]. Most of the application's logic complexity therefore lies in write operations, while read operations remain relatively simple.
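The contrast between the two flows can be sketched in a few lines of Python. This is a minimal in-memory illustration, not a real Data Access layer: the `TicketsDataAccess` class, the `available_tickets` helper, the `MINIMUM_AGE` value, and the insufficient-funds check are assumptions added for the example; only the three lookup names mirror the ones mentioned above.

```python
from dataclasses import dataclass, field

MINIMUM_AGE = 18  # assumed age requirement, for illustration only


@dataclass
class TicketsDataAccess:
    """In-memory stand-in for the CRUD Data Access component (hypothetical)."""
    user_ages: dict   # user_id -> age
    balances: dict    # user_id -> account balance
    tickets: list = field(default_factory=list)  # dicts: id, event_id, price, status, user_id

    def find_user_age_by_id(self, user_id):
        return self.user_ages[user_id]

    def available_tickets(self, event_id):
        return [t for t in self.tickets
                if t["event_id"] == event_id and t["status"] == "Available"]

    def count_tickets_by_event_id(self, event_id):
        return len(self.available_tickets(event_id))

    def find_tickets_by_user_id(self, user_id):
        # The whole read flow: one lookup, no validation chain.
        return [t for t in self.tickets
                if t["user_id"] == user_id and t["status"] == "Acquired"]


def purchase_ticket(dao, user_id, event_id):
    """Write flow: validate, reserve, charge, confirm."""
    if dao.find_user_age_by_id(user_id) < MINIMUM_AGE:
        raise ValueError("user does not meet the age requirement")
    if dao.count_tickets_by_event_id(event_id) == 0:
        raise ValueError("no tickets available for this event")

    ticket = dao.available_tickets(event_id)[0]
    ticket["status"], ticket["user_id"] = "Pending", user_id
    if dao.balances[user_id] < ticket["price"]:
        # Roll back the reservation if the user cannot pay.
        ticket["status"], ticket["user_id"] = "Available", None
        raise ValueError("insufficient funds")
    dao.balances[user_id] -= ticket["price"]  # charge the user
    ticket["status"] = "Acquired"
    return ticket
```

The read side is a single method, while the write side needs several lookups, two status transitions, and a rollback path.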

Another consideration with read operations is the frequent need for joins, which are essential in relational databases. Under the Database per Service principle, however, a service cannot join against tables owned by another service, so it has to call the owning service to fetch the data instead. For example, if an event organizer wants to view each ticket along with the phone number of the user who purchased it, the Tickets Service first fetches all tickets for the event and then asks the Users Service for the phone numbers of the associated users. We could add a PHONE_NUMBER field to the TICKET table, but that design becomes cumbersome as the application evolves: if we later want the organizer to also see users' emails, the TICKET table has to grow again, and the service becomes harder to extend. If you've worked extensively with microservices, you may recognize this as a common problem. The simplest solution in practice is to accept the latency and fetch each user's details separately or in batches by passing a list of IDs.
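As a rough sketch of the batched approach, the Tickets Service can collect the distinct user IDs, make one call to the Users Service, and join the results in application code. The `UsersService` class and its `find_users_by_ids` method are hypothetical stand-ins for a remote call; the `calls` counter only exists to show how many round trips were made.

```python
class UsersService:
    """Stand-in for the remote Users Service (hypothetical API)."""

    def __init__(self, users):
        self._users = users  # user_id -> {"phone": ...}
        self.calls = 0       # counts simulated network round trips

    def find_users_by_ids(self, user_ids):
        self.calls += 1
        return {uid: self._users[uid] for uid in user_ids if uid in self._users}


def tickets_with_phone_numbers(tickets, users_service):
    """Join tickets to phone numbers in application code, using one batched call."""
    user_ids = sorted({t["user_id"] for t in tickets})  # de-duplicate before calling
    users = users_service.find_users_by_ids(user_ids)
    return [
        {"ticket_id": t["id"], "phone": users.get(t["user_id"], {}).get("phone")}
        for t in tickets
    ]
```

Fetching users one by one would cost a round trip per ticket; batching keeps it to one call per request, at the price of a larger payload.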

[Figure: Simple write flow]

[Figure: Simple read flow]

In short, write operations are significantly more complex than read operations, and each microservice typically contains far more write logic than read logic. The CQRS pattern can help address this complexity.

CQRS pattern

CQRS stands for Command Query Responsibility Segregation. As the name suggests, this pattern segregates commands from queries. Commands are insert, update, and delete operations, while queries are purely reads and joins. Here's how it works: each microservice handles write logic within its own database, as in traditional distributed systems. However, we introduce a separate service solely for querying. It has its own data store, which can differ completely from the Command databases and holds groups of related resources, often already joined into a single entity. The Query service is responsible only for fetching data. When a Command service updates its database, it publishes an event containing the new or updated model. The Query service consumes these events and updates its own database accordingly. When data needs to be fetched, the application asks the Query service, which can retrieve it efficiently and perform joins locally as needed.
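The publish-and-consume loop described above might look like the following sketch. It assumes a synchronous in-process `EventBus` in place of a real message broker, and the class names, the `TicketUpdated` event name, and the dictionary "databases" are all illustrative choices rather than part of any specific framework.

```python
from collections import defaultdict


class EventBus:
    """Minimal synchronous publish/subscribe bus; a real system would use a message broker."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def publish(self, event_type, payload):
        for handler in self._handlers[event_type]:
            handler(payload)


class TicketsCommandService:
    """Command side: owns the write database and publishes what changed."""

    def __init__(self, bus):
        self._bus = bus
        self._db = {}  # ticket_id -> ticket row (write model)

    def update_ticket(self, ticket_id, **fields):
        row = {**self._db.get(ticket_id, {"id": ticket_id}), **fields}
        self._db[ticket_id] = row
        self._bus.publish("TicketUpdated", dict(row))  # event carries the updated model


class QueryService:
    """Query side: keeps its own read-optimized copy and only serves reads."""

    def __init__(self, bus):
        self._view = {}  # ticket_id -> document (read model)
        bus.subscribe("TicketUpdated", self._on_ticket_updated)

    def _on_ticket_updated(self, ticket):
        self._view[ticket["id"]] = ticket

    def find_tickets_by_user_id(self, user_id):
        return [t for t in self._view.values() if t.get("user_id") == user_id]
```

The command service never reads from the query store, and the query service never writes to the command store; the event is the only link between them.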


This model offers numerous advantages:

  1. Optimized Datastore
    We can select different database technologies for Command and Query databases. For example, if we find that a NoSQL database is more suitable for fetching data, we’ll use a NoSQL database for the Query service. Conversely, if a well-structured, constraint-oriented database is more appropriate for the Command service, we can use a relational database like MySQL or Oracle.

  2. Better scalability
    Consider a Posts service in a social media platform like Instagram. The read-to-write load ratio is enormous. A typical user might create a new post once a week, month, or even year, but could view dozens, hundreds, or even thousands of posts daily. When one service handles both, scaling it for the read load also scales the rarely used write path. With CQRS, we can scale the Command and Query services independently according to their load, leading to higher performance and reduced costs.

  3. Single Responsibility
    Each service's responsibility is clearly defined and distinct from the others, so when a new requirement arises it is clear where the change belongs.

CQRS in Practice

In this example, we delegate all Command responsibilities to the Tickets Service and Users Service, as before, but we introduce a third service dedicated to querying data. We'll use a MySQL database for the Tickets and Users services and MongoDB for the Query service.

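Now, whenever a user purchases tickets, the resulting events reach the Query service, so the Query database has all the information needed to serve reads without calling the Tickets or Users services.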
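One way to picture the Query database is as a projection that merges events from both Command services into one document per ticket. The sketch below uses plain dictionaries in place of MongoDB collections, and the event shapes, field names, and `TicketViewProjection` class are assumptions made for illustration.

```python
class TicketViewProjection:
    """Builds one denormalized document per ticket from events of both services."""

    def __init__(self):
        self.users = {}         # user_id -> latest known contact details
        self.ticket_views = {}  # ticket_id -> document ready to be served

    def on_user_updated(self, user):
        self.users[user["id"]] = {"phone": user["phone"], "email": user["email"]}
        # Refresh documents that embed this user's details.
        for doc in self.ticket_views.values():
            if doc["user_id"] == user["id"]:
                doc["buyer"] = dict(self.users[user["id"]])

    def on_ticket_purchased(self, ticket):
        self.ticket_views[ticket["id"]] = {
            "ticket_id": ticket["id"],
            "event_id": ticket["event_id"],
            "user_id": ticket["user_id"],
            "buyer": dict(self.users.get(ticket["user_id"], {})),
        }

    def find_tickets_by_event_id(self, event_id):
        return [d for d in self.ticket_views.values() if d["event_id"] == event_id]
```

With this shape, the organizer's "tickets with phone numbers" view is a single lookup, and showing emails as well requires no change to the Tickets Service's own schema.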

Trade-offs with CQRS

There is a trade-off, of course. With a single CRUD database, a read immediately reflects the latest write. CQRS provides eventual consistency instead: the Query database lags behind the Command databases until the published events have been consumed, so data becomes consistent across services eventually but not necessarily right away. If your application cannot tolerate this delay, CQRS may not be a suitable choice.
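The delay is easy to reproduce in a toy model. In the sketch below, a `deque` stands in for the message broker and two dictionaries stand in for the Command and Query databases; all names are hypothetical, and a real system would consume events continuously rather than on an explicit call.

```python
from collections import deque

write_db = {}             # command side: source of truth
read_db = {}              # query side: updated only when events are consumed
pending_events = deque()  # stands in for a message broker


def set_ticket_status(ticket_id, status):
    write_db[ticket_id] = status
    pending_events.append((ticket_id, status))  # delivered later, not immediately


def consume_pending_events():
    while pending_events:
        ticket_id, status = pending_events.popleft()
        read_db[ticket_id] = status


set_ticket_status(1, "Acquired")
stale_read = read_db.get(1)   # the event has not been consumed yet
consume_pending_events()
fresh_read = read_db.get(1)   # the read model has caught up
```

Between the write and the call to `consume_pending_events()`, a reader sees the old state; that window is the inconsistency the application has to tolerate.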

We have learned how the CQRS pattern works and when to use it. Next, we will explore Event Sourcing, which often goes hand in hand with CQRS.
