Imagine data flowing continuously from one source to multiple destinations, passing through various stages of processing.
This is how we can visualize many modern applications. In building scalable systems, data management is central—along with preserving the history of that data. In traditional systems, when we update an entity's state in a database, the previous state is overwritten, and we lose any record of its past.
Consider a banking service that records a user’s account balance. When users review their balance, they expect to see the history of transactions that led to the current amount. Without logging each transaction, there is no clear way to explain how the present balance was reached.
Another example is a shipping service that allows users to track parcels. Once a package is shipped, tracking its journey is crucial for both user experience and problem-solving if the package gets lost. By tracing its path, we can identify exactly where things went wrong.
An event represents a fact, action, or change.
In event sourcing, each event marks a state change in the application. Storing these events enables us to capture the state of the application at any given time. By replaying these events, we can even restore the application to a previous state, almost like rewinding through a series of snapshots.
To understand event sourcing, consider breaking down an application into domains—distinct business areas or groupings of related processes. When a domain’s state changes, it emits an event to signal this update.
For example, an Account domain might publish an [Account Created] event when a new account is added for a user. This event is saved in an append-only log, and other domains can interpret it as they see fit.
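To make this concrete, here is a minimal sketch in TypeScript of an append-only event log. The `DomainEvent` shape and the `EventStore` class are illustrative assumptions, not tied to any particular framework or database.

```typescript
// A minimal, illustrative sketch of an append-only event log.
// The event shape and EventStore class are hypothetical.
interface DomainEvent {
  type: string;
  occurredAt: Date;
  payload: Record<string, unknown>;
}

class EventStore {
  private readonly log: DomainEvent[] = [];

  // Events are only ever appended, never updated or deleted.
  append(event: DomainEvent): void {
    this.log.push(event);
  }

  // Consumers read the log from the beginning (or from an offset).
  readAll(): readonly DomainEvent[] {
    return this.log;
  }
}

// The Account domain emits an event when a new account is created.
const store = new EventStore();
store.append({
  type: "AccountCreated",
  occurredAt: new Date(),
  payload: { accountId: "acc-1", userId: "user-1" },
});
```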
By storing each event, we can replay them at any time to rebuild our application state. This replay feature is invaluable. Say we introduce a new service interested in past events—it can simply replay the event log to establish its state. For instance, if only the account service exists initially and a new User Service is added later, this service can replay the event log to create an up-to-date list of accounts linked to each user.
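As a rough illustration of replay, the following sketch (continuing the `DomainEvent` and `store` from above) shows how a hypothetical new User Service could fold over the existing log to build a list of accounts per user.

```typescript
// Hypothetical projection: a newly added User Service replays the existing log
// to build an up-to-date view of which accounts belong to which user.
function buildAccountsByUser(events: readonly DomainEvent[]): Map<string, string[]> {
  const accountsByUser = new Map<string, string[]>();
  for (const event of events) {
    if (event.type === "AccountCreated") {
      const userId = event.payload.userId as string;
      const accountId = event.payload.accountId as string;
      const accounts = accountsByUser.get(userId) ?? [];
      accounts.push(accountId);
      accountsByUser.set(userId, accounts);
    }
  }
  return accountsByUser;
}

// Replaying the full log reconstructs the service's state from scratch.
const accountsByUser = buildAccountsByUser(store.readAll());
```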
This also simplifies migrating or replatforming applications. If we decide to rebuild the application with a new technology, we only need to replay the event log to restore the data without worrying about database migrations or state loss.
Picture it like this: a user logs into a banking app, registers an account, and opens a new debit account. Two primary domains are involved here—the User domain and the Account domain. As the user is created and stored, the User domain publishes a [UserCreatedEvent], while the Account domain follows with an [AccountCreatedEvent].
Many other interconnected domains may find these events useful. For example, if a fraud detection feature is added to the Security domain, it can process the entire event log to analyze user behavior and account activity, identifying potential fraudulent actions. In scenarios like this, retaining event history proves immensely valuable.
Event Sourcing vs CRUD
In typical CRUD systems, updating an entity overwrites previous states without keeping a historical record. Event sourcing addresses this by storing each event in an append-only log, preserving the complete change history.
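For contrast, here is a small, self-contained sketch of the banking balance example under event sourcing: rather than overwriting a single balance field as a CRUD update would, the current balance is derived from the recorded transactions. The event names are made up for illustration.

```typescript
// Instead of overwriting a `balance` column, the current balance is derived
// by folding over the recorded transaction events.
type BalanceEvent =
  | { type: "MoneyDeposited"; amount: number }
  | { type: "MoneyWithdrawn"; amount: number };

function currentBalance(events: BalanceEvent[]): number {
  return events.reduce((balance, event) => {
    return event.type === "MoneyDeposited"
      ? balance + event.amount
      : balance - event.amount;
  }, 0);
}

// The full transaction history is preserved, and the balance is always reproducible.
const balance = currentBalance([
  { type: "MoneyDeposited", amount: 100 },
  { type: "MoneyWithdrawn", amount: 30 },
]); // 70
```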
However, CRUD systems generally ensure strong consistency, meaning all parts of the system see updated data immediately. In contrast, event sourcing provides eventual consistency. This means that if your application requires strong consistency across multiple domains in real time, event sourcing may not be the best fit.
Event Sourcing and CQRS
Event sourcing and CQRS (Command Query Responsibility Segregation) work well together. By separating read and write operations, the write services, which append events to the log, act as the source of truth and can be scaled as needed. The query services only read the event log to update their own state, without directly changing the application's state.
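A rough sketch of that separation, reusing the `EventStore` and `DomainEvent` from the earlier example: the command side appends events after applying business rules, while the query side only consumes them to maintain its own read model. The service names are illustrative.

```typescript
// Command side: validates and appends events (the source of truth).
class AccountCommandService {
  constructor(private readonly store: EventStore) {}

  openAccount(accountId: string, userId: string): void {
    // Business rules and validation would run here before the event is recorded.
    this.store.append({
      type: "AccountCreated",
      occurredAt: new Date(),
      payload: { accountId, userId },
    });
  }
}

// Query side: only reads events to keep its own view up to date; never writes back.
class AccountQueryService {
  private accountCount = 0;

  apply(event: DomainEvent): void {
    if (event.type === "AccountCreated") {
      this.accountCount += 1;
    }
  }

  totalAccounts(): number {
    return this.accountCount;
  }
}
```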
Benefits
Auditability and traceability
Event sourcing allows seamless auditing across domains. Instead of implementing auditing logic in every service, an Audit service can be created to consume events directly from the event log. Additional information, like trace IDs or timestamps, can be added to event messages to enhance traceability with minimal effort.
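As a sketch, an audit consumer might look like the following; the `traceId` field is an assumed piece of envelope metadata added to each event, not something the event log provides by itself.

```typescript
// Illustrative audit consumer: it only reads the log and records an audit trail.
interface AuditableEvent extends DomainEvent {
  traceId: string; // assumed metadata attached to each event envelope
}

class AuditService {
  private readonly trail: string[] = [];

  consume(event: AuditableEvent): void {
    this.trail.push(
      `[${event.occurredAt.toISOString()}] trace=${event.traceId} type=${event.type}`
    );
  }

  report(): string[] {
    return this.trail;
  }
}
```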
Scalability
Combining event sourcing with advanced storage technologies and partitioning techniques makes it easier to scale and test the application, especially in high-demand environments.
Trade-offs
Eventual consistency
As mentioned, event sourcing only provides eventual consistency. This may not suit applications needing immediate data consistency, like stock trading platforms.
Storage concerns
Events can accumulate rapidly, leading to significant storage demands. Careful management of event storage and strategies like snapshotting may be required to handle data efficiently.
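One common mitigation is snapshotting. The sketch below assumes a snapshot stores the derived state plus the log position it covers, so a replay only needs the events recorded after it; the names and shapes are illustrative.

```typescript
// One possible snapshotting approach: periodically persist the derived state
// together with the log position it reflects, so replay can resume from there
// instead of starting at the very first event.
interface Snapshot<TState> {
  state: TState;
  lastEventIndex: number; // position in the log the snapshot covers
}

function restoreWithSnapshot<TState>(
  snapshot: Snapshot<TState>,
  events: readonly DomainEvent[],
  apply: (state: TState, event: DomainEvent) => TState
): TState {
  // Only the events recorded after the snapshot need to be replayed.
  return events
    .slice(snapshot.lastEventIndex + 1)
    .reduce(apply, snapshot.state);
}
```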
Complexity in querying
While simple events are easy to store, complex queries across large event logs can be challenging.