This article was also presented as a talk at the recently concluded APAC Community Summit 2022 in Bangkok, Thailand.
Last year, our team embraced microservices architecture. Instead of one big monolithic application, we developed a dozen small microservices. This setup allowed us to push out features faster and keep each microservice simple and maintainable.
As our application grew, our events started to have actions that spanned different services. It became a distributed nightmare just to mark an order as delivered. We had to make sure actions in the payments, referral, points, and notification services were executed successfully.
In this post, I'll introduce EventBridge and show how it simplified coordinating multiple microservices.
Monolith Apps get slower over time
Let's start by discussing how we would typically do this in a traditional monolithic application:
A monolith responds to an event by executing a series of actions sequentially. As an example, an "order_delivered" event may have the following actions:
- Mark the order as "delivered"
- Collect Payment
- Send the "your order has been delivered" email
- Award the rewards points
- Check whether the order had a referral code and, if it did, award additional points
The monolith does this synchronously and makes the user wait until all the steps are finished. As we add more actions to this event, the waiting time gets longer.
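To make the sequential flow concrete, here is a minimal sketch of how a monolith might handle the event. Every function name here is hypothetical and simply stands in for a module call inside the monolith; it is not our actual code.

```python
from typing import List, Optional


# Hypothetical step functions; each one stands in for a module inside the monolith.
def mark_delivered(order_id: str) -> str:
    return f"order {order_id} marked as delivered"


def collect_payment(order_id: str) -> str:
    return f"payment collected for order {order_id}"


def send_delivery_email(order_id: str) -> str:
    return f"delivery email sent for order {order_id}"


def award_points(order_id: str) -> str:
    return f"reward points awarded for order {order_id}"


def award_referral_points(order_id: str, referral_code: str) -> str:
    return f"referral points awarded for order {order_id} (code {referral_code})"


def handle_order_delivered(order_id: str, referral_code: Optional[str] = None) -> List[str]:
    """Run every action one after another; the caller waits for all of them."""
    results = [
        mark_delivered(order_id),
        collect_payment(order_id),
        send_delivery_email(order_id),
        award_points(order_id),
    ]
    # The referral step only runs when the order carries a referral code.
    if referral_code is not None:
        results.append(award_referral_points(order_id, referral_code))
    return results
```

Each new action adds another call to this function, so the user's waiting time grows with the list.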
Because monoliths tightly couple modules together, a simple change in one module becomes more complex than it should be. Unintended consequences occasionally pop up in a seemingly unrelated module. A full regression test is usually run to guard against this.
Microservices are smaller, simpler, and easier to make changes to
For these reasons, we decided to move to an event-driven serverless architecture. Instead of one big application housing all modules, we broke down our application into a dozen microservices, each in charge of a specific group of functionality.
For the order_delivered event, its actions are owned by 5 different microservices. In our team, we assign 2-3 microservices to each developer. As system owners, they create features and troubleshoot bugs.
Since the actions required by the "order_delivered" event are spread across five microservices, fulfilling this event requires the microservices to communicate with one another.
The simplest way to connect them is via API. But that would end up being even slower than the monolithic approach because of the latency caused by doing five API calls.
The shortcomings of using SQS to decouple the architecture
The better way would be for the order service to execute its component actions asynchronously. First, it receives the request. Then, it sends a task to the SQS queue of the payment service. On the receiving end, the payment service gets the task from its SQS queue and runs the action to collect the payment. The order service also sends a task to the SQS queues of the points, referral, and notification services, and each service processes its action by taking the task off its queue. Finally, the order service returns a response to the customer.
In this process, the response is sent as soon as all 4 actions have been queued in their respective SQS queues - not when all 4 actions have succeeded. That is asynchronous processing in action.
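The queueing step can be sketched roughly as follows. This is an illustration under assumptions, not our production code: the queue URLs and the message shape are made up, and `sqs_client` is any object exposing `send_message(QueueUrl=..., MessageBody=...)`, such as a boto3 SQS client. A small recording stub is included so the sketch runs without AWS access.

```python
import json
from typing import Any, Dict, List, Tuple

# Hypothetical queue URLs, one queue per consuming service.
QUEUE_URLS: Dict[str, str] = {
    "payment": "https://sqs.example.invalid/payment-queue",
    "points": "https://sqs.example.invalid/points-queue",
    "referral": "https://sqs.example.invalid/referral-queue",
    "notification": "https://sqs.example.invalid/notification-queue",
}


def enqueue_order_delivered(sqs_client: Any, order_id: str) -> List[str]:
    """Queue one task per service and return as soon as all are queued."""
    body = json.dumps({"event": "order_delivered", "order_id": order_id})
    queued: List[str] = []
    for service, queue_url in QUEUE_URLS.items():
        # The order service does not wait for the consumer to finish the work.
        sqs_client.send_message(QueueUrl=queue_url, MessageBody=body)
        queued.append(service)
    return queued


class RecordingSqsClient:
    """Stand-in for a real SQS client; it only records what was sent."""

    def __init__(self) -> None:
        self.sent: List[Tuple[str, str]] = []

    def send_message(self, QueueUrl: str, MessageBody: str) -> Dict[str, str]:
        self.sent.append((QueueUrl, MessageBody))
        return {"MessageId": str(len(self.sent))}
```

Note that the order service has to know about every queue: adding a fifth consumer means adding a fifth entry to `QUEUE_URLS`.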
SQS queues are many-to-1. A queue can have many producers, but it only delivers to one homogeneous group of consumers. This easily becomes a problem with our setup because it means we have to build a new SQS queue for each group of consumers. For instance, we added a new action to our event that sends a recommendation of items based on the delivered order. This action is housed in the recommendation service. And as it is a new action, another SQS queue is required.
As our app gets more complex over time, we add more actions to our events. And more actions mean more SQS queues being added. This adds to the overhead of maintaining our system over time.
A reader has pointed o
Enter EventBridge
EventBridge addresses this pain point while still retaining the decoupling that SQS introduced. Instead of sending 4 different tasks to 4 different SQS queues, the order service just publishes the "order_delivered" event to an event bus in EventBridge.
An event bus is like an SQS queue but instead of being many-to-1, it is many-to-many. This means many different event producers can push events to an event bus, and the event bus sends the event to all consumer systems configured to listen to it.
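As a rough sketch, publishing to the bus could look like the code below. The bus name `app-event-bus` and the source `orders.service` are assumptions for illustration only. `events_client` is any object exposing `put_events(Entries=[...])`, such as a boto3 EventBridge (`events`) client; a recording stub is included so the sketch runs without AWS access.

```python
import json
from typing import Any, Dict, List


def build_order_delivered_entry(order_id: str) -> Dict[str, str]:
    """Build one EventBridge entry for the order_delivered event."""
    return {
        "Source": "orders.service",        # hypothetical source name
        "DetailType": "order_delivered",   # consumers filter on this value
        "Detail": json.dumps({"order_id": order_id}),  # Detail is a JSON string
        "EventBusName": "app-event-bus",   # hypothetical bus name
    }


def publish_order_delivered(events_client: Any, order_id: str) -> Any:
    """Publish a single event; the bus handles delivery to every consumer."""
    return events_client.put_events(Entries=[build_order_delivered_entry(order_id)])


class RecordingEventsClient:
    """Stand-in for a real EventBridge client; it only records entries."""

    def __init__(self) -> None:
        self.entries: List[Dict[str, str]] = []

    def put_events(self, Entries: List[Dict[str, str]]) -> Dict[str, int]:
        self.entries.extend(Entries)
        return {"FailedEntryCount": 0}
```

Compared with the SQS version, the producer makes one call and needs no knowledge of who consumes the event.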
Aside from that, it allows consumers to listen only to specific events and disregard the rest. When we configure consumers to "listen" for events, we do so by creating event rules against the event bus. These rules filter which events the consumer receives from the event bus.
In our case, the payment service only listens for the "order_delivered" event. When an "order_shipped" or "order_confirmed" event happens, it is not sent to that service.
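A rule's filter is expressed as an event pattern. The sketch below shows a pattern the payment service's rule might use, plus a deliberately simplified matcher to illustrate the idea. The matcher is my own illustration and only handles exact matches on top-level fields; EventBridge's real matching supports much more (nested fields, prefixes, numeric ranges, and so on).

```python
from typing import Any, Dict, List

# Pattern for a hypothetical payment-service rule: match only order_delivered.
# With boto3, a pattern like this would be serialized with json.dumps and
# passed as the EventPattern argument of the events client's put_rule call.
PAYMENT_RULE_PATTERN: Dict[str, List[str]] = {
    "detail-type": ["order_delivered"],
}


def matches(pattern: Dict[str, List[str]], event: Dict[str, Any]) -> bool:
    """Simplified matching: each pattern key must list the event's value."""
    return all(event.get(key) in allowed for key, allowed in pattern.items())


def deliver(pattern: Dict[str, List[str]], events: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return only the events the rule would forward to its consumer."""
    return [event for event in events if matches(pattern, event)]
```

For example, given one "order_delivered" and one "order_shipped" event, `deliver(PAYMENT_RULE_PATTERN, events)` keeps only the first.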
Conclusion
EventBridge allows microservices to communicate with one another in a decoupled manner. We only need one event bus for all our events. Hence, managing the actions of our events across services becomes simpler.
Next Steps
In the next post, we will create a simple demo to demonstrate the powers of EventBridge.
Photo by Denys Nevozhai on Unsplash
Top comments (4)
Hey.
Please remember that SQS has had an event filtering feature for some time, described here - aws.amazon.com/about-aws/whats-new...
So SQS can still be a valid option instead of EB when you need to send hundreds of thousands of messages and 5 event filters are enough.
EB can't handle such a load, I'm afraid.
Thank you for this, Tomasz. This is a deep-dive tip, and I'll gladly update my post to include this information as well.
I want to point out one additional structure that could have worked:
You can have the Orders Service publish to an SNS topic and then have multiple SQS queues subscribed to the topic. Each SQS queue can then invoke its specified Lambda.
SQS provides the durable transport and is Lambda aware; SNS provides the fan-out capability of the event. SNS offers event filtering capabilities on subscription that match EventBridge's capabilities.
Don't get me wrong, EventBridge is a great product for this. Just always be aware of the latency and throughput limits EventBridge imposes, which make it less than ideal for some use cases.
Agreed on this, thanks Danny. I'll update the post to take this into account. The throughput limits of EventBridge really do limit its usage for workloads with millions of events per day.