Achieving loosely-coupled architectures with the asynchronous messaging pattern
A common anti-pattern in serverless projects is tight coupling between services. Having Lambda functions invoke each other directly usually leads to a project that is difficult to deploy, maintain and extend.
This is a common outline of how serverless development roadmaps evolve:
Although this seems easy to implement, the services end up tightly coupled. Developers might not feel the pain in small implementations, but as the overall system grows, it becomes increasingly hard to maintain and evolve:
The Asynchronous Messaging pattern is very useful to decouple serverless functions and make systems more maintainable. Before we get into more details about the architecture itself, let’s analyze why coupling is an issue for serverless functions.
An e-commerce system is something most of us are quite familiar with. Consider when a customer submits a purchase. Several processes need to be performed:
The following architectural ideas are only for conceptual illustration purposes. They omit aspects that would be required in a production environment and might not be the most suitable for the particular context used as an example.
Consider there is a Lambda function to perform each of the actions above.
In a highly coupled architecture, there could be a central function called SubmitOrder that would perform all these actions imperatively, in a single run.
Let’s analyze each disadvantage point mentioned above in light of this example:
Observe that each step of the process (in grey) blocks the execution of the main function Submit Order (in blue). Also, multiple I/O-bound processes are running in background (in yellow).
As a result, the main function will be waiting idle for considerable periods of time. This leads to poor resource usage and higher costs, since serverless functions are charged per execution time.
What happens in the above outline if the Decrease Stock function fails? Perhaps the database was under heavy load and couldn’t process the decrement query.
Surely the request could be retried. But now there’s one more thing that needs to be coordinated by the Submit Order function, which increases complexity. As a result, chances of experiencing even more errors only increase.
To reduce complexity, let’s not handle retrying of individual components within the Submit Order function. If anything fails, the main function will be retried from start.
How should a failure within the Decrease Stock step be handled? Running the entire Submit Order request again will authorize payment twice.
Should the previous payment authorization be canceled? If we keep it, the main function must skip the payment step in the retry. But now there’s one more piece of logic to handle, which also adds complexity.
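The double-charge problem can be reproduced in a few lines. The sketch below is only an illustration: the function names, the in-memory lists standing in for the payment provider and the database, and the "fail on first attempt" behaviour are all assumptions, not part of any real service.

```python
class TransientError(Exception):
    """Raised by a step that might succeed if tried again."""


payments_authorized = []      # stand-in for the payment provider's records
stock_attempts = {"count": 0}


def authorize_payment(order_id):
    payments_authorized.append(order_id)


def decrease_stock(order_id):
    # Assumed behaviour: the first attempt fails, as if the database were overloaded.
    stock_attempts["count"] += 1
    if stock_attempts["count"] == 1:
        raise TransientError("database under heavy load")


def submit_order(order_id):
    # Every step runs inline, so a late failure aborts the whole function.
    authorize_payment(order_id)
    decrease_stock(order_id)


def run_with_retry(order_id, attempts=2):
    # Retries the entire function from the start, as described above.
    for _ in range(attempts):
        try:
            submit_order(order_id)
            return True
        except TransientError:
            continue
    return False


succeeded = run_with_retry("order-1")
print(succeeded, payments_authorized)  # True ['order-1', 'order-1']
```

The order eventually succeeds, but the payment was authorized twice because the retry had no way to know that the first step had already completed.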
We can see that tight coupling of services is already making it difficult to manage our system and make it perform reliably without many hurdles.
Idempotency is a “property of certain operations in mathematics and computer science whereby they can be applied multiple times without changing the result beyond the initial application” [2].
In other words: the Submit Order function would be idempotent if it were possible to retry it in case of failure without worrying about double charges. We dedicated an entire page in our Knowledge Base to cover the Lambda Retry behavior and Idempotency.
Every single serverless function should be idempotent. That is a trait of reliable and maintainable serverless systems.
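One common way to approach idempotency is to key each operation and remember its result. The sketch below is a minimal illustration under assumptions: `charge_once`, the `processed` dictionary and the key format are hypothetical, and a real implementation would need a durable store with an atomic check-and-write instead of an in-memory dictionary.

```python
processed = {}  # stand-in for a durable store, keyed by idempotency key
charges = []    # stand-in for the side effect we must not repeat


def charge_once(idempotency_key, amount):
    """Charge at most once per key; repeated calls return the stored result."""
    if idempotency_key in processed:
        return processed[idempotency_key]
    charges.append(amount)
    result = {"charged": amount}
    processed[idempotency_key] = result
    return result


first = charge_once("order-1", 50)
second = charge_once("order-1", 50)  # a retry: no new charge is made
```

Calling the function again with the same key returns the stored result and leaves `charges` untouched, which is what makes a retry safe.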
What we need to know is: accomplishing idempotency for the Submit Order function would be a lot harder and more complex in the tightly coupled architecture outlined above.
Let’s consider a different strategy and compare against the first one. When a customer submits a purchasing order, the Submit Order function only dispatches two messages:
The main function is now streamlined to only interact with a message queue service:
Each step in the process is now decoupled, having asynchronous messages as the means to make the system work cohesively.
Messages are then consumed by each step of the process independently. The “Set aside stock” step, for example, would pull pending messages from the queue and reserve products in the database.
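As a rough sketch of this flow, the example below uses in-memory `deque` objects in place of a managed message queue service. The queue names, message shapes and function names are assumptions made for illustration only.

```python
from collections import deque

# In-memory stand-ins for two queues managed by a messaging service.
queues = {"stock": deque(), "notifications": deque()}
reserved = []  # stand-in for reservations written to the database


def submit_order(order_id, items):
    # The main function only enqueues work and returns immediately.
    queues["stock"].append({"order_id": order_id, "items": items})
    queues["notifications"].append({"order_id": order_id})
    return {"status": "accepted", "order_id": order_id}


def set_aside_stock():
    # Consumer: drains pending messages and reserves the products.
    while queues["stock"]:
        message = queues["stock"].popleft()
        reserved.append((message["order_id"], tuple(message["items"])))


response = submit_order("order-1", ["sku-1", "sku-2"])
pending_before = len(queues["stock"])  # 1: the work is waiting, not done yet
set_aside_stock()                      # runs independently of submit_order
```

Note that `submit_order` never calls `set_aside_stock`: the only thing the two share is the queue.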
Having each step working independently improves resource utilization and reduces costs by cutting down idle I/O time. It is also easier now to handle retries in case anything fails. When a failure occurs within one step of the process, the other functions will not be impacted.
Another benefit is being able to replace or modify one service without having to touch others. Suppose the system was only sending a confirmation e-mail to the customer. Now, sending an SMS message becomes a requirement. In the tightly coupled architecture, it would be harder to introduce any changes independently. With an asynchronous messaging architecture, we only need to add one message consumer.
Lastly, ensuring scalability and reliability of a decoupled architecture is much easier. Suppose Service A publishes a message for Service B to get a task done. If Service B relies on a resource that is hard to scale, such as a database, Service A can still scale up without a problem. Messages may start to pile up for some time, but Service B will eventually catch up.
Even if this situation is undesirable (a list of messages piling up and taking more time to process than usual), it is better than having Service B crashing due to a peak demand coming from tight coupling with Service A.
Below is a brief introduction to the main options for handling asynchronous messages in a distributed serverless environment. We plan to cover each of them and their respective AWS services with a dedicated page in this Knowledge Base.
We mentioned a “message queue” in the example for loosely-coupled architecture above. This is exactly what the name suggests: messages are received and piled up by the messaging system in a queue. A consumer can pull messages from the queue for processing. When processing is finished, the messaging system is notified to delete the message. If it fails, the message goes back to the queue.
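The receive, process, delete cycle described above can be sketched with a toy in-memory queue. `ToyQueue` and `consume` are hypothetical names, not the API of any real messaging service, and the explicit `release` call is a simplification of how a failed message returns to the queue.

```python
from collections import deque


class ToyQueue:
    """Toy queue: a received message stays in flight until it is deleted."""

    def __init__(self):
        self._pending = deque()
        self._in_flight = {}
        self._next_receipt = 0

    def send(self, body):
        self._pending.append(body)

    def receive(self):
        if not self._pending:
            return None
        self._next_receipt += 1
        body = self._pending.popleft()
        self._in_flight[self._next_receipt] = body
        return self._next_receipt, body

    def delete(self, receipt):
        # Called by the consumer after successful processing.
        del self._in_flight[receipt]

    def release(self, receipt):
        # Processing failed: the message goes back to the queue.
        self._pending.append(self._in_flight.pop(receipt))

    def __len__(self):
        return len(self._pending)


def consume(queue, handler):
    """Pull one message, run the handler, then delete or release the message."""
    received = queue.receive()
    if received is None:
        return False
    receipt, body = received
    try:
        handler(body)
    except Exception:
        queue.release(receipt)
        return False
    queue.delete(receipt)
    return True


def failing_handler(body):
    raise RuntimeError("database unavailable")


q = ToyQueue()
q.send("reserve sku-1")
handled = []
first_try = consume(q, failing_handler)  # fails, message returns to the queue
second_try = consume(q, handled.append)  # succeeds, message is deleted
```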
The main components of a Pub/Sub system are:
In the “confirmation message” example, if we had only an e-mail sending mechanism and wanted to add SMS messaging as well, it would be a matter of deploying an SMS sender service and subscribing it to the “customer notifications” topic. The rest of the system remains unaware of this change. This makes the overall project a lot more manageable and extensible.
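The e-mail and SMS scenario can be sketched with a minimal in-process publish/subscribe registry. The `subscribe` and `publish` helpers, the topic name spelling and the handlers are assumptions for illustration; a managed Pub/Sub service would deliver messages over the network rather than by direct function calls.

```python
from collections import defaultdict

subscribers = defaultdict(list)  # topic name -> list of handler callables
sent = []                        # records what each sender service did


def subscribe(topic, handler):
    subscribers[topic].append(handler)


def publish(topic, message):
    # The publisher does not know who is subscribed, or how many there are.
    for handler in subscribers[topic]:
        handler(message)


def send_email(message):
    sent.append(("email", message["order_id"]))


def send_sms(message):
    sent.append(("sms", message["order_id"]))


subscribe("customer-notifications", send_email)
publish("customer-notifications", {"order_id": "order-1"})

# Adding SMS later only requires one new subscription; publish() is unchanged.
subscribe("customer-notifications", send_sms)
publish("customer-notifications", {"order_id": "order-2"})
```

After the second `publish`, both senders receive the message even though the publishing code was never modified.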
Read our detailed page about the Pub/Sub pattern.
An Event Bridge is somewhat similar to the above two, but with a difference: messages (called events) are matched to subscribers by a fine-grained pattern matching mechanism. Events that match a certain pattern are forwarded to the particular subscriber responsible for processing them.
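To make the idea of pattern matching concrete, here is a simplified matcher. The pattern scheme (a list of allowed values per field, with nested dictionaries for nested fields), the rule table and the target names are assumptions invented for this sketch, not the syntax of any particular event bus.

```python
def matches(pattern, event):
    """Return True if every field in the pattern is present in the event
    with one of the allowed values. Fields not in the pattern are ignored."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested pattern: the event field must be a matching dictionary.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


# Hypothetical rules: (pattern, subscriber name)
rules = [
    ({"source": ["orders"], "detail": {"status": ["paid"]}}, "fulfilment"),
    ({"source": ["orders"], "detail": {"status": ["cancelled"]}}, "refunds"),
]


def route(event):
    """Return the subscribers whose pattern matches the event."""
    return [target for pattern, target in rules if matches(pattern, event)]


paid_targets = route({"source": "orders", "detail": {"status": "paid", "total": 50}})
other_targets = route({"source": "billing", "detail": {"status": "paid"}})
```

Here the paid order is routed only to the `fulfilment` subscriber, while the event from a different source matches no rule.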
A Stream Processing system is more suitable for continuous data ingestion. Examples include application or database logs, or a series of clicks in a user interface tracked for analytical purposes. It is not quite in the same category as the others in this list, but it can be helpful for batching.
Stream Processing services usually allow grouping multiple data points and invoking a processor service every 10 minutes, for example, or whenever the amount of data ingested reaches 10 megabytes. This batching mechanism can help improve performance and reduce costs in serverless architectures.
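The "flush on size or on age, whichever comes first" rule can be sketched as follows. The `Batcher` class and its parameters are hypothetical, and the thresholds are shrunk so the example is easy to follow. The current time is passed in explicitly to keep the sketch deterministic, and age is only checked when a record arrives, whereas a real service would also flush on a timer.

```python
class Batcher:
    """Buffers records and flushes on size or age, whichever comes first."""

    def __init__(self, process, max_bytes, max_age_seconds):
        self.process = process              # called with a list of records
        self.max_bytes = max_bytes
        self.max_age_seconds = max_age_seconds
        self._records = []
        self._size = 0
        self._opened_at = None              # arrival time of the first buffered record

    def add(self, record, now):
        if self._opened_at is None:
            self._opened_at = now
        self._records.append(record)
        self._size += len(record)
        too_big = self._size >= self.max_bytes
        too_old = now - self._opened_at >= self.max_age_seconds
        if too_big or too_old:
            self.flush()

    def flush(self):
        if self._records:
            self.process(self._records)
        self._records, self._size, self._opened_at = [], 0, None


batches = []
batcher = Batcher(batches.append, max_bytes=10, max_age_seconds=600)
batcher.add(b"aaaa", now=0)     # 4 bytes buffered
batcher.add(b"bbbbbb", now=1)   # 10 bytes reached: flushed by size
batcher.add(b"c", now=2)        # new batch opened at t=2
batcher.add(b"d", now=700)      # 698 seconds old: flushed by age
```

The processor is invoked twice for four records, which is the kind of reduction in invocations that batching is meant to achieve.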