How do you choose a decoupling service that suits your use case? In this article, we'll compare three AWS services – Kinesis vs. SNS vs. SQS – that allow you to decouple sending and receiving data, with Python examples to help you pick the right one for the problem at hand.
Decoupling offers a myriad of advantages, but choosing the right tool for the job may be challenging. AWS alone provides several services that allow us to decouple sending and receiving data. While on the surface those services seem to provide similar functionality, they are designed for different use cases and each of them can be useful if applied properly to the problem at hand.
As one of the oldest services at AWS, SQS has a track record of providing an extremely simple and effective decoupling mechanism. The entire service is based on sending messages to the queue and allowing for applications (ex. ECS containers, Lambda functions) to poll for messages and process them. The message stays in the queue until some application picks it up, processes it, and deletes the message when it’s done.
The most important distinction between SQS and other decoupling services is that it's NOT a publish-subscribe service. SQS has no concept of producers, consumers, topics, or subscribers. All it does is provide a distributed queue that allows applications to send, receive, and delete messages.
SQS does not push messages to any applications. Instead, once a message is sent to SQS, an application must actively poll for messages to receive and process them. Also, it's not enough to pick up a message from the queue to make it disappear — the message stays in the queue until a consumer deletes it after processing, or the queue's retention period expires.
The code snippet below demonstrates how you can send a message to the queue, receive it, and delete it once it has been processed. Here is a link to a GitHub gist showing the same code as below.
By default, SQS does not guarantee that messages will be delivered in the same order they were sent to the queue; if ordering matters, choose a FIFO queue. This can be easily configured when creating the queue.
sqs.create_queue(
    QueueName=queue_name,  # FIFO queue names must end with ".fifo"
    Attributes={'VisibilityTimeout': '3600', 'FifoQueue': 'true'})
Even though SNS stands for Simple Notification Service, it provides much more functionality than just the ability to send push notifications (emails, SMS, and mobile push). In fact, it's a serverless publish-subscribe messaging system that lets you send events to multiple applications (subscribers) at the same time (fan-out), including SQS queues, Lambda functions, Kinesis Data Streams, and generic HTTP endpoints.
In order to use the service, we only need to create a topic, subscribe one or more endpoints (such as SQS queues or Lambda functions) to it, and publish messages to the topic.
How do you decide between SQS and SNS? Any time multiple services need to receive the same event, consider SNS rather than SQS. A message from an SQS queue can be successfully processed by only a single worker node or process. Therefore, if you need a fan-out mechanism, create an SNS topic and a queue for each application that needs to receive the specific event or data. Multiple queues can then subscribe to this SNS topic and receive the messages simultaneously.
For instance, imagine a scenario as simple as having the possibility to publish the same event (message) to both the development (staging) and production environment:
Again, here is a simple Python script that demonstrates how to create an SNS topic, subscribe queues to it, and publish a message that fans out to all of them:
AWS provides an entire suite of services under the Kinesis family. When people say Kinesis, they typically mean Kinesis Data Streams — a service that lets you process large amounts of streaming data in near real time by leveraging producers and consumers operating on shards of data records.
Apart from Kinesis Data Streams, the "Kinesis family" includes Kinesis Data Firehose, Kinesis Data Analytics, and Kinesis Video Streams.
To demonstrate how Kinesis Data Streams can be used, we will request the current cryptocurrency market prices (data producer) and ingest them into a Kinesis data stream.
To create a data stream in the AWS console, we need to provide a stream name and configure the number of shards:
Then, we can start sending live market prices into the stream. In the example below, we send them every 10 seconds. Here is a link to a GitHub gist with the same code.
The script will run indefinitely until we manually stop it.
So far, we've configured a Kinesis data producer that sends real-time market prices to the data stream. There are many ways to implement a Kinesis consumer — for this demo, we'll use the simplest one: a Firehose delivery stream.
We can configure Kinesis Data Firehose to send data to S3 directly in the AWS console. We need to select our previously created data stream and for everything else, we can apply the defaults.
The most important part is to configure the destination — in our use case, we choose S3 and select a specific bucket:
We need to configure how frequently the micro-batches of data should be sent to S3:
We can then confirm to create a delivery stream:
Shortly after the delivery stream’s creation, we should be able to see new data arriving every minute in our S3 bucket:
To see what this data looks like, we can download one file from S3 and inspect its content:
Even though Kinesis Data Streams is serverless, it requires proper allocation of data across shards. One possible way to track any write throttles is to use Dashbird. In the image below, we can see how many records are sent to the stream each minute. It shows us that we don’t always receive exactly 10 records per minute.
Dashbird allows you to configure alerts on write throttles:
This is what the alert could look like when triggered:
Among the three services from the title, Kinesis is the most difficult one to use and operate at scale. It's best to start with an SNS topic and one or more SQS queues subscribed to it. Kinesis shines when you need to perform map-reduce-like operations on streaming data, as it allows you to aggregate similar records and build real-time analytical applications. At the same time, monitoring shards and Kinesis stream throughput adds complexity and widens the space where something can go wrong.
If your only argument for Kinesis Data Streams is the ability to replay data, you could achieve the same by introducing a Lambda function that subscribes to the SNS topic and loads all received messages into some database such as DynamoDB or Aurora. By leveraging a timestamp of data insertion, you would know exactly when a specific message was received, which simplifies debugging in case of errors.
To make it easier to choose a decoupling service for your use case, I created a table comparing features and characteristics of those three services.
In this article, we looked at the differences between SNS, SQS, and Kinesis Data Streams. We’ve built a simple demo showing how to send data to Kinesis from data producers, how a delivery stream can consume the data, and how to monitor any potential write throttles. For each service, we demonstrated how it can be used in Python and concluded with a table comparing the service characteristics.
References and further reading:
Triggering AWS Lambda with SNS
6 Common Pitfalls of AWS Lambda with Kinesis Trigger
Amazon Kinesis Data Streams Terminology and Concepts
Why should I use Amazon Kinesis and not SNS-SQS?