Systems run into problems all the time. To keep things running smoothly, we need an error monitoring and logging system that helps us discover and resolve issues as soon as they arise. The bigger the system, the harder it becomes to monitor it and pinpoint a problem. And in serverless systems with hundreds of services running concurrently, monitoring and troubleshooting are even more challenging.
In my last article, I introduced Dashbird.io, a serverless monitoring and observability platform designed to provide enhanced monitoring, actions, and architectural improvements for AWS-based serverless systems, and explained how it fills the gaps left by traditional monitoring services.
To learn more about Dashbird’s features and how to set it up, check out my previous article.
Dashbird has a lot to offer, and in this article I'll showcase how we can use its features to debug serverless systems. For demonstration purposes, I have devised a Lambda-based serverless use case.
The architecture is simple: the client uploads images to a source S3 bucket. An S3 trigger on that bucket invokes our Lambda function whenever a PUT request is made (that is, whenever an image is uploaded). The Lambda function extracts the image's metadata and saves it in our destination DynamoDB table. Check out this article to learn more about S3 and AWS Lambda triggers.
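The flow above can be sketched as a small Python handler. This is a minimal illustration rather than the code used for this article: the table name `image-metadata`, the `TABLE_NAME` environment variable, the item field names, and the optional `put_item` argument (added so the logic can be exercised without AWS) are all assumptions, and only the object-level metadata carried in the S3 event is stored.

```python
import os
import urllib.parse


def build_items(event):
    """Turn an S3 event notification into a list of metadata items."""
    items = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        obj = record["s3"]["object"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = urllib.parse.unquote_plus(obj["key"])
        items.append({
            "image_key": key,          # hypothetical attribute names
            "bucket": bucket,
            "size_bytes": obj.get("size", 0),
            "etag": obj.get("eTag", ""),
            "uploaded_at": record.get("eventTime", ""),
        })
    return items


def lambda_handler(event, context, put_item=None):
    """Save one item per uploaded object; put_item is injectable for testing."""
    if put_item is None:
        # Imported lazily so the sketch can run locally without boto3.
        import boto3
        table_name = os.environ.get("TABLE_NAME", "image-metadata")  # assumed name
        table = boto3.resource("dynamodb").Table(table_name)
        put_item = lambda item: table.put_item(Item=item)

    items = build_items(event)
    for item in items:
        put_item(item)
    return {"saved": len(items)}
```

Passing `put_item=some_list.append` lets you run the handler against a sample event locally before wiring it to a real table.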
We know AWS Lambda has its own set of challenges: we often run into issues such as function timeouts, out-of-memory errors, Python exceptions, and configuration errors. Using the above architecture, I'll create scenarios that reproduce several of these challenges.
Dashbird addresses these challenges well, providing precise information about each of the above-mentioned issues.
In this part of the article, I'll focus on how we can use Dashbird to efficiently manage some of the common challenges we face with AWS Lambda. If you haven't configured Dashbird with your AWS account yet, check out my last article to learn how.
Cold starts are a major contributor to degraded Lambda performance. They are especially undesirable in real-time systems, since every cold start adds to the latency users experience. Dashbird lets us see which of our Lambda functions faced cold starts and how they affected function latency.
From the Inventory module, we can see which of our functions are facing cold starts, and from the Alarms module, we can set up alarms that alert us whenever Lambda cold starts exceed a certain threshold.
— Monitoring Cold Starts and Function Latency:
— Setting up Alarms for Lambda Cold Starts and Latency Issues:
Beyond monitoring Lambda latency and cold starts, Dashbird also lets us set up metric-based alarms that notify us whenever our Lambda functions face more cold starts than an acceptable threshold, or whenever a function's execution time exceeds a certain limit. I will discuss setting up Lambda alarms in the coming section on Dashbird Alarms.
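To make the idea of a cold start concrete, here is a minimal sketch of how a function can tell whether an invocation was the first one in a fresh execution environment. It relies on the fact that module-level code runs once per environment; the `_state` dictionary and the log field names are illustrative assumptions, not something Dashbird requires.

```python
import json
import time

# Module-level code runs once per execution environment, i.e. on a cold start.
_state = {"cold": True, "init_time": time.time()}


def lambda_handler(event, context):
    cold = _state["cold"]
    _state["cold"] = False  # every later invocation in this environment is warm

    # A structured log line makes cold starts easy to search for later.
    print(json.dumps({
        "cold_start": cold,
        "environment_age_s": round(time.time() - _state["init_time"], 3),
    }))
    return {"cold_start": cold}
```

The first call in a new environment reports `cold_start` as true and subsequent calls report false, which is handy when correlating latency spikes with cold starts.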
Lambda functions have a timeout configuration: the maximum amount of time an invocation can run before the function automatically times out. The minimum is 1 second and the maximum is 15 minutes (as of now). Oftentimes, our computations exceed the timeout we have set. In such cases, we want to know which functions are timing out and act accordingly. Dashbird makes it easy to spot and deal with Lambda timeouts.
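On the function side, one way to avoid being cut off mid-task is to check how much time is left before starting each unit of work. The sketch below uses `get_remaining_time_in_millis()`, which the Lambda context object provides in the Python runtime; the `process_batch` helper, the 2-second safety margin, and the placeholder work are assumptions made for illustration.

```python
def process_batch(items, context, safety_margin_ms=2000):
    """Process items until the invocation is close to its configured timeout."""
    processed = []
    for item in items:
        # Stop early instead of letting Lambda kill the invocation mid-item.
        if context.get_remaining_time_in_millis() < safety_margin_ms:
            remaining = len(items) - len(processed)
            print(f"Stopping early with {remaining} item(s) left to avoid a timeout")
            break
        processed.append(item.upper())  # placeholder for the real work
    return processed
```

Whatever is left unprocessed can then be re-queued or handled by a follow-up invocation rather than being lost to a timeout.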
Just as Lambda has a timeout configuration parameter, there's also a memory parameter that defines the maximum amount of memory a Lambda function can use. The minimum is 128 MB and the maximum is 10 GB (as of now). When performing a memory-intensive task, the function's memory consumption can easily exceed the limit we have set, in which case the function throws an “Out of Memory” exception, halting its execution. Dashbird has this covered too and provides insights and alerts for all “Out of Memory” exceptions.
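As a rough way to see how close a function gets to its limit, the sketch below compares the process's peak memory with the configured limit exposed on the Lambda context as `memory_limit_in_mb`. The `memory_report` helper and its field names are hypothetical, and the kilobyte unit for `ru_maxrss` is the Linux behavior (the operating system Lambda runs on).

```python
import resource


def memory_report(context):
    """Compare this process's peak memory with the configured Lambda limit."""
    # ru_maxrss is reported in kilobytes on Linux.
    peak_mb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024
    limit_mb = int(context.memory_limit_in_mb)
    return {
        "peak_mb": round(peak_mb, 1),
        "limit_mb": limit_mb,
        "usage_ratio": round(peak_mb / limit_mb, 2),
    }
```

Logging this report at the end of an invocation gives an early warning when `usage_ratio` creeps toward 1, before the function actually runs out of memory.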
Lambda configuration errors relate to failed function initialization, usually caused by an improper import: the function cannot initialize because of a problem with a module we're trying to import, so execution halts. Just like other Lambda-related errors, Dashbird has this covered as well.
Dashbird not only helps us monitor errors in our serverless systems but also keeps us updated on the cost of operating our infrastructure.
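To illustrate this kind of failure, here is a small sketch that imports a dependency during initialization and logs a clear message before failing. The `require` helper is hypothetical; it simply wraps the standard `importlib.import_module` call and re-raises the error so initialization still fails visibly.

```python
import importlib
import logging

logger = logging.getLogger(__name__)


def require(module_name):
    """Import a module during init, logging a clear message if it is missing."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        # Typical cause: the package was not bundled in the deployment
        # package or a layer. Log it, then let initialization fail.
        logger.exception("Init failed: cannot import %r", module_name)
        raise


# Called at module scope, so a missing dependency fails at init time.
json_module = require("json")
```

Requiring a module that was left out of the deployment package raises `ImportError` before the handler ever runs, which is the situation a configuration error describes.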
Note: since I'm on the AWS free tier, my total costs are $0 for now.
“Alarms” is a dedicated Dashbird module that lets us create custom alarms for our resources based on metrics related to each resource. This significantly improves the mean time to detect and resolve issues (MTTD/R). Creating an alarm is simple enough.
The last Dashbird feature I would like to discuss is its Log Search module. It lets us run advanced searches across all our logs, so we can efficiently filter for the logs that matter to us.
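For a sense of what Lambda costs look like outside the free tier, here is a back-of-the-envelope sketch. The default prices are assumptions based on commonly published x86 rates (they vary by region and architecture, so check current AWS pricing), the free tier is ignored, and `estimate_monthly_cost` is a hypothetical helper, not how Dashbird computes its figures.

```python
def estimate_monthly_cost(invocations, avg_duration_ms, memory_mb,
                          price_per_million_requests=0.20,    # assumed rate
                          price_per_gb_second=0.0000166667):  # assumed rate
    """Rough Lambda cost: request charges plus compute charges in GB-seconds."""
    gb_seconds = invocations * (avg_duration_ms / 1000) * (memory_mb / 1024)
    request_cost = invocations / 1_000_000 * price_per_million_requests
    compute_cost = gb_seconds * price_per_gb_second
    return request_cost + compute_cost
```

Because the compute charge scales with both duration and configured memory, right-sizing memory is one of the most direct ways to influence the bill.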
You can find the code used for this article on my GitHub account.
Further reading:
Building complex Well-Architected serverless applications
AWS CloudWatch alerts vs Dashbird alerts
Securing serverless applications with critical logging