My honest review: I tried AWS Serverless Monitoring using Dashbird.io

Mariliis Retter

August 5th, 2021

Disclaimer: This article was written by BK Lim, Co-founder of Interviewer.AI, and originally published on Medium. The information provided is solely based on his personal usage and opinion on the Dashbird platform.

tl;dr
Using Dashbird.io allows us to monitor our AWS serverless resources better and helps us nail down specific errors more quickly and efficiently.

As a startup, we always want to focus on the most important thing: delivering value to our customers. For that reason, we are huge fans of the serverless options provided by AWS (Lambda) and GCP (Cloud Functions), as these allow us to maintain and quickly deploy bite-size business logic to production without having to worry too much about maintaining the underlying servers and computing resources. Additionally, services like AWS Step Functions allow us to orchestrate Lambda functions in a high-level, visual, low-code fashion and to execute different functions in a specific order.

step functions workflow — Sample AWS Step Function Visual Workflow — source: AWS

Monitoring the execution of Lambda functions starts to become a real issue when you have tens or hundreds of functions running at the same time. When we first started, we relied heavily on CloudWatch Logs, as Lambda forwards execution logs to CloudWatch automatically without any additional setup. This works when the number of invocations is small, but as you can see in the screenshot below, the native log streams provided by CloudWatch don’t contain the information needed to debug or troubleshoot errors:

aws cloudwatch logs — How do we know which invocation was successful or failed?
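To make the question in the caption above concrete, here is a minimal sketch of what answering it by hand involves: grouping the raw lines of one log stream by request ID and flagging the invocations that logged an error. It assumes the standard `START`/`END` platform lines that Lambda writes, and it assumes failures show up as lines containing `[ERROR]`, which depends on the runtime and on how the function logs. The function names and sample lines are hypothetical.

```python
import re
from collections import defaultdict

# Lambda's START/END/REPORT platform lines carry the request ID.
REQUEST_ID = re.compile(r"RequestId: ([0-9a-f-]{36})")

def group_by_invocation(lines):
    """Map request ID -> log lines of that invocation (one log stream)."""
    groups = defaultdict(list)
    current = None
    for line in lines:
        match = REQUEST_ID.search(line)
        if match:
            current = match.group(1)
        # Lines without an ID are attributed to the most recent invocation.
        if current is not None:
            groups[current].append(line)
    return dict(groups)

def failed_invocations(groups, marker="[ERROR]"):
    """Return request IDs whose lines contain the assumed error marker."""
    return [rid for rid, lines in groups.items()
            if any(marker in line for line in lines)]

# Made-up sample lines in the shape Lambda writes them.
SAMPLE_LINES = [
    "START RequestId: 11111111-1111-1111-1111-111111111111 Version: $LATEST",
    "[ERROR] ConnectionError: Connection aborted.",
    "END RequestId: 11111111-1111-1111-1111-111111111111",
    "START RequestId: 22222222-2222-2222-2222-222222222222 Version: $LATEST",
    "END RequestId: 22222222-2222-2222-2222-222222222222",
]
```

Calling `failed_invocations(group_by_invocation(SAMPLE_LINES))` returns only the first request ID. Doing this across many log streams is the manual work a monitoring tool takes over.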

Often we had to look at the time of the failed invocation and click on a specific log stream around that timestamp, only to realize that it was not the log stream we were interested in. CloudWatch also groups the logs from multiple invocations together if they were close to each other, potentially making important information harder for developers to find. Before we were introduced to monitoring products like Dashbird.io, we manually added logic to the Lambda function to send a Slack notification when an error happened, resulting in a bigger deployment than necessary.
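The kind of hand-rolled Slack notification logic described above might look roughly like the following. This is a sketch under stated assumptions, not the code the author's team used: it assumes a Slack incoming webhook whose URL is provided in a `SLACK_WEBHOOK_URL` environment variable, and `post_to_slack` and `notify_on_error` are hypothetical names.

```python
import json
import os
import urllib.request
from functools import wraps

def post_to_slack(text):
    """Send a plain-text message to a Slack incoming webhook."""
    url = os.environ["SLACK_WEBHOOK_URL"]  # assumed to be configured
    body = json.dumps({"text": text}).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    urllib.request.urlopen(request, timeout=5)

def notify_on_error(send=post_to_slack):
    """Decorator: report any unhandled exception, then re-raise it."""
    def decorator(handler):
        @wraps(handler)
        def wrapper(event, context):
            try:
                return handler(event, context)
            except Exception as exc:
                send(f"{handler.__name__} failed: {type(exc).__name__}: {exc}")
                raise  # let Lambda still record the invocation as failed
        return wrapper
    return decorator
```

A handler would be wrapped with `@notify_on_error()`. Every function then has to ship and maintain this extra code, which is the overhead the paragraph above refers to.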

Instead of creating a serverless monitoring tool ourselves, we explored off-the-shelf monitoring options in the market. Dashbird.io was one of the services we explored. In this section, I will share my experience of using Dashbird.io, particularly the onboarding process and the platform’s main offering.

Dashbird.io offers a forever-free tier for smaller infrastructures of up to 1 million invocations and a free 2-week trial that encompasses the professional plan. Both include not just AWS Lambda monitoring, but also additional AWS-managed services such as Step Functions, ECS and more. After signing up for an account, we were asked to launch a CloudFormation stack in AWS, which creates a role with read-only permission to access the services that Dashbird.io will monitor. The instructions are laid out clearly, as shown below:

Dashbird onboarding: deploy the CF stack, and wait.

One great thing to point out here is that all the permissions being requested are read-only. For companies that are particular about third-party products having access to their computing resources, this is definitely an advantage compared to services that require write permission.
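For readers who want to verify a read-only claim like this themselves, one rough approach is to scan the role's policy document for actions whose verb is not a read verb. The sketch below is a heuristic only. The policy shown is an illustrative example and not the policy Dashbird's stack actually creates, and the list of read verbs and the helper name are assumptions.

```python
# Verb prefixes treated as read-only (a heuristic, not an AWS definition).
READ_ONLY_VERBS = ("Get", "List", "Describe", "Filter")

# Illustrative policy only; NOT the policy Dashbird's stack creates.
EXAMPLE_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "logs:FilterLogEvents",
                "logs:DescribeLogGroups",
                "lambda:ListFunctions",
                "lambda:GetFunction",
                "cloudwatch:GetMetricData",
            ],
            "Resource": "*",
        }
    ],
}

def non_read_only_actions(policy):
    """Return allowed actions whose verb does not start with a read verb."""
    flagged = []
    for statement in policy["Statement"]:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement["Action"]
        if isinstance(actions, str):  # a single action may be a bare string
            actions = [actions]
        for action in actions:
            verb = action.split(":", 1)[-1]
            # Wildcards such as "lambda:*" are flagged as well.
            if not verb.startswith(READ_ONLY_VERBS):
                flagged.append(action)
    return flagged
```

An empty result for the role's policy suggests it is limited to read-style actions, while any entry such as `lambda:UpdateFunctionCode` would deserve a closer look.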

Once the role is created and the role ARN is copied and pasted into the onboarding screen, Dashbird starts syncing information from the AWS account, and soon enough information starts to populate the dashboard.

dashbird aws dashboard — Main Dashboard Overview

The Insights section was refreshing to me, as it provides tips and best practices for optimizing our cloud resources.

Moving on to the main objective, which is to monitor our resources (in Dashbird.io, this is the Inventory tab), we get a high-level overview of our resources. At the time of writing, Dashbird.io supports AWS Lambda, ECS, Step Functions, SQS and API Gateway.

Filtering down to one of our Lambda functions, we can see metrics for the function such as invocation count, error count, etc.

Lambda function metrics — Information and metrics about a Lambda function

lambda invocations — Individual Lambda invocation, errors, and insights

There are 3 things I like about the information here:

Each invocation has its own logs, and errors can be identified easily with the red bug icon
Errors are aggregated by type (we have 100+ connection aborted errors 🙁 ) — this helps us in identifying the common issues and we can focus our effort on solving them
Some insight about the Lambda function and how we can optimize it
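The aggregation in the second point can be pictured with a small sketch. It assumes each error message starts with its type followed by a colon, which is common for Python exceptions but is only an assumption here. It is not how Dashbird implements the feature, and the function names are hypothetical.

```python
from collections import Counter

def error_type(message):
    """Use the text before the first colon as the error type (assumption)."""
    return message.split(":", 1)[0].strip()

def aggregate_errors(messages):
    """Count error messages per type, most frequent first."""
    return Counter(error_type(m) for m in messages).most_common()
```

Given two `ConnectionError: Connection aborted.` messages and one `KeyError: 'id'`, `aggregate_errors` returns `[("ConnectionError", 2), ("KeyError", 1)]`, so the most common problem surfaces first.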

That basically summarizes how I have used Dashbird so far. There are other tabs that I won’t go into detail about in this article, but here is a summary of them:

Events: List of events (error, insight, alarm) of all the monitored resources
Alarms: Place to configure alerts
Well-architected lens: All the insight on monitored resources
Resource-group: Something similar to tagging so you can monitor your resources in different service grouping
Log-search: As the name suggests

Positive points about Dashbird.io

Very quick and easy onboarding. For me it took slightly longer than the 2 minutes Dashbird.io claimed during the onboarding process, but it was definitely hassle-free
Role created by Dashbird.io only requires read access
Information provided for Lambda invocation is very useful — errors can be identified easily, and similar errors are aggregated together so that you have a high-level understanding of the issue.
Insights provided could be useful for optimizing your serverless resources, e.g. “Function is not tagged”, “ECS Service reserved memory is near limit”. They also give information on the potential cost of running the AWS resources.
Alerts/Alarms can be customized from the UI based on the service type (e.g. Lambda has error count, retry count and cold start count; Step Functions has failed executions, timed-out executions, etc.)
Supports multiple AWS managed services in addition to AWS Lambda (e.g. Step Functions, ECS, SQS & API Gateway)

Things I hope Dashbird.io can cover in the future

A more consistent / updated UI (some serverless services have a configuration screen and some don’t, but both show the configuration icon)
The bread and butter of Dashbird is AWS Lambda; IMO, the information for other services such as Step Functions and ECS can be improved. At the time of writing, Step Functions doesn’t show the details of failed executions the way the AWS Console displays them, which to me is fairly important.
Coverage of other cloud providers like GCP and Azure

Summary

Using a third-party monitoring platform like Dashbird.io can help the development team identify the root cause of a problem and optimize cloud resources better. As mentioned at the start of the article, as a startup you probably want to focus on delivering value to the customer rather than spending engineering resources searching through CloudWatch logs or building a separate monitoring tool.
