Problems With AWS Lambda Observability And How Dashbird Solves Them

Annika Helendi

September 4th, 2020

The serverless space is maturing and its usage is growing fast, with AWS Lambda as the de facto leader. Because serverless changes how software is built, it also brings some unique challenges. Serverless architectures are structured very differently from traditional ones: sooner or later you end up with hundreds or even thousands of functions connected to countless other services and endpoints.

Since you no longer manage the server-side infrastructure, you also lose the kind of access to monitoring, tracing, debugging, alerting, and log aggregation that you had with “old-school workflows” or container-based setups. Serverless observability is a real issue, and that’s exactly why Dashbird was launched in 2017. In this article I’m listing the biggest observability challenges serverless computing brings and how Dashbird helps to solve them.

No top-level visibility

AWS offers CloudWatch for monitoring your Lambda functions, but it only works well while you are just starting out with serverless and have a few functions to monitor. It automatically collects metrics and logs, but it leaves you in the dark when it comes to the bird’s-eye view. You can’t easily tell how healthy your serverless application is overall, especially once it grows and becomes more complex.

So to get a centralized overview of your serverless stack’s health, you should sign up for a free Dashbird account. Unlike some of the competitors, Dashbird doesn’t use any wrappers, but collects and visualizes actionable data from your CloudWatch logs. You just need to connect your AWS account – the whole setup takes less than 2 minutes and there are no code changes involved.

Dashbird’s main dashboard consists of time-series metrics for invocation counts, invocation durations, memory usage vs the allocated maximum, health statistics, and error reports. This gives you a central, helicopter view of the whole system, which becomes a must once you have more than 10 Lambda functions.
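To make the idea of metrics derived from logs more concrete, here is a minimal sketch that extracts duration and memory figures from the `REPORT` line Lambda writes to CloudWatch Logs at the end of each invocation. It is an illustration only, not Dashbird's implementation; the function name `parse_report` and the returned field names are hypothetical.

```python
import re
from typing import Optional

# Matches the REPORT line Lambda emits after each invocation.
# Fields are separated by tabs or spaces, so \s+ is used between them.
REPORT_RE = re.compile(
    r"REPORT RequestId: (?P<request_id>\S+)\s+"
    r"Duration: (?P<duration_ms>[\d.]+) ms\s+"
    r"Billed Duration: (?P<billed_ms>[\d.]+) ms\s+"
    r"Memory Size: (?P<memory_mb>\d+) MB\s+"
    r"Max Memory Used: (?P<used_mb>\d+) MB"
)


def parse_report(line: str) -> Optional[dict]:
    """Return invocation metrics from a REPORT line, or None if it is not one."""
    match = REPORT_RE.search(line)
    if match is None:
        return None
    memory_mb = int(match["memory_mb"])
    used_mb = int(match["used_mb"])
    return {
        "request_id": match["request_id"],
        "duration_ms": float(match["duration_ms"]),
        "billed_ms": float(match["billed_ms"]),
        "memory_mb": memory_mb,
        "used_mb": used_mb,
        # Share of the allocated memory that the invocation actually used.
        "memory_utilization": used_mb / memory_mb,
    }
```

Aggregating these records per function and per time bucket gives the kind of time series a dashboard can plot.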

Dashbird’s main dashboard with a full system health overview with real-time metrics.

Bonus tip #1: Dashbird allows you to group functions any way you like by using their names. This enables you to construct custom dashboards for service-level monitoring, outlining the load, errors, and other important metrics. This is especially useful if your AWS account includes lambdas for several independent services.
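As a rough illustration of name-based grouping, the sketch below buckets function names by their first name segment. The `service-function` naming convention and the helper `group_by_prefix` are assumptions made for this example, not something Dashbird prescribes.

```python
from collections import defaultdict


def group_by_prefix(function_names, separator="-"):
    """Group Lambda function names by the text before the first separator."""
    groups = defaultdict(list)
    for name in function_names:
        # "billing-charge" -> prefix "billing"; names without a separator
        # form a group of their own.
        prefix = name.split(separator, 1)[0]
        groups[prefix].append(name)
    return dict(groups)
```

With a consistent naming scheme, each resulting group maps naturally onto one service-level dashboard.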

Bonus tip #2: Dashbird also provides daily report emails with key points of interest compiled from the invocation logs from the previous 24 hours.

Finding failures = looking for a needle in a haystack

Endpoints can perform slower than you think, and if you don’t measure them, you will never know. Worse yet, your users probably will. You might also be close to timeouts or memory limits without knowing it, and functions you never expected to fail can fail anyway.

Dashbird can detect all kinds of errors: crashes, early exits, timeouts, and configuration errors that are unique to serverless technology.

Dashbird provides instant and actionable failure detection.
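Many of these failure types leave recognizable traces in the logs. The following sketch classifies a log line by looking for a few well-known markers. The marker list is illustrative and far from complete, the category names are hypothetical, and this is not a description of how Dashbird detects failures.

```python
# (category, marker) pairs; the first matching marker wins.
FAILURE_PATTERNS = [
    ("timeout", "Task timed out after"),
    ("early_exit", "Process exited before completing request"),
    ("crash", "Runtime.ExitError"),
    ("crash", "Traceback (most recent call last)"),  # Python runtime only
]


def classify_log_line(line):
    """Return a failure category for a log line, or None if nothing matches."""
    for category, marker in FAILURE_PATTERNS:
        if marker in line:
            return category
    return None
```

A real detector would also need to handle runtime-specific stack traces and configuration errors, which do not always follow a fixed text pattern.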

Bonus tip: Dashbird sends notifications through email and Slack. Each one contains a short description of the error and a direct link to the failed invocation view, which makes finding the error quick and easy.

Debugging is painful and a major time suck

Debugging serverless applications with AWS tools is often described as pure pain. It takes a lot of time to find out what exactly failed, why it failed and how to fix it.

Dashbird breaks down every function, showing stack traces and context, which makes debugging lambdas much easier.

The killer feature that brings the most value is actually a simple one – the ability to search through the logs for one or multiple functions.

Dashbird’s live tailing feature offers near real-time troubleshooting.

Bonus tip: Dashbird also links directly to relevant X-Ray traces. X-Ray is very helpful for identifying slow database calls.

No insights on how to best optimize your stack

As mentioned above, going “serverless” leaves you in the dark about a lot of things. Optimizing the cost or the performance of your application is a big challenge, because you need not only good failure detection and a general overview, but also enough context to make these decisions. Right now, none of the tools provided by AWS or by old-school monitoring companies solve the problem.

Fortunately, Dashbird points out the weak points in your system that could use some optimization. For example, you can see how many cold starts your functions are experiencing and improve the user experience based on that information.

Bonus tip: There are ways to deal with cold starts. Check this out for reference.
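For a sense of how cold starts can be spotted in raw logs: for on-demand invocations, Lambda adds an `Init Duration` field to the `REPORT` line when a new execution environment had to be initialized. The sketch below computes the share of such invocations. The helper name `cold_start_ratio` is hypothetical, and the approach assumes no provisioned concurrency is involved.

```python
def cold_start_ratio(log_lines):
    """Return the fraction of REPORT lines that include an Init Duration."""
    reports = [line for line in log_lines if line.startswith("REPORT")]
    if not reports:
        return 0.0
    # Init Duration is only reported when the environment was initialized
    # as part of this invocation, i.e. on a cold start.
    cold = sum(1 for line in reports if "Init Duration:" in line)
    return cold / len(reports)
```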

Dashbird lets you check the execution timelines, memory usage stats, and how much each function is costing you.
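Per-function cost can be approximated from the billed duration and the configured memory size, since Lambda bills compute in GB-seconds plus a per-request fee. The sketch below is a back-of-the-envelope estimate. The default prices are assumptions for illustration only (they vary by region and architecture and change over time), so check the current AWS pricing page before relying on them.

```python
def estimate_invocation_cost(
    billed_ms,
    memory_mb,
    price_per_gb_second=0.0000166667,  # assumed example price, not authoritative
    price_per_request=0.0000002,       # assumed example price, not authoritative
):
    """Estimate the cost in USD of one invocation, ignoring any free tier."""
    # Billed compute = seconds billed x memory allocated (in GB).
    gb_seconds = (billed_ms / 1000.0) * (memory_mb / 1024.0)
    return gb_seconds * price_per_gb_second + price_per_request
```

Because the compute part scales linearly with allocated memory, comparing actual memory usage against the allocation is a simple way to spot over-provisioned functions.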

Monitoring overhead can mess with your code

If you ship your logs to a monitoring service from inside your function, that work adds to the user-facing latency. Monitoring services that add a wrapper around every function also risk interfering with the function itself. These are the things you want to avoid.

Since Dashbird doesn’t require wrappers, but instead works by collecting logs, metrics, and listing resources under your AWS account, it adds no user-facing latency.
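On the function side, this approach means you only need to write to standard output, which Lambda forwards to CloudWatch Logs, rather than calling a monitoring API during the invocation. Below is a minimal sketch of structured logging that works this way. The helper names `format_log` and `log` and the field names are hypothetical.

```python
import json
import time


def format_log(level, message, **fields):
    """Build one JSON log record as a single-line string."""
    record = {"timestamp": time.time(), "level": level, "message": message}
    record.update(fields)  # extra context, e.g. order_id=42
    return json.dumps(record)


def log(level, message, **fields):
    """Write the record to stdout; no network call is made by this code."""
    print(format_log(level, message, **fields))
```

One JSON object per line keeps the logs easy to search and parse later, whichever tool reads them.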

Dashbird just needs limited access to your AWS account. After completing the registration form, a custom CloudFormation template is generated for you.

Wrapping up

Observability is hard; that’s no secret. Serverless is still young and has its teething issues. By using the right tools for the job, like Dashbird for observability, and by structuring your logs consistently, you can mitigate many of these issues. Hope you enjoyed the read, catch you next time. Until then, stay safe, and make sure to alert on your errors!

Are you using serverless in production and have some insights to share? We are always looking to improve Dashbird based on the feedback of real serverless users. Get in touch!
