Bullet-Proofing Serverless Infrastructures with Failure and Threat Detection

Kay Plößer

February 15th, 2021

When building cloud-based systems, and serverless systems in particular, it’s crucial to stay on top of things. Your infrastructure will be miles away from you, not a device you hold in your hands like when you build a frontend. That’s why adding a monitoring solution with pre-configured serverless failure detection to your stack should be one of your first decisions.

When deciding on a monitoring solution, it’s a good idea to check whether it brings adequate alerting features. With serverless, most of the time between a new incident and its fix is lost to finding the problem, not to the actual troubleshooting. This means a monitoring solution can shrink the biggest chunk of that equation.

Serverless is all about outsourcing non-differentiating work to managed services. If you aren’t in the business of selling authentication software, don’t build authentication software; buy it from someone who makes their money from it. Chances are they have one or more teams working on the solution, and it will outperform yours in a heartbeat.

Internal AWS Monitoring is Hard to Grasp

As mentioned before, the challenge here is that you can’t look into these services. You use SQS, DynamoDB, or API Gateway, but you can’t directly monitor the servers these services are running on, let alone SSH into them to debug them. AWS has its own logging and tracing services in place. So you need to extract the data and set alarms there.

The problem with the AWS-provided monitoring services is that they aren’t easy to use because they’re general solutions. CloudWatch doesn’t just log Lambda or SQS data; it logs data for all AWS services. This can make it a bit fiddly to set the right alerts for all your services. A serverless system in constant flux from new features, updates, and refactors requires you to meddle with these settings constantly.
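To give a feeling for that fiddliness, here is a minimal sketch of what a single error alarm per Lambda function looks like when you define it yourself. It only builds the keyword arguments for CloudWatch's `PutMetricAlarm` API; the function names, the SNS topic ARN, and the thresholds are assumptions made up for illustration, and every new or renamed function would need its own entry.

```python
from typing import Any, Dict, List


def lambda_error_alarm(function_name: str, sns_topic_arn: str) -> Dict[str, Any]:
    """Build keyword arguments for one CloudWatch alarm on a Lambda's Errors metric."""
    return {
        "AlarmName": f"{function_name}-errors",
        "Namespace": "AWS/Lambda",
        "MetricName": "Errors",
        "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
        "Statistic": "Sum",
        "Period": 60,  # seconds per evaluation window (assumed value)
        "EvaluationPeriods": 1,
        "Threshold": 1,  # alarm on the first error (assumed value)
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "TreatMissingData": "notBreaching",
        "AlarmActions": [sns_topic_arn],
    }


def alarms_for(function_names: List[str], sns_topic_arn: str) -> List[Dict[str, Any]]:
    """One alarm definition per function; the list grows with every deployment."""
    return [lambda_error_alarm(name, sns_topic_arn) for name in function_names]


# Hypothetical function names and topic ARN, for illustration only.
FUNCTIONS = ["checkout-handler", "image-resizer"]
TOPIC = "arn:aws:sns:us-east-1:123456789012:alerts"
ALARMS = alarms_for(FUNCTIONS, TOPIC)

# Applying them requires boto3 and AWS credentials, for example:
#   import boto3
#   cloudwatch = boto3.client("cloudwatch")
#   for alarm in ALARMS:
#       cloudwatch.put_metric_alarm(**alarm)
```

And this covers only one metric of one service. Timeouts, throttles, queue depth, and API Gateway errors each need their own definitions.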

Also, AWS monitoring solutions aren’t particularly frugal when it comes to what data they log. In a serverless system, your transactions usually span multiple services, all emitting large numbers of log lines. Combing through that amount of data costs time and, in turn, money. And more often than not, the AWS Console just isn’t enough for serverless teams, especially when scaling.

How Dashbird can Help

Built specifically for serverless failure detection and debugging, Dashbird collects all the logs your AWS services write, with no instrumentation needed. It sorts them into different categories like configuration errors, timeouts, out-of-memory events, etc., and presents them to you in an easily understandable way. These events are also published in the Event Library with quick explanations of the causes and fixes. If you have ever looked at the CloudWatch logs for one Lambda invocation, you know that it can be a chore to find the right line.
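To illustrate the general idea of sorting log lines into failure categories, here is a minimal sketch. It is not Dashbird's implementation; the patterns are assumptions based on messages the Lambda runtime commonly writes, and the sample log lines are made up.

```python
import re
from collections import Counter
from typing import Iterable

# Assumed patterns; real log messages vary by runtime and version.
PATTERNS = [
    ("timeout", re.compile(r"Task timed out after \d+(\.\d+)? seconds")),
    ("out_of_memory", re.compile(r"Runtime exited with error: signal: killed")),
    ("unhandled_error", re.compile(r"\[ERROR\]|Unhandled|Traceback")),
]


def classify(line: str) -> str:
    """Return the first matching category, or 'other' if nothing matches."""
    for category, pattern in PATTERNS:
        if pattern.search(line):
            return category
    return "other"


def summarize(lines: Iterable[str]) -> Counter:
    """Count how many log lines fall into each category."""
    return Counter(classify(line) for line in lines)


# Made-up sample lines in the style of a Lambda log stream.
SAMPLE_LOG = [
    "START RequestId: 1a2b Version: $LATEST",
    "2021-02-15T10:00:03.000Z 1a2b Task timed out after 3.00 seconds",
    "RequestId: 3c4d Error: Runtime exited with error: signal: killed",
    "[ERROR] KeyError: 'user_id'",
    "END RequestId: 1a2b",
]
SUMMARY = summarize(SAMPLE_LOG)
```

Even this toy version shows why a service-aware tool helps: the interesting lines are a small fraction of the stream, and you have to know what to look for.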


Understanding Managed Services

Generic logging solutions have to be configured correctly. They bring much more flexibility, but the cost of getting everything set up right can be high. And costs don’t just mean license costs; there are excellent open-source solutions. Costs mean the time it takes to make all these configurations. It also means that you will probably miss a few errors until you’ve finally fine-tuned all configurations.

Dashbird also eases the pain of configuring important alerts for you. This doesn’t just mean that the UI is easier to understand than what AWS offers; it means that Dashbird comes with out-of-the-box pre-configured alerts right from the start. Dashbird understands AWS services; it’s not just a generic monitoring solution you staple onto your Lambda functions. These alerts include Dashbird’s know-how and suggestions on how to improve your system’s health and performance, gathered and built over the years from monitoring thousands of serverless systems running in production.

The mix of a hand-tailored monitoring solution for specific serverless AWS services and the know-how from production systems makes Dashbird more comprehensive than other monitoring systems. This also means that Dashbird needs less manual configuration than a generic solution that has to be manually fitted to different services.

Dashbird will not just provide you with simple serverless failure detection but also alert you when your resources are about to fail. This way, you can start to work on a solution before there is even an incident.
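One way to picture an early warning of this kind is to compare what a Lambda invocation actually used with its configured limits. The sketch below parses a `REPORT` log line and flags invocations that come close to the timeout or the memory limit. It is an illustration, not how Dashbird works internally; the 90% threshold, the timeout value, and the sample lines are assumptions.

```python
import re
from typing import List

# Matches the duration and memory fields of a Lambda REPORT log line.
REPORT_RE = re.compile(
    r"Duration: (?P<duration_ms>[\d.]+) ms.*?"
    r"Memory Size: (?P<memory_mb>\d+) MB.*?"
    r"Max Memory Used: (?P<used_mb>\d+) MB",
    re.DOTALL,
)


def early_warnings(report_line: str, timeout_ms: float, threshold: float = 0.9) -> List[str]:
    """Return warnings for an invocation that ran close to its limits."""
    match = REPORT_RE.search(report_line)
    if match is None:
        return []
    warnings = []
    if float(match["duration_ms"]) / timeout_ms >= threshold:
        warnings.append("duration close to timeout")
    if int(match["used_mb"]) / int(match["memory_mb"]) >= threshold:
        warnings.append("memory close to limit")
    return warnings


# Made-up REPORT lines; the configured timeout of 3000 ms is assumed.
RISKY = ("REPORT RequestId: 1a2b\tDuration: 2850.12 ms\tBilled Duration: 2851 ms\t"
         "Memory Size: 128 MB\tMax Memory Used: 121 MB")
HEALTHY = ("REPORT RequestId: 3c4d\tDuration: 120.50 ms\tBilled Duration: 121 ms\t"
           "Memory Size: 128 MB\tMax Memory Used: 64 MB")

RISKY_WARNINGS = early_warnings(RISKY, timeout_ms=3000)
HEALTHY_WARNINGS = early_warnings(HEALTHY, timeout_ms=3000)
```

Neither invocation failed, but the first one is a timeout or out-of-memory incident waiting to happen.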

Automatically detects failures from Lambda invocations

Integrating with the Tools of Your Choice

Using an appropriate channel and format when alerting your developers is the other side of the coin. Integration with the services developers use on a daily basis is also important for alerting. Sure, sometimes it’s enough to send an email, but Dashbird also offers Slack, AWS SNS, and webhook integrations.

This way, an alert will find developers where they are currently active, and they can respond right away, not just when they check their email two hours after a problem arises.
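As a rough sketch of what a webhook-based alert looks like on the wire, the example below builds a POST request for a Slack incoming webhook, which accepts a JSON body with a `text` field. The webhook URL and the shape of the `alert` dictionary are hypothetical; real alert payloads will look different.

```python
import json
import urllib.request
from typing import Dict


def build_slack_request(webhook_url: str, alert: Dict[str, str]) -> urllib.request.Request:
    """Build (but do not send) a POST request for a Slack incoming webhook."""
    text = f"{alert['severity'].upper()}: {alert['resource']} - {alert['message']}"
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 5.0) -> int:
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Hypothetical webhook URL and alert fields, for illustration only.
WEBHOOK_URL = "https://hooks.slack.com/services/T000/B000/XXXX"
ALERT = {
    "severity": "critical",
    "resource": "checkout-handler",
    "message": "Task timed out after 3.00 seconds",
}
REQUEST = build_slack_request(WEBHOOK_URL, ALERT)
# send(REQUEST)  # uncomment with a real webhook URL
```

Putting the resource name and the error message into the notification itself lets the developer act without opening a dashboard first.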

You don’t want to pay for monitoring just to get an email from a customer who tells you that something is wrong just because an alert hasn’t been noticed.

Integration also allows you to automate responses to specific problems. As the creator of your architecture, you know best what to do when your traffic spikes. Maybe you need to provision more capacity, or maybe you just need to tell your customer that their quota for the month is reached and they will now be throttled.
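A minimal sketch of such an automated response could be a Lambda function subscribed to the SNS topic that receives the alerts. The SNS event structure (`Records[].Sns.Message`) is what AWS delivers to Lambda; the alert format inside the message, the alert types, and the two response functions are assumptions made up for this example.

```python
import json
from typing import Any, Callable, Dict, List


def provision_capacity(alert: Dict[str, Any]) -> str:
    # Placeholder: a real handler would call the relevant AWS API here.
    return f"scale up {alert['resource']}"


def notify_quota_reached(alert: Dict[str, Any]) -> str:
    # Placeholder: a real handler would email or message the customer here.
    return f"notify {alert['customer']}: quota reached, requests will be throttled"


# Hypothetical alert types mapped to the response you decided on.
ACTIONS: Dict[str, Callable[[Dict[str, Any]], str]] = {
    "traffic_spike": provision_capacity,
    "quota_exceeded": notify_quota_reached,
}


def handler(event: Dict[str, Any], context: Any = None) -> List[str]:
    """Lambda entry point: route each SNS-delivered alert to its response."""
    results = []
    for record in event.get("Records", []):
        alert = json.loads(record["Sns"]["Message"])
        action = ACTIONS.get(alert.get("type"))
        if action is None:
            results.append(f"no automated response for {alert.get('type')}")
        else:
            results.append(action(alert))
    return results


# Made-up SNS event carrying two alerts.
SAMPLE_EVENT = {
    "Records": [
        {"Sns": {"Message": json.dumps({"type": "traffic_spike", "resource": "orders-table"})}},
        {"Sns": {"Message": json.dumps({"type": "quota_exceeded", "customer": "acme"})}},
    ]
}
RESULTS = handler(SAMPLE_EVENT)
```

The routing table is the part only you can write: the monitoring tool tells you what happened, and you decide what it means for your business.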

Conclusion

When choosing a monitoring solution for your serverless architecture, it’s of utmost importance to focus on its alerting features. In recent years, function-as-a-service solutions like AWS Lambda were sold as the central aspect of building serverless systems, but serverless is so much more. Managed services like S3, DynamoDB, Cognito, Aurora, and SQS help you reduce the time and personnel needed to get new features out to the market frequently.

But these managed services don’t come without a cost. On the one hand, you don’t have to maintain your servers anymore; but on the other hand, you can’t SSH into these services and install whatever monitoring client you want.

Managed services are also an opportunity for monitoring providers. If they can break down the infrastructure you use to build your systems to a few concrete and well-known services, the help they deliver can also be less abstract. You don’t have to think about all the possible ways your system can and probably will fail; instead, you can rely on the know-how of a monitoring provider like Dashbird to catch problems for you without the painful learning experience usually connected to serverless failure detection and monitoring.
