Early Detection of Potential Sources of Failure in Serverless

We recently wrote about why serverless applications fail and how to design resilient architectures. Being able to detect early-stage failure indicators can be invaluable.

With proper monitoring, developers stop waiting for the system to crash and instead manage resource allocation and architecture design proactively, avoiding bottlenecks and performance degradation before they reach users. The result is satisfied end users, greater trust from the executive team, and a manageable stream of support requests for customer care agents.

The main challenge is that, even though serverless abstracts away most traditional infrastructure management, we still have numerous architectural complexities on our hands.

The number and variety of cloud resources we use in serverless applications keep growing. Each service has its own intricacies and limitations, and every interaction between services adds further complexity. It is difficult to track everything and stay on top of all aspects of such an architecture.

SQS queue monitoring

Take queues, for instance. They have to be checked constantly for high delivery latency, an unusual accumulation of messages, and similar symptoms. Compute services, such as AWS Lambda functions or ECS containers, have to be monitored for a variety of possible faults, such as high resource (e.g. memory) consumption.
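As a rough illustration of the two checks just described, here is a minimal Python sketch. It is not how Dashbird implements its detection; the function names and thresholds are hypothetical. It assumes the queue-depth samples come from something like the SQS CloudWatch metric ApproximateNumberOfMessagesVisible, and that memory usage is read from the REPORT line AWS Lambda writes to its logs after each invocation.

```python
import re
from statistics import mean

# Hypothetical thresholds: tune them for your own workload.
BACKLOG_GROWTH_FACTOR = 3.0  # latest depth compared with the recent average
MEMORY_USAGE_LIMIT = 0.9     # fraction of the configured memory


def backlog_is_unusual(depths, factor=BACKLOG_GROWTH_FACTOR):
    """Return True when the latest queue depth far exceeds the recent average.

    `depths` is a chronological list of queue-depth samples.
    """
    if len(depths) < 2:
        return False  # not enough history to compare against
    *history, latest = depths
    baseline = mean(history)
    # max(baseline, 1) avoids flagging tiny changes on a nearly empty queue.
    return latest > factor * max(baseline, 1)


# Matches the memory fields of a Lambda REPORT log line.
REPORT_PATTERN = re.compile(
    r"Memory Size: (?P<size>\d+) MB\s+Max Memory Used: (?P<used>\d+) MB"
)


def memory_usage_ratio(report_line):
    """Return used/configured memory from a REPORT line, or None if absent."""
    match = REPORT_PATTERN.search(report_line)
    if match is None:
        return None
    return int(match["used"]) / int(match["size"])


def memory_is_high(report_line, limit=MEMORY_USAGE_LIMIT):
    """Return True when an invocation used more than `limit` of its memory."""
    ratio = memory_usage_ratio(report_line)
    return ratio is not None and ratio > limit
```

Comparing the latest sample with a rolling baseline, rather than with a fixed number, keeps the check usable for queues of very different sizes. A production detector would likely also account for daily traffic patterns.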

ECS Monitoring

It is also important to have all monitoring metrics, performance data and architectural insights in one place, so that developers can discover and act upon potential issues efficiently. Most monitoring platforms, though, still apply a server-based mindset, which doesn't fit the serverless paradigm well.

Cloud resources cannot be monitored in isolation. We must start thinking about our serverless backends as a whole, almost as living organisms. Otherwise, issues arising from the interaction of services are difficult to track and detect early on.
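To make the idea of monitoring services together more concrete, the sketch below relates a queue's latency to the health of the function consuming it. The `PipelineSnapshot` shape, the `diagnose` function and the 300-second limit are all assumptions made for illustration. The fields could be populated from CloudWatch metrics such as ApproximateAgeOfOldestMessage for SQS and Throttles and Errors for Lambda.

```python
from dataclasses import dataclass


@dataclass
class PipelineSnapshot:
    """Metrics for one queue and its consumer over the same time window."""
    oldest_message_age_s: float  # age of the oldest message still in the queue
    consumer_throttles: int      # throttled invocations of the consumer
    consumer_errors: int         # failed invocations of the consumer


def diagnose(snapshot, max_age_s=300):
    """Explain a backlog by looking at the queue and its consumer together."""
    if snapshot.oldest_message_age_s <= max_age_s:
        return "healthy"
    if snapshot.consumer_throttles > 0:
        return "backlog likely caused by consumer throttling"
    if snapshot.consumer_errors > 0:
        return "backlog likely caused by consumer errors"
    return "backlog with a healthy consumer: check producer volume"
```

Looked at alone, the queue only reports that messages are getting old. Combined with the consumer's metrics, the same signal points to a probable cause.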

Dashbird comes with dozens of built-in algorithms for early detection and alerting of issues in serverless platforms: software exceptions, infrastructure faults, and platform errors. Try it free for 14 days, no credit card required.

