Getting Started with Dashbird
Dashbird is an observability, debugging, and intelligence platform designed to help serverless developers build, operate, improve, and scale modern cloud applications on AWS quickly, securely, and with ease. It’s free to use for up to 1M invocations and doesn’t require any code changes.
Serverless architecture fundamentally changes how we build, deploy, and maintain software. Although AWS CloudWatch can be used to monitor cloud resources, it was not designed for the challenge and doesn’t have all the necessary data readily available. Dashbird fills the gaps left by CloudWatch and other traditional monitoring tools by offering enhanced out-of-the-box monitoring, operations, and actionable insights tools for architectural improvements, all in one place.
Dashbird’s approach is fairly simple: all the mission-critical data of your serverless system is placed in a single dashboard, giving you a bird’s-eye view of activity across the entire system. You also get immediate alerts on any errors or warnings and are pointed to the exact point of failure so it can be resolved quickly.
The 3 core pillars of Dashbird are:
- Real-time end-to-end serverless observability
- Automatic Failure Detection
- Continuous Well-Architected reports on your entire infrastructure
Real-time full serverless observability for AWS apps
Dashbird monitors multiple AWS cloud components, such as Lambda functions, API Gateways, SQS queues, ECS containers, DynamoDB tables, Step Functions state machines, Kinesis data streams, Relational Database Service (RDS), Elastic Load Balancers (ELB), HTTP API Gateway, OpenSearch, and Simple Notification Service (SNS).
The Main Dashboard: Dashbird automatically collects monitoring data from your AWS cloud environment. The collected data (logs, metrics, and traces) is then centralized and summarized on a single dashboard, giving you real-time observability.
The main dashboard shows the information you need to closely monitor system activity. It graphs total invocations, total errors, total warnings, total cost incurred, and billed duration. It also shows which service produced which errors or warnings, which alarms went off, the most frequently occurring errors, and the most actively used functions. From the dashboard, you can navigate to the core of a problem and take the necessary action to resolve it.
The Inventory Service: the inventory service is a single-pane-of-glass view of all the cloud resources in your system. For a specific resource, you get complete data on its logs, metrics, traces, errors, and anomalies. Resources are grouped and organized by type. You also get a metrics section for time series data, a list of executions (for Lambda functions) and errors, and any actionable insights. This makes observability across the entire serverless cloud stack effective and effortless.
Log Search and Analytics: Dashbird includes a powerful log search module powered by Elasticsearch. You can search across the logs of multiple resources at once and filter results by keyword, resource, project, status (such as error or success), and date range.
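To make the filter dimensions above concrete, here is a minimal in-memory sketch of combining keyword, resource, status, and date-range filters. It is only an illustration: the `LogEntry` shape, the `search_logs` function, and the sample resource names are hypothetical, and this is not Dashbird's API or how its Elasticsearch-backed search is implemented.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List, Optional, Set


@dataclass
class LogEntry:
    # Hypothetical record shape; not Dashbird's actual data model.
    timestamp: datetime
    resource: str
    status: str  # e.g. "error" or "success"
    message: str


def search_logs(
    entries: Iterable[LogEntry],
    keyword: Optional[str] = None,
    resources: Optional[Set[str]] = None,
    status: Optional[str] = None,
    start: Optional[datetime] = None,
    end: Optional[datetime] = None,
) -> List[LogEntry]:
    """Return the entries that match every filter that was supplied."""
    matches = []
    for entry in entries:
        # Case-insensitive substring match on the log message.
        if keyword is not None and keyword.lower() not in entry.message.lower():
            continue
        if resources is not None and entry.resource not in resources:
            continue
        if status is not None and entry.status != status:
            continue
        # The date range is inclusive on both ends.
        if start is not None and entry.timestamp < start:
            continue
        if end is not None and entry.timestamp > end:
            continue
        matches.append(entry)
    return matches


# Made-up sample data for illustration only.
logs = [
    LogEntry(datetime(2023, 1, 1, 12, 0), "orders-fn", "error", "Task timed out after 3.00 seconds"),
    LogEntry(datetime(2023, 1, 1, 12, 5), "orders-fn", "success", "Order stored"),
    LogEntry(datetime(2023, 1, 2, 9, 30), "billing-fn", "error", "Timeout calling payment API"),
]
hits = search_logs(logs, keyword="time", resources={"orders-fn", "billing-fn"}, status="error")
```

Filters left as `None` are simply skipped, so the same function covers a single-keyword search as well as a fully qualified query.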
Resource Groups: Resource Groups let you group and organize several resources that belong to the same business use case. You can then analyze and debug all the resources in a group as a single unit, which is much harder when each resource is debugged individually. A custom metrics board is created for each resource group or project so you can see system-wide metrics.
Dashbird started out as an observability platform focused on AWS Lambda, and to this day Lambda observability is our bread and butter. Unlike AWS CloudWatch, Dashbird individualizes each invocation log and includes metrics and traces to make it easier to debug potential issues or performance bottlenecks.
Aggregated metrics are also provided for each function. Detailed statistics include average, minimum, maximum, and 99th percentile.
Multiple dimensions are aggregated for each function:
- Memory utilization
These metrics support not only function health analysis but also resource and cost improvements.
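As a rough illustration of the statistics mentioned above (average, minimum, maximum, and 99th percentile), the sketch below summarizes a list of samples for one dimension. The `summarize` function and the sample values are hypothetical, and the nearest-rank percentile method is an assumption; the text does not say which percentile method Dashbird uses.

```python
from statistics import mean


def summarize(samples):
    """Return the average, minimum, maximum, and 99th percentile of a list of numbers."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # Nearest-rank percentile (an assumed method): rank = ceil(0.99 * n),
    # computed with integer arithmetic to avoid floating-point rounding.
    rank = (99 * len(ordered) + 99) // 100
    return {
        "avg": mean(ordered),
        "min": ordered[0],
        "max": ordered[-1],
        "p99": ordered[rank - 1],
    }


# Made-up memory utilization samples (percent) for a single function.
memory_pct = [41.0, 43.5, 39.8, 88.2, 42.1]
stats = summarize(memory_pct)
```

With only a handful of samples the 99th percentile coincides with the maximum; the two diverge once a function has hundreds of invocations in the aggregation window.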
Dashbird continuously assesses your current system architecture in real time and benchmarks it against widely accepted architectural best practices. It then generates a report showing where your system stands in each domain of the Well-Architected Framework, along with recommendations for addressing the shortcomings it finds.
The assessment covers the five pillars of the Well-Architected Framework, including:
- Performance Efficiency
- Operational Excellence
- Cost Optimization
Dashbird uses over 100 insight rules to find opportunities for architectural improvement in your existing system. Checks are categorized by criticality and vertical, giving you a structured overview of the findings on a single pane of glass.
Insights are automatically generated when:
- Anomalies that make your infrastructure likely to fail are detected (such as increased latency, an increased error rate, or memory usage close to the limit)
- Our system identifies opportunities for improvement (such as unused resources or lack of security practices)
All of the checks are also published in the Events Library, with details on intervals, conditions, reasoning, and for some insights, remedy steps.
Error Tracking and Alerting
Dashbird helps you track errors in real time and sends proactive alerts by email, Slack, webhooks, and SNS as soon as issues are detected in your serverless stack.
Dashbird has automated issue detection algorithms so that developers don’t have to worry about what they should monitor.
Dashbird automatically detects all types of application errors and exceptions in every runtime supported by AWS Lambda: Node.js, Python, Java, Ruby, Go, and .NET. We also monitor errors related to the Lambda platform and its limits, such as timeouts and out-of-memory errors.
Other cloud components also have their own set of monitors. For example, SQS queues are checked for a growing number of pending messages, DynamoDB tables are checked for throttling and capacity consumption, and ECS containers have their resource usage tracked (e.g. memory and CPU utilization).
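One simple way to think about a "growing number of pending messages" check is to look for a strictly increasing queue depth over the most recent samples. The sketch below shows that idea only; the `is_backlog_growing` function, its `min_samples` parameter, and the sample depths are hypothetical and do not describe Dashbird's actual detection logic.

```python
def is_backlog_growing(depths, min_samples=5):
    """Return True if the last `min_samples` queue depths are strictly increasing."""
    if len(depths) < min_samples:
        # Not enough data to judge a trend.
        return False
    recent = depths[-min_samples:]
    # Compare each reading with the one that follows it.
    return all(later > earlier for earlier, later in zip(recent, recent[1:]))


# Made-up pending-message counts sampled at regular intervals.
queue_depths = [12, 10, 15, 22, 40, 75, 130]
growing = is_backlog_growing(queue_depths)
```

A production check would usually also tolerate small dips and consider how fast consumers drain the queue; this version is deliberately strict to keep the idea visible.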
Dashbird also integrates with AWS X-Ray, so that AWS Lambda functions’ logs can be analyzed in connection with application traces and errors in a single interface.
With Dashbird you can monitor each function’s behavior with customized policies based on performance and resource usage.
For example, an incident can be raised when one or more Lambda functions start using more than 90% of memory, on average, and the situation persists for a period of 15 minutes.
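The example policy above can be sketched as a small check over per-minute readings. This is an illustration under stated assumptions: it assumes one average memory-utilization reading per minute, and the `breaches_policy` function and its parameter names are hypothetical rather than part of Dashbird's configuration.

```python
def breaches_policy(per_minute_avg, threshold=90.0, window_minutes=15):
    """Return True if the most recent `window_minutes` readings all exceed `threshold`.

    `per_minute_avg` is assumed to hold one average memory-utilization
    reading (in percent) per minute, oldest first.
    """
    if len(per_minute_avg) < window_minutes:
        # The condition cannot have persisted for the full window yet.
        return False
    recent = per_minute_avg[-window_minutes:]
    return all(value > threshold for value in recent)


# Made-up readings: 5 healthy minutes followed by 15 minutes above 90%.
readings = [62.0] * 5 + [93.5] * 15
should_raise_incident = breaches_policy(readings)
```

Requiring the whole window to stay above the threshold avoids raising an incident on a single short spike.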
Under the Hood of Your Serverless Stack
Dashbird requires zero instrumentation (no code changes), and you can start working with your data immediately after connecting your AWS account. Dashbird only requires read-only access, so your data stays safe and secure within your Amazon account. Lambda costs, execution time, and latency are not affected.
It takes only 2 minutes to connect Dashbird to your AWS account privately. Dashbird then automatically starts monitoring your whole serverless stack, including CloudWatch log groups for Lambda logs, and stitches it together in the Dashbird app.