Dashbird currently uses logs for everything. This gives a real-life overview of what is happening with your Lambda functions, plus the ability to dive into a single execution and see all of its logs.
Fetching logs is also the cheapest way to gather data, costing only $0.010 per GB.
With this approach, the Dashbird experience is low friction and high granularity, at the price of a slight delay. Users don’t need to make any code changes but still get a good overview, with the possibility to go very deep into a single function and see all the data they need.
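To give a feel for how much per-invocation detail a log line carries, here is a minimal Python sketch that parses the REPORT line AWS Lambda writes at the end of every invocation. It is only an illustration of the log-based idea, not Dashbird’s actual pipeline; the function name, the sample line and its request ID are made up for the example.

```python
import re

# Shape of the REPORT line that AWS Lambda writes at the end of an invocation.
REPORT_PATTERN = re.compile(
    r"REPORT RequestId: (?P<request_id>\S+)\s+"
    r"Duration: (?P<duration_ms>[\d.]+) ms\s+"
    r"Billed Duration: (?P<billed_ms>[\d.]+) ms\s+"
    r"Memory Size: (?P<memory_mb>\d+) MB\s+"
    r"Max Memory Used: (?P<max_memory_mb>\d+) MB"
)


def parse_report_line(line):
    """Return per-invocation stats from a REPORT line, or None if it does not match."""
    match = REPORT_PATTERN.search(line)
    if match is None:
        return None
    return {
        "request_id": match["request_id"],
        "duration_ms": float(match["duration_ms"]),
        "billed_ms": float(match["billed_ms"]),
        "memory_mb": int(match["memory_mb"]),
        "max_memory_mb": int(match["max_memory_mb"]),
    }


# Made-up sample line, used only to exercise the parser.
sample = (
    "REPORT RequestId: 3f8a1c2e-example\tDuration: 102.25 ms\t"
    "Billed Duration: 200 ms\tMemory Size: 128 MB\tMax Memory Used: 45 MB"
)
stats = parse_report_line(sample)
```

Because every invocation produces such a line, duration and memory usage are available per execution without touching the function’s code.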
Learn more about all the benefits and features Dashbird offers.
Datadog uses the StatsD protocol and metrics to capture information about Lambda functions. StatsD (or DogStatsD) is a time-series metrics protocol that lets you push custom events (for example, "N tasks added") and then display the data in the Datadog environment. Using the StatsD protocol means that you have to use the Datadog library in your code and push the metrics manually.
Datadog’s monitoring is based on metrics: it fetches all the required metrics for your Lambda functions and shows them on its dashboard. This approach is also low friction, but it has very low granularity (metrics are gathered and aggregated into 1-minute intervals). It’s a good way to get a nice overview of your Lambda functions, but it won’t show you the real problems behind the numbers.
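As a rough illustration of what "pushing a metric manually" involves, the sketch below builds a datagram in the DogStatsD text format (name:value|type, optionally followed by |#tags) and sends it over UDP. It is hand-rolled for clarity and is not the Datadog library; the metric name, the tag, and the assumption that an agent listens on 127.0.0.1:8125 are all made up for the example.

```python
import socket


def format_metric(name, value, metric_type="c", tags=None):
    """Build a DogStatsD datagram: <name>:<value>|<type>[|#tag1,tag2]."""
    datagram = f"{name}:{value}|{metric_type}"
    if tags:
        datagram += "|#" + ",".join(tags)
    return datagram


def send_metric(datagram, host="127.0.0.1", port=8125):
    """Send one datagram over UDP, fire and forget. Host and port are assumptions."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(datagram.encode("utf-8"), (host, port))


# Hypothetical counter: three tasks were added by a function called "my-handler".
datagram = format_metric("tasks.added", 3, "c", ["function:my-handler"])
# datagram is "tasks.added:3|c|#function:my-handler"
```

Calling `send_metric(datagram)` inside the handler would push the counter; the point is that each such call is code you have to write and maintain yourself.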
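A small Python example shows why aggregation can hide problems. The numbers are invented, and the sketch assumes the dashboard shows the average for each 1-minute bucket.

```python
from statistics import mean

# Hypothetical durations (ms) for 60 invocations within one minute:
# 59 fast invocations and one slow outlier.
durations_ms = [100.0] * 59 + [3000.0]

minute_average = mean(durations_ms)  # what a 1-minute average would report
slowest = max(durations_ms)          # what per-invocation data reveals

print(f"1-minute average: {minute_average:.1f} ms, slowest invocation: {slowest:.0f} ms")
```

The average lands just under 150 ms and looks healthy, while one user actually waited three seconds.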
Metrics also have a cost issue. Fetching metrics costs $0.01 per 1,000 requests. One Lambda function has 5 metrics (invocations, durations, throttles, errors, concurrent executions), which means that fetching all metrics for one Lambda function once a minute will cost the user 5 × 60 × 24 × 30 × 0.01 / 1000 = 2.16 USD per month. This is quite a high cost given that you only get a very high-level overview of ONE Lambda function. We have learned that the average Lambda stack consists of 60 functions, costing the user an average of 60 × 2.16 = 129.6 USD just to see a very high-level overview of their Lambda functions (without the possibility to dive deep into the errors and fix them quickly).
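The arithmetic can be checked with a few lines of Python. The sketch assumes one request per metric per minute and a 30-day month, which is how the figures above work out; the constant and function names are made up for the example.

```python
METRICS_PER_FUNCTION = 5
REQUESTS_PER_METRIC_PER_MONTH = 60 * 24 * 30  # one request per minute for 30 days
PRICE_PER_1000_REQUESTS_USD = 0.01


def monthly_cost_usd(function_count):
    """Monthly cost of fetching every metric once a minute for the given functions."""
    requests = function_count * METRICS_PER_FUNCTION * REQUESTS_PER_METRIC_PER_MONTH
    return requests * PRICE_PER_1000_REQUESTS_USD / 1000


cost_one_function = monthly_cost_usd(1)     # about 2.16 USD
cost_sixty_functions = monthly_cost_usd(60)  # about 129.6 USD
```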
IOpipe uses its own library to send data from the client’s functions. This essentially means that the client needs to add IOpipe’s code to their own codebase. That adds a certain overhead to Lambda functions, which can be a drawback.
The good thing about IOpipe’s approach is that it instruments the runtime the Lambda function is running in. This means you get a deep technical overview of each invocation, which can help users understand and fix potential bottlenecks in their functions.
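The general idea of library-based instrumentation can be sketched as a wrapper around the handler. This is not IOpipe’s API: the `instrument` decorator and the `collected_reports` list are hypothetical stand-ins, and a real library would send each report to its backend instead of appending to a list.

```python
import functools
import time

collected_reports = []  # hypothetical stand-in for a call to a monitoring backend


def instrument(handler):
    """Wrap a Lambda-style handler and record duration and errors per invocation."""
    @functools.wraps(handler)
    def wrapper(event, context):
        started = time.perf_counter()
        error = None
        try:
            return handler(event, context)
        except Exception as exc:
            error = repr(exc)
            raise  # the original error still reaches the caller
        finally:
            collected_reports.append({
                "function": handler.__name__,
                "duration_ms": (time.perf_counter() - started) * 1000,
                "error": error,
            })
    return wrapper


@instrument
def handler(event, context):
    return {"status": "ok", "echo": event}


result = handler({"id": 1}, None)
```

The wrapper runs inside every invocation, which is where both the extra detail and the extra overhead come from.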
IOpipe does not automatically fetch Lambda invocation logs, although it does offer a way to log through its system. This is not ideal: it’s one thing to attach an instrumentation library to your codebase, and another to rewrite your logging to fit that instrumentation.
That being said, because the data is pushed to IOpipe, there is no built-in maximum on the amount of data they can receive. The approach IOpipe has taken can be used with an unlimited number of invocations, assuming that their backend can handle the load.
IOpipe’s approach also means that they can only support a few languages. Currently they support Node, Python and Java.
Note: Given that all these tools are in continuous development, please let us know if any of this information is not up-to-date.