
Optimizing AWS Lambda for Production

Serverless was the MVP of last year and I’m betting it’s going to play a bigger role next year in backend development.

AWS Lambda is the most widely used and mature product in the Serverless space today, so I’m going to share some tips and best practices for building production-ready Lambdas with optimal performance and cost.

By the end of this article, you will have a workflow for continuously monitoring and improving your Lambda functions and getting alerted on failures.

1. Tracking Performance & Cost

Before making any changes, I recommend setting up performance tracking for your functions. Monitoring invocation counts, durations, memory usage and cost allows you to pinpoint issues and make effective decisions. I recommend using Dashbird, since it’s easy to set up and relies only on CloudWatch logs, meaning the service itself won’t slow down your functions (or add to your AWS bill).

[Image: Dashbird overview of Lambda functions showing invocation counts, errors, durations and memory usage]

In the above image, we can instantly notice some quick wins:

  1. alpha-prod-client-error-processor has the most invocations, meaning it has the highest optimization potential.
  2. The same function also has the most errors, so its reliability can be improved.
  3. alpha-prod-ping and alpha-prod-client-metrics have over-provisioned memory and can be optimized for cost.
  4. Some of the functions have invocation times over 20 seconds. This is slow and probably presents an optimization opportunity.
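
If you want to spot-check a single function yourself, the same underlying metrics are available straight from CloudWatch. Here’s a minimal boto3 sketch (the function name is just a placeholder):

```python
# A minimal sketch (not Dashbird's implementation) of pulling Lambda metrics
# straight from CloudWatch with boto3. The function name below is a placeholder.
from datetime import datetime, timedelta

import boto3

cloudwatch = boto3.client("cloudwatch")

def lambda_metric(function_name, metric_name, stat, hours=24):
    """Fetch one AWS/Lambda metric (e.g. Invocations, Errors, Duration) for the last `hours` hours."""
    now = datetime.utcnow()
    response = cloudwatch.get_metric_statistics(
        Namespace="AWS/Lambda",
        MetricName=metric_name,
        Dimensions=[{"Name": "FunctionName", "Value": function_name}],
        StartTime=now - timedelta(hours=hours),
        EndTime=now,
        Period=3600,              # one datapoint per hour
        Statistics=[stat],
    )
    return response["Datapoints"]

print(lambda_metric("alpha-prod-client-error-processor", "Invocations", "Sum"))
print(lambda_metric("alpha-prod-client-error-processor", "Duration", "Average"))
```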

So, let’s dive into the different strategies to turbocharge your functions…

2. Optimal memory provisioning

The amount of CPU allocated to your Lambda function is proportional to the memory provisioned for that function. A function with 256 MB of memory will have roughly twice the CPU of a 128 MB function. Memory size also affects cold start time linearly.

Since more memory also means a higher price per millisecond, developers have to choose whether to optimize for speed or for cost.
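
Changing the memory setting is a one-line configuration update. Here’s a sketch using boto3 (the function name is a placeholder):

```python
# Sketch: changing a function's memory allocation with boto3.
# "alpha-prod-ping" is a placeholder; CPU share scales with MemorySize.
import boto3

boto3.client("lambda").update_function_configuration(
    FunctionName="alpha-prod-ping",
    MemorySize=1536,   # in MB
)
```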

Here’s a good example of cost and speed optimization by Alex Casalboni:

RAM     | DynamoDB | SNS message sending | Billed time | Cost
128 MB  | 110 ms   | 120 ms              | 300 ms      | Cheapest & slowest
1024 MB | 80 ms    | 70 ms               | 200 ms      | Most expensive
1536 MB | 35 ms    | 45 ms               | 100 ms      | Optimal

In terms of cost, the 128 MB configuration would be the cheapest (but very slow!). Interestingly, the 1536 MB configuration would be both faster and cheaper than 1024 MB. This happens because 1536 MB costs 1.5 times as much per millisecond, but we pay for only half the billed time, which saves roughly 25% of the cost overall.
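
Here’s a back-of-the-envelope check of that claim, assuming the standard Lambda price of $0.00001667 per GB-second (the per-request charge is ignored):

```python
# Quick verification of the table above, assuming the standard Lambda
# price of $0.00001667 per GB-second; the per-request charge is ignored.
GB_SECOND_PRICE = 0.00001667

def invocation_cost(memory_mb, billed_ms):
    gb_seconds = (memory_mb / 1024) * (billed_ms / 1000)
    return gb_seconds * GB_SECOND_PRICE

for memory_mb, billed_ms in [(128, 300), (1024, 200), (1536, 100)]:
    print(f"{memory_mb:>4} MB, {billed_ms} ms billed -> ${invocation_cost(memory_mb, billed_ms):.10f}")

# 1024 MB / 200 ms -> ~$0.0000033340
# 1536 MB / 100 ms -> ~$0.0000025005 (about 25% cheaper, and twice as fast)
```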

3. Re-use Lambda containers

AWS reuses Lambda containers for subsequent calls if the next call is within 5 minutes. This allows developers to, for instance, cache resources and implement connection pooling.

Below is a checklist that you can go through when thinking in that direction:

  • Store and reference external configurations and dependencies locally after first execution.
  • Cache reusable resources with LRU methods.
    • Beware of the provisioned memory here, which is 1536MB at most!
  • Avoid re-initializing variables/objects on every invocation. Instead, use static initialization/constructors, global/static variables and singletons.
  • Keep alive and reuse connections (HTTP, database, etc.) that were established during the first invocation (see the sketch after this list).
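
Here’s a minimal sketch of connection reuse, assuming a PostgreSQL database reached through psycopg2 and a DATABASE_URL environment variable (both are just examples; any client with a reusable connection works the same way). Everything outside the handler runs once per container, not once per invocation:

```python
# Connection reuse across warm invocations. psycopg2, DATABASE_URL and the
# "events" table are placeholders for this example.
import os

import psycopg2

# Created on cold start, then reused by every warm invocation of this container.
_connection = None

def _get_connection():
    global _connection
    if _connection is None or _connection.closed:
        _connection = psycopg2.connect(os.environ["DATABASE_URL"])
    return _connection

def handler(event, context):
    conn = _get_connection()          # warm invocations reuse the connection
    with conn.cursor() as cursor:
        cursor.execute("SELECT count(*) FROM events")
        (count,) = cursor.fetchone()
    return {"event_count": count}
```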

Avoid using recursive code in your Lambda function, where the function calls itself until some arbitrary criterion is met. This could lead to an unintended volume of invocations and runaway costs.

4. Log-based monitoring and error handling

With servers, collecting performance metrics and tracking failed executions is normally done by an agent that gathers telemetry and error information and sends it over HTTP. With Lambdas, this approach can slow down functions and, over time, add quite a bit of cost. Not to mention the extra overhead of adding (and maintaining) third-party agents across a possibly large number of Lambda functions.

The great thing about Lambda functions is that all performance metrics and logs are sent to AWS CloudWatch. CloudWatch itself is not the ideal place to observe your system and set up error handling, but there are services that work on top of it and do a good job of providing visibility into your services.
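
Anything your function writes to stdout lands in its CloudWatch log group automatically, so structured log lines are cheap to add. A minimal sketch (the field names are only an example):

```python
# Structured error logging from a Lambda handler. Printed JSON lines end up
# in the function's CloudWatch log group, where log-based tools can parse them.
import json
import time

def handler(event, context):
    started = time.time()
    try:
        return do_work(event)
    except Exception as exc:
        # One JSON line per failure is easy to filter for in CloudWatch Logs.
        print(json.dumps({
            "level": "error",
            "function": context.function_name,
            "request_id": context.aws_request_id,
            "error": str(exc),
            "duration_ms": int((time.time() - started) * 1000),
        }))
        raise

def do_work(event):
    # Placeholder for the real business logic.
    return {"ok": True}
```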

Dashbird is a log-based monitoring and error handling solution for AWS Lambda functions, which is perfect for observing all layers of your Serverless architecture and getting alerted about any failures in your service.

Conclusion

There’s a lot of room to optimize your Serverless stack, and it all starts with knowing the right ways to do it and being able to locate the issues. I recommend following the instructions above and measuring the difference in performance after making changes to your Lambda functions. Keep in mind that performance is especially important for API endpoints and functions with high execution volumes.

I hope you enjoyed reading this article. If you have some tips that were not mentioned here, don’t hesitate to let me know in the comments.
