How to handle AWS Lambda cold starts

Cold starts and AWS Lambda go hand in hand. But are there any ways to reduce the impact of cold starts?

Smarter people than I once said that you should do the things you don’t like first so that you can enjoy the things you do later. Nobody wants cold starts. They’re annoying, and they leave a constant itch in our brains. The serverless world would be a much better place to work in if they weren’t there. A great serverless advocate said something really smart about cold starts:

You can’t live with them, you can’t live without them – Britney Spears, referring to cold starts

Basically, what that means is that for Lambda to scale your application as seamlessly as it does, new containers need to be created. But as you can imagine, only a finite number of containers can run at any given time, so old ones need to be deleted to make room for the new ones. That’s how cold starts happen. Whenever a new container is created, it takes some time to spin up, or warm up, hence the term cold start.
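To make the container lifecycle concrete, here is a minimal sketch of how a function could notice that it is running in a fresh container. It assumes that code at module scope runs once per container while the handler runs once per invocation. The `isColdStart` flag and the shape of the return value are made up for illustration.

```javascript
// Module scope runs once, when a new container loads this file,
// so the flag starts out true in every fresh container.
let isColdStart = true

// Hypothetical handler that reports whether this invocation
// was the first one served by its container.
const handler = async (event) => {
  const coldStart = isColdStart
  // Every later invocation in the same container is a warm one.
  isColdStart = false
  return { coldStart, message: 'Hello from Lambda' }
}

exports.handler = handler
```

The first invocation in a container returns `coldStart: true`, and every following invocation in that same container returns `coldStart: false`.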

Basic Fighting Rules

The general impression is that cold start latencies are not very high after all, meaning that most applications can tolerate them just fine. If your situation is less forgiving, there are ways to keep function instances warm, which reduces how often cold starts happen. The approach is the same for all providers: once every “x” minutes, make an artificial call to the function to prevent its instance from expiring. The implementation details differ because the expiration policies differ between providers. For applications with a higher load profile, fire several parallel “warming” requests to make sure that enough instances are kept warm.
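The parallel warming idea can be sketched roughly as follows. This is only an illustration: `warmInstances`, the `warmer` marker field, and the injected `invoke` function are hypothetical names, and `invoke` stands in for whatever call actually triggers your function. The requests only land on separate instances if they overlap in time, so a real warming handler may need to hold each request briefly.

```javascript
// Fire `count` warming invocations at once so that several instances
// have to be running at the same time to serve them.
// `invoke` is a hypothetical injected function that returns a promise.
const warmInstances = async (invoke, count) => {
  const payloads = Array.from({ length: count }, (_, index) => ({
    warmer: true, // hypothetical marker field the handler would check for
    index
  }))
  // Start all invocations before waiting for any of them to finish.
  return Promise.all(payloads.map((payload) => invoke(payload)))
}

// Usage with a stand-in invoker that only echoes the payload.
const fakeInvoke = async (payload) => ({ status: 'warmed', payload })

warmInstances(fakeInvoke, 3).then((results) => {
  console.log(results.length) // prints 3
})
```

Passing the invoker in as an argument keeps the sketch independent of any particular SDK and makes it easy to try out locally.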

Cold Start Duration & Frequency Reduction

It is possible to cut down the time impact of cold starts by writing functions in interpreted languages. Cold start latency with Python or Node.js is well below a second. A compiled language such as Go also has low cold start latency. Choosing a higher memory setting for your functions is another way to go, since it also gives your function more CPU power. It also helps to avoid VPCs where you can. A function inside a VPC needs elastic network interfaces (ENIs), and creating them at cold start could take more than 10 seconds. AWS changed how Lambda attaches to VPCs in 2019 to reduce this penalty, so measure your own functions before ruling a VPC out.

By keeping your function warm, you’ll reduce the cold start frequency. Put simply, you send scheduled ping events to your functions to keep their instances alive and ready to serve requests. Amazon CloudWatch Events lets you trigger the functions at fixed intervals, so you’ll have a fixed number of AWS Lambda instances alive on a constant basis. Setting up a periodic cron job that triggers your function every 5 to 15 minutes will keep it warm.
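On the function side, a scheduled ping can be recognized and answered without running the real work. The sketch below assumes the ping arrives as a standard scheduled event, which carries `source: "aws.events"` and `detail-type: "Scheduled Event"`. The helper name `isScheduledPing` is made up for illustration.

```javascript
// Scheduled CloudWatch Events rules deliver events whose `source` is
// "aws.events" and whose `detail-type` is "Scheduled Event".
const isScheduledPing = (event) =>
  Boolean(event) &&
  event.source === 'aws.events' &&
  event['detail-type'] === 'Scheduled Event'

const handler = async (event) => {
  // Return early on a ping so the warm-up does not run the real work.
  if (isScheduledPing(event)) return 'warmed'
  // Otherwise proceed with the normal handler logic.
  return 'Hello from Lambda'
}

exports.handler = handler
```

This treats every scheduled event as a warming ping. If the function also does real scheduled work, a custom payload on the warming rule would be a safer way to tell the two apart.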

Is Handling Concurrent Cold Starts Possible?

You can choose between different plugins and modules for this. Lambda Warmer for Node.js lets you warm concurrent functions and choose the concurrency level you want. Lambda Warmer is compatible with both AWS SAM and the Serverless Framework. The Serverless Framework also has its own plugin, the Serverless WarmUp Plugin, which at the time of writing doesn’t support concurrent function warming.

Adding Lambda Warmer to your functions is simple. The call itself looks something along these lines:

const warmer = require('lambda-warmer')

exports.handler = async (event) => {
  // if a warming event
  if (await warmer(event)) return 'warmed'
  // else proceed with handler logic
  return 'Hello from Lambda'
}

A few practices will help you warm up your functions properly. Write handler logic that returns early on a warming event instead of running the whole function. Don’t invoke your functions more often than once every 300 seconds. And when you invoke your function, do it directly via Amazon CloudWatch Events.

Cold Starts Within Dashbird

After a short and straightforward sign-up process, you’ll be able to immediately observe and work with detailed data on your Lambdas, including the last invocation status, and filter invocations by cold start.

We already know that running a serverless infrastructure has significant benefits. It can save you a lot of money and time, but it can also confront you with problems like cold starts and latency. The Dashbird.io tool gives you a way to handle them and get the most out of your serverless application.

Dashbird’s tailing functionality offers near real-time insight into the functions you’re running and provides the logs and metrics for each invocation. One option really worth mentioning is that Dashbird can detect cold starts and retries in your Lambda invocations. In your invocation list, you’ll be able to see which invocations were retried and which ones were cold starts. It makes quite a big difference in navigating serverless.

Where does that leave us?

Cold starts are currently one of the biggest serverless issues, with no permanent solution, and we can only hope one will be found as soon as possible so that we can do our magic without the stress involved. We’ve covered some ways to fight them on different fronts, and that’s all we have for now until the providers give us a permanent fix.

Until then, we can share our experiences, ideas, and thoughts, and we highly recommend all our readers do the same. If you think you have cracked it, let us know about your solution in the comment box below. We should discuss it, since only together can we try to solve this puzzle. Or not?
