What are cold starts, why they happen and what to do about them
When a function is invoked, Lambda checks whether a microVM is already active. If an idle microVM is available, it is used to serve the incoming request. In this case there is no startup time, since the microVM was already up and had the code package in memory. This is called a warm start.
The opposite - having to provision a new microVM from scratch to serve an incoming request - is called a cold start.
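To make the distinction concrete, here is a minimal Python sketch of a common way to observe it from inside a function. It relies on module-level code running once per microVM, while the handler runs on every invocation. The names `_is_cold_start` and `_loaded_at` and the shape of the returned dictionary are illustrative assumptions, not part of any Lambda API.

```python
import time

# Module-level code runs once per microVM (when the code package is loaded),
# so this flag is True only for the first invocation that microVM serves.
_is_cold_start = True      # hypothetical name, for illustration only
_loaded_at = time.time()   # when this microVM loaded the code package


def handler(event, context):
    global _is_cold_start
    cold = _is_cold_start
    _is_cold_start = False  # every later invocation on this microVM is warm
    return {
        "cold_start": cold,
        "environment_age_s": round(time.time() - _loaded_at, 3),
    }
```

Logging the `cold_start` value on each invocation gives a rough picture of how often requests land on a freshly provisioned microVM.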
The total startup time depends on multiple factors. As a general rule, these are the most important ones:
Cold starts add to the overall execution time. For time-sensitive workloads, this can be a problem.
How often cold starts occur depends largely on how variable the application's demand is. For frequent, low-variability traffic, cold starts will hardly be an issue, because the application requires roughly the same number of microVMs most of the time. And since traffic is frequent (new requests every minute, for example), Lambda will find warm microVMs available for most invocations.
For applications with infrequent or highly variable traffic, the likelihood of cold starts increases considerably. Infrequent access means Lambda will terminate microVMs that have been idle for too long. High variability increases the chance of bursts of concurrent requests, which may require spinning up additional microVMs from scratch.
A simple solution is to invoke functions on a schedule (e.g. every 10 minutes). This makes Lambda keep some microVMs alive all the time. Developers commonly need to ensure warm starts for multiple concurrent requests, so the scheduled process has to issue multiple invocations in parallel in order to force Lambda into keeping multiple microVMs alive.
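A scheduled warmer along these lines could be sketched as follows. This is an illustration under assumptions rather than a reference implementation: the `{"warmer": true}` payload, the function name `my-function`, the concurrency of 5 and the helper names are all hypothetical. The only real API used is the boto3 Lambda client's `invoke` call, which is imported lazily so the parallel logic can be tried without AWS credentials.

```python
import json
from concurrent.futures import ThreadPoolExecutor


def warm(invoke, function_name, concurrency):
    """Fire `concurrency` warm-up invocations in parallel and collect the results."""
    payload = json.dumps({"warmer": True})  # hypothetical marker field
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [
            pool.submit(invoke, function_name, payload)
            for _ in range(concurrency)
        ]
        return [f.result() for f in futures]


def make_boto3_invoke():
    """Build an invoke callable backed by a single shared boto3 Lambda client."""
    import boto3  # imported lazily; requires AWS credentials at runtime

    client = boto3.client("lambda")

    def invoke(function_name, payload):
        response = client.invoke(
            FunctionName=function_name,
            InvocationType="RequestResponse",  # synchronous invocation
            Payload=payload.encode("utf-8"),
        )
        return response["StatusCode"]

    return invoke


def scheduled_handler(event, context):
    # Entry point for the scheduled warmer; name and count are placeholders.
    return warm(make_boto3_invoke(), "my-function", concurrency=5)
```

The parallel calls only reach different microVMs if they overlap in time, so in practice the target function may need to hold each warm-up request for a brief moment before returning.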
Beware that the scheduled warming invocations are charged like any other Lambda request. Since there is nothing to actually process, the function can return right after being invoked, which reduces the cost of the warm-up process.
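One way to keep warm-up invocations cheap is to have the handler recognize them and return immediately. The sketch below assumes the scheduled event carries a `warmer` field; that field name and the `process` placeholder are illustrative choices, not a Lambda convention.

```python
WARMUP_KEY = "warmer"  # hypothetical marker set by the scheduled warm-up event


def process(event):
    # Placeholder for the function's real work.
    return {"statusCode": 200, "body": "processed"}


def handler(event, context):
    # Warm-up request: skip all real work so the billed duration stays minimal.
    if isinstance(event, dict) and event.get(WARMUP_KEY):
        return {"warmed": True}
    return process(event)
```

Placing the check at the very top of the handler means a warm-up request never touches databases or downstream services.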
Another approach is to use traffic prediction modeling. By anticipating how many requests are likely to arrive in the next 30 minutes, for instance, it is possible to adjust the number of scheduled invocations accordingly. This also helps keep warming costs down.
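As a rough illustration of turning a forecast into a number of microVMs to keep warm, the sketch below applies Little's law (concurrency is roughly arrival rate times average duration). The function name, the `headroom` safety factor and the example figures are assumptions made for this example; the forecast itself would come from whatever prediction model is in use.

```python
import math


def warm_instances_needed(predicted_requests, window_seconds,
                          avg_duration_seconds, headroom=1.5):
    """Estimate how many microVMs to keep warm for a forecast window.

    Uses Little's law: concurrency ~= arrival rate * average duration.
    `headroom` is an assumed safety factor to absorb short bursts.
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    rate = predicted_requests / window_seconds  # requests per second
    return math.ceil(rate * avg_duration_seconds * headroom)


# Example: 9,000 requests forecast for the next 30 minutes, 0.5 s per request.
print(warm_instances_needed(9000, 30 * 60, 0.5))
```

The result can then be used as the number of parallel invocations the scheduled warm-up process issues for that window.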
There are open source projects to help with those two approaches: