As soon as you move your backend development to serverless, you will run into an issue called Cold Start: your API is awesome, scalable and normally also very fast, but every now and then it seems to take ages to respond (and from the user's perspective, 10 seconds really are ages).
By its nature, a serverless application is not always running: after it has been idle for a few minutes, the container is shut down. Restarting the container and reinitializing all the components of the app takes longer than simply executing your code. That is the Cold Start.
One approach to reducing cold starts is keeping your Lambda warm (this does not really solve the problem, though; read the article im-afraid-youre-thinking-about-aws-lambda-cold-starts-all-wrong by Yan Cui if you want to know more).
That means having a scheduler invoke the function at regular intervals to prevent the container from being destroyed.
As documented by Serverless, you can quickly configure a Scheduled Event, or you can configure the Serverless WarmUp Plugin.
What is the difference?
Schedule creates a CloudWatch Event that triggers your Lambda at the rate or cron expression you specify.
This means that each Lambda in your serverless.yml needs a scheduling event configuration like this:
- schedule:
    name: your-scheduled-rate-event-name # optional
    rate: rate(10 minutes)
This can get pretty messy to maintain, especially if you chose the Multiple Endpoints approach (where each resource points to a specific Lambda). It also means that every Lambda will have its own CloudWatch event, and AWS appears to impose a per-account limit on those:
Each AWS account can have up to 100 unique event sources of the CloudWatch Events - Schedule source type. Each of these can be the event source for up to five Lambda functions. That is, you can have up to 500 Lambda functions that can be executed on a schedule in your AWS account.
CloudWatchEvents
WarmUp also makes use of a CloudWatch Event, but that event is bound to a single Lambda created for the purpose. This warmer stores an array of all the Lambdas from your serverless.yml and, when triggered by CloudWatch, invokes them all in turn.
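To make the fan-out idea concrete, here is a minimal sketch of what such a warmer could look like. It is not the plugin's actual source: `warmAll` and the injected `invoke` function are hypothetical names, and `invoke` stands in for whatever call really triggers a Lambda, so the sketch runs without the AWS SDK.

```javascript
// Hypothetical sketch of a warmer; NOT the plugin's real implementation.
// `invoke` is an injected function (name, payload) => Promise. In a real
// warmer it would wrap the call that actually invokes a Lambda.
const WARMUP_SOURCE = 'serverless-plugin-warmup'

async function warmAll(functionNames, invoke, concurrency = 1) {
  // The payload carries the source so each handler can exit early.
  const payload = { source: WARMUP_SOURCE }
  const calls = []
  for (const name of functionNames) {
    // Fire `concurrency` parallel invocations per function so that
    // more than one container can be kept alive.
    for (let i = 0; i < concurrency; i++) {
      calls.push(invoke(name, payload))
    }
  }
  const results = await Promise.all(calls)
  return results.length // number of invocations performed
}
```

The point of the design is that a single scheduled event drives one warmer, which then pings every other function, instead of one scheduled event per function.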
In both cases the payload passed to the handler identifies the origin of the invocation, so you can exit immediately and avoid useless computation or errors.
Just remember to check the event source, like this for the WarmUp Plugin:
module.exports.handler = async (event, context) => {
  /** Immediate response for WarmUp plugin */
  if (event.source === 'serverless-plugin-warmup') {
    // An async handler responds with its return value, so no callback is needed
    return 'Lambda is warm!'
  }
  // do your stuff
}
and like this for the Schedule Event:
module.exports.handler = async (event, context) => {
  /** Immediate response for Ping from CloudWatch */
  if (event.source === 'aws.events' && event['detail-type'] === 'Scheduled Event') {
    // An async handler responds with its return value, so no callback is needed
    return 'Lambda is warm!'
  }
  // do your stuff
}
In our project we went for the Plugin since it allows easier and cleaner global configuration.
Our serverless.yml ended up looking like this:
custom:
  warmup:
    default:
      - production
    # lambdas in dev and staging environments can even freeze... who cares.
    schedule: 'cron(0/20 8-18 ? * MON-FRI *)' # the hours field takes whole hours only, evaluated in UTC
    # since our API is used only from an internal company web app, keep lambdas warm only every 20 minutes during office hours.
    prewarm: true # Run WarmUp immediately after a deployment
    concurrency: 2 # Warm up 2 concurrent instances
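To double-check what a schedule like this means in practice, the rule "every 20 minutes, hours 8 to 18, Monday to Friday" can be written out as a plain predicate. This is a sketch under two assumptions: the hours field is read as 8-18, and times are in UTC, which is how AWS evaluates schedule expressions. `isWarmupTick` is a hypothetical helper, not part of the plugin.

```javascript
// Returns true when `date` falls on a tick of the schedule
// "every 20 minutes, hours 8-18, Monday to Friday", evaluated in UTC.
function isWarmupTick(date) {
  const day = date.getUTCDay() // 0 = Sunday ... 6 = Saturday
  const hour = date.getUTCHours()
  const minute = date.getUTCMinutes()
  const isWeekday = day >= 1 && day <= 5
  const isOfficeHour = hour >= 8 && hour <= 18
  return isWeekday && isOfficeHour && minute % 20 === 0
}
```

Note that with an hour range of 8-18 the last tick of the day lands at 18:40 UTC, so remember to shift the range if your office is not on UTC.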
When you run `sls deploy`, make sure the plugin reads the configuration and logs the Lambdas that will be warmed and pre-warmed.
Afterward, you can check the AWS Console: alongside your Lambdas you will find another Lambda with the description Serverless WarmUP Plugin. If you open its code you will see all the Lambdas that will be warmed, and you can also make sure the CloudWatch Event is there and properly configured: Schedule expression: rate(5 minutes).
If the terminal output does not list your Lambdas, something went wrong, as it did in my first attempts: I was following the instructions in the GitHub repo and in this article, and found the information about per-function configuration somewhat conflicting, while the global configuration wasn't working at all.
Then I found this issue and realized that the plugin currently has problems releasing new versions: the code and README on master on GitHub are more up to date than the version on NPM, so the configuration must be done differently.
Until a new version is released, just stick to the docs on NPM; then, if you upgrade with npm or yarn, remember to refactor the configuration in your serverless.yml.
Hope this helps!
Top comments (9)
I don't get the entire concept of warming up your Lambdas. It just feels useless. A cold start happens every time there's a new request for which there is no alive and free container.
The warming up solves only one case that is a lot more rare than the other ones, and even then it doesn't fully solve it.
Warm up is supposed to solve only the cases when you don't have a single alive container, so that you have at least one container periodically created/kept alive. But what kind of flows do you have where you have zero traffic, very rarely need to handle exactly one request, and really care about the performance of that one request?
And even then, it doesn't fully solve it. For example imagine this scenario:
You have no alive containers. The warm up trigger happens, so we are in the middle of a cold start, initializing a container. During that time, that one request comes, there are no free containers, there is only one being created and it has a request in a queue. So AWS will start a new container for this new request. You just had two cold starts. Your warm up trigger didn't do anything except for costing money.
The main area where cold starts happen is concurrent requests. If you have 10 containers alive and suddenly get 30 requests at the same time, new containers will be created, and thus a bunch of cold starts. Warm up does nothing.
You should investigate your traffic and find better solutions to mitigate the cold start issue. For example, using one Lambda for multiple things could work (e.g. a GraphQL endpoint): if you have certain endpoints that get called very rarely (let's say forgot password) and you don't want those requests to be consistently slow, you can use one Lambda to handle both something that happens often and something that happens rarely. This way there will always be alive containers for the rare requests.
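The concurrency scenario in this comment can be put into numbers with a deliberately simplified model. It assumes each in-flight request occupies one container and that every warm container is idle when the burst arrives; `coldStarts` is a hypothetical helper for illustration only.

```javascript
// Simplified model: each concurrent request needs its own container,
// so every request beyond the number of warm, idle containers
// triggers a cold start.
function coldStarts(warmContainers, concurrentRequests) {
  return Math.max(0, concurrentRequests - warmContainers)
}

// The example from the comment: 10 alive containers, 30 simultaneous requests.
const burst = coldStarts(10, 30) // 20 requests hit a cold start
```

Under this model, keeping a container or two warm only helps when the burst is no larger than the number of warm containers.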
Hi, thanks for your detailed comment. I totally agree with all you wrote. In fact, I am convinced that warming up Lambdas is a serverless antipattern, and the points you mentioned are very well explained by the article I linked at the top of my post [probably I should make that more prominent]: theburningmonk.com/2018/01/im-afra...
Having said all that, under many circumstances the warm-up is the de facto solution for at least reducing the number of cold starts.
Our current application is only used during office hours; its usage is pretty random but with very limited cases of possible concurrent calls. Employees were noticing and complaining that "sometimes" the app was very slow, and keeping the Lambdas warm from 9 to 18 Mon-Fri (only some of them in our case, since we didn't really care about some specific endpoints/resources) "solved the problem".
Warmups are not a silver bullet nor the real effective solution. I wrote this post not to be an advocate of warmups but to show how it can be achieved and highlight the differences between the 2 approaches. :-)
Oh, the Yan Cui blog post has some great points. I actually hadn't considered a use case where you would trigger concurrent warm-ups in preparation for a predictable request spike.
And the plugin supports concurrent warm up, that's actually really nice.
Can you have multiple warmup configurations though? E.g. I know that people ask for reports in the morning, so I want to warm up X instances for the start of the day. But then everyone looks at memes during lunch, so I want to warm meme related functions at different hours and different concurrency. Can I have named warmup configurations and assign them to different functions, or I would need to have some kind of logical separation through different serverless.yaml files?
Yes. With the serverless plugin configuration you can define a global setting but also a per-function setting, so you could specify different schedules and different numbers of concurrent instances for the endpoints you like :-)
The yml stays the same, no need for multiple files. Just specify the warm-up config under your function definition (assuming you have different Lambdas for different endpoints).
I went down the warm-up path and noticed that I was still getting random spikes on Lambda calls. I opened a case with AWS and they definitively told me that they cannot guarantee a minimum amount of time that they will keep a container alive. They then suggested that Lambda might not be for me and that I should use EC2 or ECS to host my API. Have you found that warming up actually works with no spikes? It should be noted that I am only investigating API Gateway and Lambda, so the only traffic to my environment is the Scheduled Event warm-up triggers and my occasional use of my test app. I see it shutting down containers in under 5 minutes quite often.
Yes, you might get spikes despite the warm-up if you get concurrent calls (you can keep a container up, but if your Lambda needs to scale up, you will get a cold start when the new container is spun up).
Though, I wouldn't say EC2 solves the whole problem. It's true that the instance is always available and you don't get those spikes, but you have to think about provisioning the instance, and you might end up paying a lot just to have it running even if it is used once every hour. In the end, it comes down to costs and benefits: how much are you willing to pay to avoid a cold start (and to have an automatically scalable system)?
I recognize all that. I think it is a bummer that they can't say definitively that they will keep a container open for X minutes after its last call. Without that, it is really hard to forecast what your resources will look like. With a startup-type app that is getting hardly any traffic, almost every call is going to be a cold start. Your app will look extremely slow and your users are just going to get mad at the app. Oh well... On to experimenting with Fargate and ECS.
After upgrading Node for AWS Lambda from 6.10 to 8.10 and the warmup plugin to 4.0.0-rc.1, I am getting this error:
Serverless Error ---------------------------------------
Serverless plugin "serverless-plugin-warmup" not found. Make sure it's installed and listed in the "plugins" section of your serverless config file.
Get Support --------------------------------------------
Docs: docs.serverless.com
Bugs: github.com/serverless/serverless/issues
Issues: forum.serverless.com
Your Environment Information -----------------------------
OS: linux
Node Version: 8.10.0
Serverless Version: 1.35.1
Need your insight here.
Hi, yesterday I hit a similar issue with another plugin, and the error was the same. In the end, at least in my case, it was caused by the version of serverless installed on the machine. You can update the version you have installed globally or install it as a devDependency in your project (but beware: if you just run sls deploy it will still use the global one; it is a different story if you run npm run deploy, pointing to a script in your package.json).
hope it helps