At onoranzefunebricloud.com we provide funeral agencies with digital solutions to do their best work. To build such complex tools we use a combination of AWS services, including dockerized applications on Fargate and a lot of serverless tools such as Lambda.
Over the years, as we introduced more functionality, we've also grown our usage of Lambda, and today we count over 100 Lambda functions in production.
In this post I want to share the lessons learned, the good and the bad.
I've organized the content in 4 sections:
- The good
- The bad
- Recommendations
- Conclusions
The good
Zero daily operations & no downtime: In 5 years we have never had downtime on the services built on Lambda. The product has been very reliable for us, and we are confident it will scale, when needed, without any manual operation on our side. Wait! Lambda doesn't scale indefinitely! There are soft and hard limits to keep in mind; for example, by default you can have up to 1,000 concurrent Lambda executions in a given AWS region at any given time. We have a predictable workload, which is why I have this confidence. Check out the Lambda quotas to learn more.
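A quick way to sanity-check whether your workload fits under that quota is Little's law: steady-state concurrency is roughly requests per second times average duration. Here is a back-of-the-envelope sketch; the traffic numbers are illustrative, not ours.

```javascript
// Back-of-the-envelope check: will a steady workload fit under the
// default regional quota of 1,000 concurrent executions?
// Little's law: concurrency ~= requests/second * average duration (seconds).
function estimateConcurrency(requestsPerSecond, avgDurationMs) {
  return requestsPerSecond * (avgDurationMs / 1000);
}

const DEFAULT_REGIONAL_QUOTA = 1000; // soft limit, can be raised via AWS support

// Illustrative numbers: 200 requests/second, 300 ms average duration.
const estimated = estimateConcurrency(200, 300);
console.log(estimated); // 60 concurrent executions
console.log(estimated < DEFAULT_REGIONAL_QUOTA); // true, plenty of headroom
```

If the estimate gets anywhere near the quota, that is your cue to request a limit increase or rethink the architecture before traffic forces the issue.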
💰 Affordable: For a long time we paid $0! Until our startup gained traction, our AWS bills were very low thanks to Lambda and the generous AWS free tier.
Tiny codebase ⚡️ Easy to contribute to: Having services separated into their own repositories, each with its own set of Lambdas, has in our experience made projects easier to contribute to. The learning curve is gentle because the codebase is tiny!
Separation of concerns: Deploying each service as an independent Lambda gives us decoupled components, which means individual contributors feel safe modifying and deploying a service without worrying about breaking another one. If something goes wrong, only an isolated service is affected, and it is easier to debug.
Versatile: Lambda is very flexible in terms of what you can do with it. In a recent blog post I even shared how I use a runtime with multiple programming languages.
The bad
Too many Lambdas: Having too many Lambdas to update can become a maintenance burden. For example, some of the first Lambdas we built back in 2017 were running on Node.js 14, which AWS has since deprecated. We were not planning to work on those, so that was unplanned work we had to undertake to avoid being blocked from making new changes to those Lambdas.
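When a runtime deprecation lands, the first step is taking inventory of which functions are affected. A minimal sketch of that triage, assuming you feed it function metadata shaped like the Lambda ListFunctions API response (here mocked with made-up function names; in practice you would get it from `aws lambda list-functions` or the AWS SDK):

```javascript
// Given function metadata (shaped like the Lambda ListFunctions API
// response), pick out the functions still on a deprecated runtime.
function functionsOnRuntime(functions, runtime) {
  return functions
    .filter((fn) => fn.Runtime === runtime)
    .map((fn) => fn.FunctionName);
}

// Illustrative inventory; real data comes from `aws lambda list-functions`.
const inventory = [
  { FunctionName: 'send-invoice', Runtime: 'nodejs14.x' },
  { FunctionName: 'resize-photo', Runtime: 'nodejs18.x' },
  { FunctionName: 'parse-xml-feed', Runtime: 'nodejs14.x' },
];

console.log(functionsOnRuntime(inventory, 'nodejs14.x'));
// → [ 'send-invoice', 'parse-xml-feed' ]
```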
Expensive if misused: Lambda is not a silver bullet; in fact, if misused it can lead to very expensive bills. Using it for high-load or long-running tasks will definitely drive up costs. Choose carefully how you use it.
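A rough cost model makes the point concrete. The sketch below uses approximate, region-dependent prices (always check the current AWS pricing page); the workloads are invented for illustration:

```javascript
// Rough monthly Lambda cost estimate. Prices are approximate and
// region-dependent -- always check the current AWS pricing page.
const PRICE_PER_GB_SECOND = 0.0000166667;
const PRICE_PER_MILLION_REQUESTS = 0.2;

function monthlyCostUSD({ invocationsPerMonth, avgDurationMs, memoryMb }) {
  const gbSeconds =
    invocationsPerMonth * (avgDurationMs / 1000) * (memoryMb / 1024);
  const computeCost = gbSeconds * PRICE_PER_GB_SECOND;
  const requestCost = (invocationsPerMonth / 1e6) * PRICE_PER_MILLION_REQUESTS;
  return computeCost + requestCost;
}

// Occasional glue work: well under a dollar a month.
console.log(monthlyCostUSD({ invocationsPerMonth: 100000, avgDurationMs: 200, memoryMb: 512 }));

// High-volume, long-running jobs: the bill grows into the thousands.
console.log(monthlyCostUSD({ invocationsPerMonth: 10000000, avgDurationMs: 30000, memoryMb: 1024 }));
```

The second scenario is exactly the kind of workload where a container on Fargate or EC2 is usually far cheaper.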
Configuration expert: Lambda has evolved a lot in the past 5 years. The initial concept of "take this code and run it" is slowly fading away as tons of configuration options have been added. You can still get started pretty quickly, but you can also get lost fine-tuning and configuring your function.
Recommendations
🎯 Choose with care
Do not default to Lambda for everything. This is a mistake I made at the beginning: building REST APIs on Lambda might not be the best decision, and as you grow and scale, you might regret it.
Lambda is great for gluing components together, building event-driven systems, and much more, but for long-running, compute-intensive processes it makes sense only in certain scenarios.
🔔 Get paged when it fails
Your code will fail, and you want to know when that happens. We use serverless-plugin-aws-alerts to instrument CloudWatch alarms and dashboards for our Lambdas, and OpsGenie to page us on Slack or on our phones.
You need to configure OpsGenie to use SNS; then the simplest approach is to alert on failures. For example, in serverless.yml:
```yaml
custom:
  # Lambda Alarms
  alerts:
    stages:
      - prod
    dashboards: false
    topics:
      alarm:
        topic: arn:aws:sns:${self:provider.region}:${aws:accountId}:opsgenie
    definitions:
      functionErrors:
        namespace: 'AWS/Lambda'
        metric: Errors
        threshold: 1
        statistic: Sum
        period: 60
        evaluationPeriods: 1
        datapointsToAlarm: 1
        comparisonOperator: GreaterThanOrEqualToThreshold
        treatMissingData: missing
    alarms:
      - functionErrors
```
This alert is quite aggressive: it will page you as soon as there's a single error. You can tune that to your preference.
🛠️ Debug locally
We use VSCode and the Serverless Framework to run and debug our code as if it were running in Lambda. Example .vscode/launch.json:
```json
{
  "version": "0.2.0",
  "configurations": [
    {
      "type": "node",
      "request": "launch",
      "console": "integratedTerminal",
      "name": "hello",
      "env": {
        "AWS_PROFILE": "<YOUR_PROFILE>",
        "SLS_DEBUG": "*"
      },
      "program": "${workspaceFolder}/node_modules/.bin/sls",
      "args": [
        "invoke",
        "local",
        "-s",
        "dev",
        "-f",
        "hello",
        "-p",
        "./examples/event.json"
      ]
    }
  ]
}
```
In ./examples/event.json you can store your test event, such as an API Gateway or CloudWatch payload.
Another approach could be OpenFaaS, but it's a tool I haven't had the opportunity to try yet.
📦 Keep it small!
We use serverless-esbuild to reduce the bundle size; it makes deployments faster and cold starts shorter.
📚 Stay up to date
You will find up-to-date knowledge very handy the next time you're making an important decision, for example whether to use Lambda or another AWS service. I recommend following Yan's work on Twitter and his website theburningmonk.com.
Another valuable resource is AWS This Week by the folks at ACloudGuru.
Conclusions
Lambda is a great tool if leveraged correctly; at the same time, it's not a silver bullet.
I suggest you listen to some podcasts and read a few more blog posts if the concept is still blurry.
Getting started is fairly simple, especially with the Serverless Framework.
What are you waiting for? Give it a try.
Cover credits: https://unsplash.com/photos/Q1p7bh3SHj8
Top comments (6)
Hey Andrea, thanks for the callout!
On the "Too many Lambdas" point, I think it's worth pointing out that you're not updating individual Lambda functions. The thing you're updating is the number of projects (the actual deployment unit from a developer's point of view), which for people using the Serverless framework is the number of projects, each with one serverless.yml file (and specifically, one line in that yml).
This is probably no different from building your system with containers: if you need to update from Node 14 to Node 16 across your entire system, you'd have to touch the same number of Dockerfiles.
I think what is actually different with Lambda is that the update is enforced upon you externally. But it's also worth remembering that the deprecation process is phased, and your function will continue to work even if you don't update; you just can't modify the function until you do. I would argue that this external force is a good thing in the long run. How often do you see teams put off important updates (that often impact security and performance) for years when it's left to their convenience? The truth is, there is never a convenient time for these updates. And as much as it sucks to have unplanned work added to your sprint, you generally have months to update your functions once a deprecation notice for your runtime is issued, because of the phased rollout.
Hey @theburningmonk! Thanks for reading and adding value to the comments!!
I agree with you, there is never a good time for these updates and in the long run they are good to have.
Some of us (including me) work on the project in our spare time, so you can imagine how hard it was to find time for these updates, considering everything else we have to do.
We had an npm dependency for parsing XML which was super old and wouldn't work with Node.js 16!
I am thinking now that perhaps it would have been easier to use a custom runtime with a Dockerfile and stay on Node.js 14, at least to be able to set our own deadline :)
You know you could just leave it be, right?
They stop you from being able to update the function, but it'll continue to work past the last deprecation date. So if it's a side project and you don't need to keep it up to date, that is always an option. Just delay until you absolutely have to do it. Who knows, maybe your dependency will have been updated by then!
I do, but as you said, we wouldn't be able to update the function, and despite this being a side project it has many paying users, and this specific service was on the critical path!
And we didn't want to be in the position of not being able to ship a hotfix.
I see. Are there no other packages you can use to parse XML?
Seems like a big burden to create a custom runtime just for that... not to mention you're sacrificing security, because you won't be getting Node.js runtime updates and you're responsible for a new layer of security.
At the time of the deprecation deadline there was no "drop-in replacement" library that wouldn't require some refactoring.
The XML ecosystem for Node.js is very outdated; the libraries are often wrappers/bindings around C++ libraries and not very well maintained (no surprise, considering XML is not the cool kid anymore).
Regarding the custom runtime, I don't advise anyone to do it, but it's an option, and one that comes with risk, as you highlighted!
At the time we didn't have any developers on the team besides the founders, and you have to keep scaling the business at the same time, so you might be willing to accept a bit of risk or tech debt!