Lessons Learned — A Year Of Going “Fully Serverless” In Production
At Torii, we decided to go the no-ops path as much as possible, meaning we’ll focus all our efforts on our product and not on operations. While we enjoy doing devops, that’s not our main focus as a company.
Serverless, because we like to sleep at night.
We can break our application into three parts:
**Static websites.** These are frontend websites, written in React and statically generated at build time.
**Background jobs.** These are jobs that are scheduled or triggered by events such as file uploads, webhooks or any other asynchronous event.
**API Server.** A REST API server interacting with our databases and serving all client requests.
Lessons Learned
#1. Static websites
Static websites are fast, easy to scale and simple to distribute. We use React to build our frontend, and the code is packaged as a simple HTML/JS/resources bundle ready for distribution.
We use Netlify to host these static assets on a CDN and get fast loading times from anywhere in the world.
No Nginx/Apache servers to configure here 👍
#2. API Server on Serverless
The basic idea is that an API Server is a function: the input is an HTTP request and the output is an HTTP response. It’s perfect for FaaS (Function as a Service) where each HTTP request gets its own server instance handling it.
This setup leads to automatic scalability and high availability, and reduces costs dramatically. It also makes things simpler, since there are fewer moving parts: no servers, no load balancers, no auto-scaling groups. All of these are abstracted away, and all we care about is one function.
We take an entire Node.js app and package it as a single AWS Lambda function. An API Gateway routes all traffic to it and the Node.js app sees it as a regular HTTP request.
We picked apex/up for setting up the stack, updating it and deploying our functions. It’s really as simple as typing `up` in your terminal. It’s highly configurable, so you can customize the deployment for your needs, but if you have no special requirements, the defaults are good to go.
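For reference, a minimal up.json looks something like the following (the values are illustrative; check the up documentation for the full schema):

```json
{
  "name": "my-api",
  "profile": "my-aws-profile",
  "regions": ["us-east-1"],
  "lambda": {
    "memory": 512
  }
}
```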
Zero servers to provision, configure or apply security patches to 👏
#3. Packing for Serverless
Deploying a Lambda function comes with a roughly 50 MB limit on the zipped function package, including all of its dependencies. If you’ve coded a decent Node.js project recently, you know how easily node_modules can pass this limit. Note: there’s a way to deploy from S3 that bypasses this limitation; we haven’t tried that yet.
To mitigate this, we include only the required dependencies and trim their size by excluding unused files like READMEs, package changelogs, tests, documentation and examples. We published a package named lambdapack that helps with this. It packs your code with webpack so you can use the latest Node.js and JavaScript features, while keeping your node_modules as small as possible. lambdapack fully integrates with apex/up, so the build process is optimized and packed efficiently.
Read more about lambdapack on GitHub.
#4. Deployments
Deployments work amazingly well: each deployment creates a new version of the Lambda. AWS lets you keep multiple versions of each Lambda and point aliases at versions. Popular aliases include test, staging and production. So a new deployment means uploading a new version of the Lambda and pointing the production alias to it. Fortunately, up does this automatically with up deploy production. A rollback is just pointing the alias back at the required version.
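Under the hood this is plain Lambda versioning and aliasing; with the AWS CLI the same flow looks roughly like this (function and alias names, and the version numbers, are illustrative):

```shell
# Publish the freshly-uploaded code as an immutable version.
aws lambda publish-version --function-name my-api

# Point the production alias at the new version (say, 42).
aws lambda update-alias --function-name my-api \
  --name production --function-version 42

# Rolling back is just re-pointing the alias at the previous version.
aws lambda update-alias --function-name my-api \
  --name production --function-version 41
```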
#5. Local testing/development
Since we are using a regular Node.js server, running locally just means running your server as usual. However, this doesn’t mimic the AWS infrastructure, which differs in important ways: the enforced Node.js version, API Gateway timeouts, Lambda timeouts, communication with other AWS resources and more. Unfortunately, the best way to test is on the AWS infrastructure itself.
#6. Background jobs
For background jobs such as file processing or syncing with 3rd party APIs, we keep a set of dedicated Lambda functions that are not part of the API server. These jobs are scheduled to run by CloudWatch or as a response to events in our system.
Currently we use a “sibling” project to handle these background job Lambdas — using the open source apex/apex.
These functions only run when needed and there’s no need to keep servers up to process these jobs. Another win for the Serverless approach 🚀
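A scheduled job of this kind is just a standard Lambda handler; the sketch below shows the shape, with the sync logic as a hypothetical placeholder (scheduled CloudWatch events do carry an ISO timestamp in `event.time`):

```javascript
// Hypothetical placeholder for syncing with a 3rd-party API.
async function fetchAndSync() {
  return 0; // e.g. number of records synced
}

// CloudWatch invokes this handler on a schedule; there is no server
// sitting idle between runs.
async function syncJob(event) {
  const startedAt = (event && event.time) || new Date().toISOString();
  const synced = await fetchAndSync();
  return { startedAt, synced };
}

exports.handler = syncJob;
```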
#7. Logging
AWS services come with the built-in CloudWatch Logs service, which has an awful UI, UX and DX. While the up CLI has a logs feature for viewing them, there’s still much more to ask for: alerts, aggregated logs, etc.
Our first solution was to log directly from the API server to a 3rd-party logging service (we use Papertrail), but this kept the Lambda functions always up.
A better approach is to stream the Lambda logs into a dedicated Lambda that is responsible for sending them to the 3rd-party logging service. We used an updated version of cloudwatch-to-papertrail. I also suggest streaming the API Gateway logs to get the full picture.
#8. Environment variables and secrets
Don’t commit your secrets to source control. Now that we’ve got that out of the way, we need to store them encrypted somewhere. AWS has a solution for exactly this, called AWS Parameter Store. You add your parameters, choose whether to encrypt them, and choose who can read them. We allow our Lambda function to read these secrets as soon as it starts running. Since Lambda containers are reused, this happens only on the first invocation of the Lambda (the first API call). To set this up, we add the parameters with a hierarchy of /{env}/env_variable, for example /production/MYSQL_PASSWORD. Now we can read all /production parameters and use them as environment variables, or just store them in memory.
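A sketch of that startup step, assuming parameters are fetched with the SDK's `GetParametersByPath` API (e.g. `ssm.getParametersByPath({ Path: '/production', Recursive: true, WithDecryption: true })`); the fetch is injected here so the mapping and caching logic stand alone:

```javascript
// Map Parameter Store entries into env-style variables:
// '/production/MYSQL_PASSWORD' becomes MYSQL_PASSWORD.
function parametersToEnv(parameters, env = {}) {
  for (const param of parameters) {
    const name = param.Name.split('/').pop();
    env[name] = param.Value;
  }
  return env;
}

let loaded = false;

// Because Lambda containers are reused, loading once at cold start is
// enough; later invocations see the cached values on process.env.
function ensureEnv(fetchParameters) {
  if (loaded) return Promise.resolve(process.env);
  return fetchParameters().then((params) => {
    parametersToEnv(params, process.env);
    loaded = true;
    return process.env;
  });
}
```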
#9. Performance and Cold starts
When a Lambda hasn’t been invoked in a while, it is frozen, and the next invocation pays the cost of launching a new instance of the server. Depending on the complexity of the app, this can take anywhere from 600ms to 2,000ms. There’s currently no real solution other than (1) keeping the Lambda warm (periodically calling it using a monitoring service, or just another scheduled Lambda invocation using CloudWatch) and (2) making your Node.js app load faster. Hopefully, AWS will find a way to reduce the cold start time in the future.
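The warming trick can be a small wrapper: a CloudWatch scheduled rule pings the function with a marker payload, and the wrapper short-circuits those pings before they reach the real handler. The `{ warmup: true }` shape below is our own convention for illustration, not an AWS one:

```javascript
// Wrap a real handler so scheduled warmup pings return immediately,
// keeping the container hot without running business logic.
function withWarmup(realHandler) {
  return async (event, context) => {
    if (event && event.warmup === true) {
      return { warmed: true };
    }
    return realHandler(event, context);
  };
}

// Example: a trivial handler wrapped with warmup support.
const handler = withWarmup(async (event) => ({ echoed: event.value }));
```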
If your API server has to comply with an SLA, Serverless at this point might not be a great fit 😞
#10. No parallel requests
When building Node.js servers, we’re used to handling multiple requests concurrently with the help of the event loop and asynchronous functions. However, when run inside AWS Lambda, each Lambda container will only handle one request at a time.
This means that parallelism is achieved by the API Gateway spawning multiple Lambdas vs. one Node.js app serving multiple requests.
Test your app and use cases to see if this model fits.
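One upside of this model: since a container handles one request at a time, module-level state is never touched by two requests concurrently, which makes simple per-container caching safe without locks. A sketch, with `connect` as a hypothetical stand-in for opening a DB connection:

```javascript
// Per-container cache: only the cold start pays the connection cost;
// warm invocations in the same container reuse it.
let connection = null;

async function getConnection(connect) {
  if (!connection) {
    connection = await connect();
  }
  return connection;
}
```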
Conclusion
Is Serverless a step forward in the operations space? With devops we wanted to understand how our ops work, while with Serverless we benefit from delegating the responsibility for operations to someone else (in this case AWS), and we can call it no-ops. While we lose flexibility, we gain a lot of features, peace of mind and the ability to focus our energy on our code and product.
Serverless will surely take over in the next few years, including more specific serverless offerings like serverless databases, serverless streaming services and others.
For us developers, this is almost the holy grail. Build it, ship it, it works.
Originally posted at https://hackernoon.com/lessons-learned-a-year-of-going-fully-serverless-in-production-3d7e0d72213f
Top comments (4)
Hey Tal,
Great post, thanks for sharing!
We had the same timeframe (almost 1 year on AWS Lambda).
I agree that it is best to test apps on the AWS infra itself. You might want, though, a faster turnaround time on local dev using github.com/localstack/localstack
Hi Tal,
On #5 (testing and local envs), I heavily recommend these Docker images:
hub.docker.com/r/lambci/lambda/
I think they do a great job of replicating the Lambda functionality.
Great post!
Hello Tal,
I started working on a Node.js backend deployed on an EC2 instance, then suddenly I was asked to move to Lambda on AWS. I didn't want to refactor all the code again, so I did the same thing and everything has worked OK so far, although we are not in production yet. However, I started breaking the API up on an "entity or model basis", so in the end I will have 9 or 10 smaller Lambda functions proxied through the same API Gateway ... but in the end the thing is you can keep coding as usual; you can even test locally and then test on AWS, as you said ...