In the serverless community, individuals and teams spend a lot of time and effort attempting to build a local environment that is a replica of the cloud. Why? Because this is what we have always done. When we start our careers building applications for the web, we are told we need a local development environment on our own machine, and that we should do our work against that environment before pushing to the code repository.
But I am going to argue that this supposedly absolute requirement for getting up and running is not only unnecessary in the serverless world but actually harmful.
Let's start by considering the whys. Why do we create local development environments in the first place? What purpose do they actually serve?
If you look back at where we have come from building for the web, we used to exist in a world where our code and scripts were exceedingly minimal, and work was essentially done directly on the machines that served our application to the web. Why? Because these machines were often very specialised ones that were impossible to replicate without great expense, and aiming for 100% uptime was not necessarily the biggest goal at that stage, so why not? It's easy to just edit files directly on that remote machine.
Fast-forward a few years and we are now in a position where we need to make changes multiple times a day to an application that must not go down if we can avoid it. Editing directly on production becomes scary because we would really like to test those changes first.
Luckily, at this stage, a lot of the infrastructure for the web has become commoditised; we can use a regular consumer computer and install the same (or similar) applications on it to simulate the remote environment and test our application before pushing to the production server.
However, things couldn't stay this way. Traffic increased, and a single machine was soon no longer enough to handle the load that the growth of the Internet created. Clusters of machines were needed, with comparatively complex architectures, to increase both request throughput and resilience to failure as downtime became more and more costly. No longer was the replicated development environment on a developer's machine a pretty-close replica.
This is where a lot of staging or development environments come from. The thinking is: let developers develop on their local machines as they always have, because that's what they are used to, and we will spin up as close a replica of production as we can to test against and make sure nothing breaks, even if it's costly to the business, because that's better than downtime.
The cloud certainly helped a lot here as well; if you can create staging environments on demand and only put them up when needed, it's not quite as expensive as keeping a development cluster running in parallel in a server rack.
However, the issue is that our local machines were, at best, only occasionally accurate to the production cluster, and developers usually had to push code constantly to the shared staging server for testing because the architectures were just too complex to ever hope to replicate locally, which made any kind of local testing redundant. Not to mention that, in teams, this resulted in a lot of stepping on toes and waiting for your turn to test your changes!
What was really needed was a replica of production for every developer on the team. But with production clusters running multiple virtual machines, load balancers, relational databases, caches, etc., this is cost-prohibitive.
Then containers arrived. Finally! Now we can package up the complexity of our production systems into neat little blocks that don't interfere with each other and we can get closer to production by running them on our own development machines.
Except they do interfere with each other, and they add huge amounts of complexity for developers to handle and worry about. Expensive engineers should be building features and generating revenue instead of managing their development environment, and it STILL wasn't as accurate a representation of the production environment as it should be!
At one point, I was an engineer at an e-commerce organisation, and they siloed a single developer off for two months to replicate production as a collection of Docker containers we could just install on our machines. The end result was a process that took 30 minutes just to install and required the entire development team to have their hardware upgraded to at least 16 GB of RAM. Running Nginx, ElasticSearch, Redis and MySQL on a single machine apparently uses a lot of memory; who would have thought? And we STILL had constant issues where we thought our code was ready to be tested against the staging environment and it just wasn't.
This is just one example of many I have to share.
The TL;DR of the above? We adopted local testing because testing against production became too dangerous, tried to replicate production locally and failed miserably, and have ended up today, essentially, still testing against production.
And now, in the world of serverless development, here we are once again, trying to make things run locally that really shouldn't. And this isn't a collection of virtual machines or Docker containers we can kind of get to run locally with some semblance of accuracy. These are cloud services, most of which have no official way to run locally and probably never will. The emulation techniques used in tools like Localstack are impressive, but they are not an exact replica of the cloud; they are someone's best effort to let us kind of, sort of test these services locally against something resembling the cloud version. Not to mention all the aspects of the cloud (and of distributed application architectures) that can throw a spanner in the works: how do you replicate intra-service latencies, IAM, service limits and so many other aspects of the cloud that aren't tied to a specific service?
We also don't even need to! Tools like the Serverless Framework (I know there are others; I just have not used them with the same level of familiarity) give you the ability to deploy the exact same configuration of resources you deploy into production into any other environment you choose. Want a shared environment for the developers on the team to test against? Just run the deploy command! Want your own "local" environment to test against? Just run the deploy command!
Finally! We are in a position where we can 100% replicate the infrastructure in production and, because serverless applications bill for usage, it costs you nothing to deploy them and pennies to run tests against them!
So why are we still fighting so hard to maintain the local environment? Probably because of a feared loss of productivity. To answer this, I am going to point to a recently published post by a colleague of mine at Serverless, Inc., who wrote up a great way to look at "local" development for serverless and the very few tools you need to accomplish it. Check it out here. The amount of time spent managing a local development environment, updating it and making sure it keeps running is costly in itself. But there is another good reason not to consider it!
It's actually bad for your application!
Consider a group of developers using an emulation tool like Localstack. It does an OK job of allowing the developers on the team to build and test their serverless applications locally. However, one of the members of the team spots a really useful cloud service that could be used to build the best possible solution to a problem they are trying to solve. It can improve the reliability of the application as a whole and decrease costs and time to production. However, this service is not (yet) provided by the local emulation tool.
They now have three choices. Use the service anyway, meaning that testing in the cloud is now an absolute requirement; the application is better for it, but this makes the local testing environment largely irrelevant. Or don't use the service, essentially hamstringing the efficacy of the application because the local testing environment is sacrosanct. Or, lastly, spend days or maybe even weeks trying to find a way to replicate this service locally, delaying the feature and still having a sub-standard replica of a cloud service to test against, assuming you find a workable solution to begin with.
What about tools like serverless-offline? Nice and simple, and it lets you easily test your HTTP endpoints, right?
Well, besides the fact that, yet again, this is not an accurate representation of the cloud and completely ignores the oddities of services such as API Gateway, IAM, etc., it is also only good for HTTP events. More and more we see serverless applications doing more than just being glorified REST APIs, and you cannot test all the other events that can trigger your Lambda functions.
Local development seems, at face value, to be efficient and simple. It is a necessary evil in the traditional web development world because traditional architectures are too costly and unwieldy to replicate exactly for every developer on a team. But serverless architectures cost nothing to deploy, cost minimal amounts (often nothing) to run tests against, and can be exact replicas of production when deployed into the cloud.
Just because it is familiar doesn't mean it's a good idea. With tools like the Serverless Framework and others offering the ability to deploy code changes in mere seconds, invoke functions on the remote Lambda directly from your local machine and even tail the logs in your terminal for instant feedback on errors, you do not need to lose productivity, but you can drastically decrease complexity and increase fidelity to production.
If anyone has any questions, sound off in the comments or even hit me up on Twitter. My DMs are open and I love discussing serverless topics!
Top comments (65)
Personally, flying blind is not my favorite use of time.
You can still do dirty (untested, blind) deploys with arc.codes if you like, but it is a much better idea to test your own logic locally before piling on the complexity of many distributed systems.
The only "anti-pattern" you outline here is using anything other than serverless framework which feels like a bad faith argument.
I am not suggesting you "pile on the complexity". You can still test just your code while it is in Lambda by sending it test data using something like sls invoke or another tool. It's also less blind, because you ARE seeing it interact with all the other issues that can crop up in a remote, ephemeral environment such as AWS Lambda, which is nothing like a local machine.
You need local development to speed up the feedback loop.
Once you've gotten all of the simple bugs you wrote out of the way then you can start fixing the bugs you didn't write.
No one but you has ever said it's a bad idea to test your code before deploying because that is just a best practice that is beyond rebuke.
Similarly, no one has ever said it's a bad idea to do integration tests against the actual services.
It isn't an anti-pattern to run emulations to save time, but it is an anti-pattern to try to confuse developers with marketing in an attempt to paper over a product's shortcomings.
But what if the feedback loop is still fast without local? Using tools such as the Serverless Framework, which allows you to deploy code changes to AWS in 3 seconds or less with the
serverless deploy function
command, means your feedback loop is still blindingly quick AND you are still testing on the actual, 100% production-equivalent infrastructure. This is also not marketing in any way. I am not sure what you believe I am marketing. This is my personal blog post, written by me after 20 years in web development and 5 years building serverless applications, 2 of those years happening to be at Serverless, Inc.
I am sorry you feel so offended by my personal opinion, but I stand by everything I said; we started doing development locally because having every developer develop against a 100% equivalent of production was, in the past, cost-prohibitive. We have now gotten used to local even though, in my opinion, it's no longer needed; replicating a production serverless app is free.
I would appreciate less of the ad hominem attacks please.
It seems that unit tests are not important to you.
What kind of tests are you talking about?
Tests that target databases (production or not) are slower than mocking.
If you unit test, you can test your business logic regardless of what infrastructure you are using.
Testing on a deployed staging environment is still needed for integration testing, but imho this is a step to take after unit testing.
Building and testing locally (unit testing with NO connection to infrastructure: HTTP, DB, filesystem, etc.) is much, much faster than deploying to any service, and it also gives you almost instant feedback and lets you navigate the code instantly in the IDE.
Sorry I don't agree with this post.
Personally, I am not a fan of unit testing in a serverless environment, and I wrote a blog post 2 years ago about how to do it too ... how times change. In a serverless application, the amount of code you write is minimal compared to a traditional web application, as the cloud services you use end up replacing a lot of the code for you. This is a good thing. And in that case, integration testing is far more important than unit testing 10 lines of code that insert an object into a database.
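To make that concrete, an integration test against the actual deployed resources can be tiny. A rough sketch (the table name, region and item shape here are hypothetical, and it assumes the stack has already been deployed to a test stage):

  // integration.test.js - runs against the deployed DynamoDB table, no mocks
  const AWS = require('aws-sdk')

  const dynamodb = new AWS.DynamoDB.DocumentClient({ region: 'us-east-1' })
  const TABLE_NAME = process.env.CUSTOMERS_TABLE // e.g. customers-test

  test('writes a customer to the real table', async () => {
    const customer = { id: 'test-123', name: 'Jane Doe' }

    // exercise the same call production makes: a real PutItem against the deployed table
    await dynamodb.put({ TableName: TABLE_NAME, Item: customer }).promise()

    // read it back to confirm the integration actually works
    const result = await dynamodb.get({ TableName: TABLE_NAME, Key: { id: 'test-123' } }).promise()
    expect(result.Item.name).toBe('Jane Doe')
  })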
As Yan Cui recently said in a reply to one of my tweets: "Speed of feedback is great, but only when it gives you the right feedback. e.g. if you're mocking AWS SDK and supposedly testing integration with DynamoDB then the test just tells you if your mock is working.
Learning the wrong thing faster is counter-productive."
The same is true of testing on your Intel i7 with 8 GB of RAM and a 4K display. That is not the equivalent of a highly distributed application running across multiple machines in potentially multiple data centers, etc.
I can see your point.
But the amount of code and complexity always grows. Unit tests help you test your logic in isolation; it doesn't really matter if you're using a potato or a Xeon processor, it should be fast since the code being run should, in theory, not have many dependencies, and you mock each class's dependencies. You don't test your mocks here; you test your class in isolation from the things outside of it, writing different mocks to represent different cases.
This not only allows for writing tests in and of itself, but also gives you a good, scalable, easy-to-maintain codebase, since you must use dependency inversion principles. TDD becomes possible, CI/CD is more reliable, etc.
Aaanyhow, this is deviating a bit I think from the post's argument.
In a serverless application, the logic is often part of the infrastructure. As an example, say you have two services; let's call them a customer service and an order service. If the customer service receives a request to update customer details, all unfulfilled orders for that customer need to have the customer details updated as well. In that case you use API Gateway to receive the initial PUT request to update customer data, and the change is made in DynamoDB by a Lambda. That insertion creates a DynamoDB stream entry which triggers another Lambda function, which pushes the DynamoDB action into EventBridge. The order service has a Lambda listening for customer change events that queries the orders table for all orders for that customer and performs the update. No unit test can test that entire flow, and each Lambda is perhaps 8-10 lines long.
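To give a rough sense of scale, the order service's piece of that flow might be little more than this (a hypothetical sketch; the table name, index and event shape are assumptions for illustration, not code from a real project):

  // order-service: Lambda listening for customer-updated events from EventBridge
  const AWS = require('aws-sdk')
  const dynamodb = new AWS.DynamoDB.DocumentClient()

  module.exports.onCustomerUpdated = async (event) => {
    const customer = event.detail // the updated customer pushed onto the event bus

    // find all unfulfilled orders for this customer (the index name is made up)
    const orders = await dynamodb.query({
      TableName: process.env.ORDERS_TABLE,
      IndexName: 'customerId-index',
      KeyConditionExpression: 'customerId = :id',
      FilterExpression: 'fulfilled = :no',
      ExpressionAttributeValues: { ':id': customer.id, ':no': false },
    }).promise()

    // copy the new customer details onto each open order
    await Promise.all(orders.Items.map((order) =>
      dynamodb.update({
        TableName: process.env.ORDERS_TABLE,
        Key: { orderId: order.orderId },
        UpdateExpression: 'SET customerDetails = :details',
        ExpressionAttributeValues: { ':details': customer },
      }).promise()
    ))
  }

The interesting behaviour lives in the wiring between API Gateway, DynamoDB streams and EventBridge, which is exactly the part a local unit test never touches.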
You're right that non-unit tests are capable of testing that entire flow, but again, that's not even a unit test.
Unit tests in this case would be something like this:
On the order service you'll test the method 'onCustomerUpdated(customer)'
There's a bunch of things you can test depending on the use case.
For example, if you support an in-memory cache as well as DynamoDB, you can unit test that when something calls the onCustomerUpdated method (in production the caller would be the infrastructure itself), the cache is updated as well as the DB, using abstraction in this method.
Example using pseudocode-js:

  const cache = new InMemoryRepository()
  const db = new DynamoDBRepository()

  function onCustomerUpdated(cust) {
    // update the database first, then the cache, then report success
    return db.update(cust)
      .then(() => cache.update(cust))
      .then(() => 'success')
  }
In a unit test environment, when you create that class to test it, you would replace the db and cache variables with mocks, and you can verify whether they were called in order, with what data, and so on.
Sorry if there's anything badly written, I'm on my phone rn
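For example, a unit test for that flow might look roughly like this (a hypothetical sketch assuming a Jest-style runner and a factory that takes its dependencies, so the test can swap in mocks):

  // a hypothetical factory so the db and cache dependencies can be injected
  const makeCustomerHandler = ({ db, cache }) => ({
    onCustomerUpdated: (cust) =>
      db.update(cust).then(() => cache.update(cust)).then(() => 'success'),
  })

  test('updates the database and then the cache', async () => {
    const db = { update: jest.fn().mockResolvedValue(undefined) }
    const cache = { update: jest.fn().mockResolvedValue(undefined) }
    const handler = makeCustomerHandler({ db, cache })
    const customer = { id: '42', name: 'Jane Doe' }

    const result = await handler.onCustomerUpdated(customer)

    // both dependencies were called with the right data - no real DynamoDB needed
    expect(db.update).toHaveBeenCalledWith(customer)
    expect(cache.update).toHaveBeenCalledWith(customer)
    expect(result).toBe('success')
  })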
You can imagine that if those services belong to different teams, each team would like to be sure that their code works as they intend regardless of how it's called. In some cases that method would be called via a REST API from a client; in others the caller would be the infrastructure itself.
Abstraction and independence are key here I think
Side comment: I do use serverless and I absolutely love it
In fact, we develop and test locally precisely because the tools in the cloud are not yet perfect. Especially for special scenarios like debugging.
So I think local development is going to be the dominant approach for a long time.
Yeah, debugging Lambda functions can be really painful. It's one of the reasons why we created SST (github.com/serverless-stack/server...). It hot-reloads your functions while testing against the resources that've been deployed to AWS. This allows you to set breakpoints in VS Code. Here's a short clip of it in action: youtube.com/watch?v=2w4A06IsBlU
There is a lot more to my blog post than just that. You may be able to execute code locally, but that doesn't make it accurate compared to the environment it will eventually run in. We have had this problem for years; I have never had a local development environment, even 10 years ago, that I relied upon to give me accurate results. I always had to test remotely to be sure. With the ease of Serverless deployments, and them being 100% accurate to production, that issue no longer exists ... if I test in the cloud instead.
The ecosystem is not there yet. All the dev tools: runtime profiling, debug mode, discoverable dependencies and "browsability".
At large scale, because Lambda encourages a large code base to be spread out, it would be a real cost to deploy a fully operating dev environment (with profiling, verbose logging, etc.) for every branch of every dev.
What would be interesting is to actually connect the "local env" to the remote one, allowing Lambda hot swap and hot reload. Should it be "coding in the cloud"? Probably; as connectivity becomes better and better we wouldn't face the issue we had before of excluding devs who don't have access to stable, high-speed connectivity.
As we are all targeting the shortest feedback cycle ever, aiming for 15 minutes max (e.g. GitHub, Netflix, Honeycomb), this is a real challenge for serverless.
If you are looking for the Lambda reload approach, check out SST (github.com/serverless-stack/server...). It deploys your entire stack to AWS but hot-reloads your Lambda functions, so you get a tight feedback loop and the logs show in your terminal. This approach also allows you to set breakpoints in VS Code. Here's a short clip of it in action: youtube.com/watch?v=2w4A06IsBlU
Amazing :) I will check this! thank you
Interestingly, because Lambda functions are so ephemeral and self-contained, the impact of a single bad thread on the environment as a whole is entirely removed, meaning complex runtime profiling isn't strictly necessary. There are also tools that let you profile the execution of Lambda functions as they run in the cloud.
Debug mode, imo, is overrated when you can stream logs directly from a Lambda function running in the same environment as your production Lambda functions and invoke it from your local machine with a single command.
This is no less possible with serverless deploying directly into the cloud. My IDE still shows me my dependencies etc. locally.
I am not sure what this means, but I assume you mean the ability for a developer to understand the application for the first time. I would say that because serverless architectures tend to reduce the amount of code written, understanding by other developers becomes easier, whether it runs locally or remotely. And you can "browse" a remote deployment just as easily as a locally emulated one; the remote just doesn't have the possibility of failures due to unforeseen differences between local and remote.
serverless deploy --stage gareth
I have just deployed my own personal version of the stack to my cloud provider to play and experiment with. It likely costs nothing at deployment time and probably nothing the entire time I am testing, thanks to the AWS free tier. Do that 1000 times, as I have done; you should see the number of experimental projects I have deployed to AWS over the years, and my last bill was $5 because I happened to have an EC2 instance running for a little while I was playing with it.
You have legit points, and the opinion you express in the article is totally valid.
Just to clarify, I do run Lambda in prod for 2 projects, and I did build serverless with API Gateway as soon as AWS released them, one project at medium scale (team & load). I totally value the benefits of serverless.
Yet there is a wide reality:
For instance, remote developers relying on bad connectivity (countryside, far away from quality 4G or a landline).
We are nowhere close to a hot-reload, instant feedback cycle. I do understand that you have to develop practices & workflows, and that serverless isn't to blame for the lack of mastery of TDD, for example. But a lot of "bootcamp 3-weeks to pro dev" schools or "3 years bachelors to fullstack web & native dev" programmes do rely on "console.log"-driven dev.
At the end of the day, intuitive programming with "WYSIWYG" is still widespread among junior to medium-level developers.
The change management to move from this legacy practice to the new one is long and costly. So when I say "the ecosystem is not there yet", it has this paradigm of operational cost & time to value as the main factor.
I really do enjoy serverless: the possibilities, the new paradigm. I was a prime user of managed services 15 years ago and still advocate them to people trying to run their own xxSQL at home, messing about with 24-hour "backups". Yet serverless imposes a (too) big step for devs, and we just need to make it affordable. Meet us halfway :)
Great thoughts @garethmcc, totally agree.
But we as a community should start figuring out how to do "hot reloading" in the cloud.
"...added huge amounts of complexity for developers to have to handle and worry about" - no. Containers encapsulate the things an application needs to run. If as a developer knowing and working with the stuff that makes your work run means added complexity and stuff you can't handle, good luck in the future.
"...Running Nginx, ElasticSearch, Redis and MySQL on a single machine apparently uses a lot of memory" - somewhat true. Nginx needs only 64Mb of RAM to run, same for Redis. MySQL container needs 384Mb as a bare minimum and 512Mb makes a decent environment. ES needs 1Gb and 1 full CPU though. That's 2Gb right there, which means you may want to allocate about 3Gb to that particular Docker Desktop.
"...they siloed a single developer off for two months to replicate production as a collection of docker containers" - totally the wrong approach. The whole point of using containers is that they carry the environment. If you replicate production instead of replacing production with containers, then you're not aiming for an actual benefit. You should be running those containers (at least on application level) in production. External dependencies might vary as long as the underlying platform is similar but if you're not running the application containers in production then all you're getting is a way to share environment between devs. It's nice but maybe not worth this particular effort.
On the pure serverless side, I agree. Local solutions just aren't there. But it doesn't mean they're not worth investing effort in. Traveling contractors are still a thing. I myself need to develop every other day on train, plenty of spots without connectivity along the way. At home it's reliable but the occasional outage always comes to disrupt my state of flow or when there's an important task to do.
Analogy time. A business is like a home buyer. If I am the average home buyer, I couldn't care less what techniques were used in the construction of the house I am buying; whether they were the latest and greatest in modern marvels or hand-crafted by a neanderthal, I want a home that is well appointed, stays up and keeps me sheltered. In the same way, a business couldn't care less HOW the developers build the solution, just that it's done as fast, as cheaply and as reliably as possible, ideally with a good turnaround time for adding new features. Developers, then, shouldn't have to learn how to do the plumbing and electrics if they don't need to. It's about solving the problem, not trying to play with the latest tech.
Right now my RAM usage is 0, unless you count my IDE, which I didn't in the original example so I won't here. I can build (and have built) serverless applications on my Raspberry Pi Model B+ from 2014.
The attempted benefit was to replicate production. Production was spread across 17 different virtual machines using multiple layers of caching and load balancing. My point was ... you cannot replicate production this way. You can with Serverless.
serverless deploy --stage mynewstackname
and production is replicated. My point was that local solutions have NEVER been "there". We have, over the years, required local development environments as a best-effort emulation of the production environment because ACTUALLY replicating production was, in the past, far too costly and time-consuming. Serverless changes that entirely. You can have an EXACT replica of production up in a few seconds.
Local testing began as a necessary evil because all other alternatives were untenable. Local testing has now become this sacrosanct feature that all developers are taught is an absolute requirement for you to ever want to work in the field. For traditional development, we are, unfortunately, stuck with an inaccurate representation of production we need to test against locally. In the serverless world, an exact replica of production is a single command away.
" Its about solving the problem not trying to play with the latest tech". True, but that doesn't mean you shouldn't know what it takes to run your application.
I've encountered plenty of React developers who had no idea about the differences between running a development environment via "yarn start" and running a static build via Nginx. Or PHP developers who have no idea about the impact of various PHP configurations.
If you don't go ahead and choose how your application runs, then the choice will be made for you and you might not like the outcome. Real-life example: developers working on a Node.js microservice, having no idea what tracing is, how to instrument their own application or how to customise logs.
This has nothing to do with playing with the latest tech. This has everything to do with knowing your tools and running your application. And if the optimised environment travels with the application, then it's all the better.
That only strengthens my point. If a React developer or PHP developer was testing against an exact, 100% replica of what the production environment looked like, they wouldn't be worried that "the choice will be made for you and you might not like the outcome". They are testing against those decisions from day one!
Serverless allows you, as a developer, to know EXACTLY how what you are building is going to operate in the cloud from the moment you start if you deploy to and test in the cloud.
Tracing and instrumentation? I shouldn't need to worry about that stuff! Let it be auto-instrumented for me, which it is in a serverless application.
My tools are the services I consume in the cloud and the code I write. My tools are not the OS, application software and the myriad of potential container management options out there. As a developer building solutions for a business, I need to concern myself with output and features, not the minutiae of implementation details. That's where serverless excels, and the point of the article is to show other serverless developers that they are potentially missing an opportunity by not just developing against a deployed replica of production.
Early on in serverless days, deploys took ~10 minutes for a small change. Remember those days? That's one reason why developers wanted to work locally.
If you tweak AWS a little bit you can deploy in under a second without the sls framework: aws-blog.de/2021/04/cdk-lambda-dep...
True, but I did say "early on in serverless"
If you are using CDK, try out SST. It hot reloads your Lambda functions, so won't even need to deploy them.
github.com/serverless-stack/server...
I love that you also use a nice task runner like Go Task. :-) That goes above and beyond most blog articles. It's basically a fully featured solution out of the box to try! Nice work.
Hey Winston. Times change. You can now deploy a code change in 3 seconds or less. You actually could back then too, when we were hacking on things together, but sometimes the little tricks elude you. serverless.com/framework/docs/prov...
For some things it can make sense. But how do you do basic debugging, like setting a breakpoint?
If you're looking to set breakpoints and debug Lambda functions locally, check out SST: github.com/serverless-stack/server...
It connects directly to what's been deployed on AWS without mocking or emulating them.
Oh now that looks exciting. Thanks :)
Check out lightrun.com in that case... Get literally the experience of a local debugger in a production environment (I work there).
My code is written to capture errors and log out useful error messages if an issue occurs. If I am testing against an exact replica of production, I may as well make my error handling as expressive as it would be in production. You can't set a breakpoint on production, and you still need to debug if an issue occurs.
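As a rough idea of what I mean, that error handling can be as simple as a wrapper around the handler (a minimal sketch; the wrapper and field names are just illustrative, not code from the post):

  // wrap a handler so failures always produce a structured, searchable log entry
  const withErrorLogging = (handler) => async (event, context) => {
    try {
      return await handler(event, context)
    } catch (err) {
      // this shows up immediately when tailing the deployed function's logs
      console.error(JSON.stringify({
        message: err.message,
        requestId: context.awsRequestId,
        event,
      }))
      throw err
    }
  }

  module.exports.handler = withErrorLogging(async (event) => {
    // real business logic goes here
    return { statusCode: 200, body: 'ok' }
  })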
I mean, we have incredible IDEs and debugging tools available. It'd be a huge loss to just not use them for anything at all anymore.
Why not just spin up duplicate cloud resources for development, and connect to those for local development? It solves the issue of mocking the cloud resource, without sacrificing any of the development tooling.
Maybe I misunderstood your article, but people aren't really running MySQL or Redis locally, are they? It's just as easy to have pared-down dev versions running in AWS. I use scheduled batch scripts to turn them on/off each day, so they don't cost anything when nobody would be using them, and they are fully up and running when needed (so there's no waiting for them to spin up).
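For the on/off part, a sketch of the kind of scheduled job I mean (assuming an RDS dev instance; the instance identifier is made up, and a mirror function calls startDBInstance each morning):

  // runs on a cron schedule each evening to stop the dev database
  const AWS = require('aws-sdk')
  const rds = new AWS.RDS()

  module.exports.stopDevDatabase = async () => {
    await rds.stopDBInstance({ DBInstanceIdentifier: 'my-dev-db' }).promise()
  }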
The only really challenging thing is event-driven work. There are close to no examples of what an SQS payload looks like when triggered from S3 (for example), so creating a dummy Lambda just to log the message, so you can mock it locally, seems to be the only way to start a project that consumes those events. But running the whole project in the cloud wouldn't solve that issue either.
I don't see it as an anti-pattern, but with a bridge pattern in between, you will be able to test your stuff without any cloud solution behind it, which makes local development a snap and debugging as easy as normal.
Thanks for the feedback. I see it as being about more than just making developers' lives easier. I see trying to execute code locally, even without the cloud services attached (I even wrote a blog post about setting this up two years ago), as inherently dangerous, since it means developers are building for what "works on their machine" instead of directly against the 100% equivalent of the production infrastructure. And with the existing tooling that's been around for 5 years now and is built into the framework by default, it's not even necessary.
I know we've come a long way, but when I have to debug a Lambda by watching log files, it feels like 1980 all over again.
I understand that sentiment. Personally, I prefer to use whichever method is the most accurate. The ability to tail logs and see debug output is about as rewarding as using a debugger; it's just different. Having an inline debugger would be great, and that may come, but until then I'd prefer to sacrifice what I am used to.
It's possible now using SST; see the recent comments from Jay in this thread.
I would not say local deployment is an antipattern. But when you use proper unit tests locally you need it only in special cases.
Each developer should find her own fit.
I myself really like unit tests, try to avoid local emulation of Lambda, and then run integration tests in the cloud.
Especially with AWS Lambda, you will not catch the IAM access-denied errors when testing locally...
Sounds like web development has built itself up a mountain of technical debt that needs to be addressed. If you can't test a piece of software locally, then that's a problem with the system, not with the concept of local testing.