At re:Invent 2022, AWS gave a lot of attention to serverless. The main keynote by AWS CTO Dr. Werner Vogels was saturated with asynchronous and event-driven architecture, and the new announcements were closely related to these topics, e.g. the Step Functions and EventBridge improvements.
Apart from all these trendy announcements, AWS also showed that it invests across a broad range of technologies, Java among them. This is evidenced by one of the largest serverless announcements: SnapStart.
Since I have worked with Java from the beginning of my professional career, I was very excited about this announcement. I was aware that JVM-based languages have so far not been the main players for Lambda functions, mainly because of their long cold starts.
I haven't checked in a long time to see if anything has improved. So I decided that the release of SnapStart is a good time to evaluate cold starts for Java and its frameworks - Spring Boot, Quarkus, and Micronaut.
What is SnapStart?
Typically, AWS Lambda sets up a new execution environment each time a function is first invoked or when the function is scaled up to handle increased traffic. As you probably know, applications written in Java need some time to initialize and start up before accepting traffic. This is the nature of the JVM. SnapStart was created to address this issue.
When SnapStart is enabled, Lambda creates a snapshot of the initialized execution environment (memory and disk state) ahead of time and persists it in a cache for low-latency access. This eliminates the need for the function to spend time on initialization when an event arrives, as Lambda can quickly resume from the persisted snapshot instead.
I said "ahead of time" because snapshot creation happens when you publish a function version, and SnapStart works only for published versions of the Lambda function (you can't just use `$LATEST`).
Normally, the `Init` phase is the stage during which Lambda performs tasks such as preparing the runtime container, downloading the function code, and initializing it, before moving on to the next phase. The `Init` phase is limited to 10 seconds. When SnapStart is activated, the `Init` phase happens earlier: yes, when you publish a function version. In this case, the 10-second timeout doesn't apply, and snapshot initialization can take up to 15 minutes.
Someone more curious might ask how it is possible to create a snapshot of an initialized function. The answer is hidden behind a few magic terms.
First of all, CRaC (Coordinated Restore at Checkpoint), an open source OpenJDK project. It focuses on a Java API for saving and restoring the state of a JVM, including the currently running application, which is known as checkpointing. CRaC is based on the next key project, CRIU (Checkpoint/Restore in Userspace), which allows an application running on a Linux system to be paused and restarted at some later point in time, potentially on a different machine. The last key piece of the SnapStart puzzle is Firecracker and its microVMs. SnapStart uses micro virtual machine (microVM) snapshots to checkpoint and restore full applications. Interestingly, it turns out that Amazon engineers from the Firecracker and Corretto (the AWS JDK distribution) teams were involved in the CRaC project at an early stage.
This means that AWS has long since taken the first steps to address the cold start problem for Java. It confirms my thesis that AWS invests in breakthrough technologies and knows that Java and JVM are still important in the IT market.
But unfortunately, not everything is so beautiful. Because SnapStart operates on a memory dump, it and the methods on which it is based introduce some challenges:
- Randomness - all results of `java.util.Random` operations can be the same across execution environments restored from the same snapshot, so use `java.security.SecureRandom` instead, because Amazon handles it in Corretto. But if one of your dependencies uses the former, you will still be in trouble.
- Connection state - the state of connections that your function establishes during the initialization phase isn't guaranteed when Lambda resumes from a snapshot. In most cases, network connections that an AWS SDK establishes resume automatically, but other connections you need to handle on your own.
- Stale credentials - the created snapshot also caches things like injected secrets and passwords (of course the whole snapshot is encrypted). Passwords can be rotated automatically, but the snapshot can be used for a long period of time without knowing anything about the password change. The snapshot is immutable, so it will continue to use stale credentials. This applies not only to secrets: you need to protect yourself against this case for any frequently changing data that you pull from external sources into function memory. But don't worry, you have tools to handle it, e.g. a post-snapshot hook.
- Other Lambda features not supported with SnapStart - provisioned concurrency, the arm64 architecture, EFS, and ephemeral storage larger than 512 MB.
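The stale-state items in the list above share one remedy: reload anything that was captured at init time once the environment has been restored. Below is a minimal, dependency-free sketch of that idea. All type names (`RestoreHook`, `SecretCache`, `HookRegistry`, `RestoreDemo`) are hypothetical and exist only for this illustration. In a real SnapStart function the equivalent role is played by the CRaC `Resource` interface and its `afterRestore` method, which this sketch deliberately does not use so that it runs with the JDK alone.

```java
import java.security.SecureRandom;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical stand-in for a post-snapshot hook; not an AWS or CRaC type. */
interface RestoreHook {
    void afterRestore();
}

/** Caches a secret at init time and reloads it after a snapshot restore. */
class SecretCache implements RestoreHook {
    private final Supplier<String> loader; // assumed: a call to some external secret store
    private volatile String secret;

    SecretCache(Supplier<String> loader) {
        this.loader = loader;
        this.secret = loader.get(); // runs during init, so this value ends up in the snapshot
    }

    @Override
    public void afterRestore() {
        secret = loader.get(); // the snapshotted value may have been rotated since
    }

    String secret() {
        return secret;
    }
}

/** Simulates a restore by invoking every registered hook. */
class HookRegistry {
    private final List<RestoreHook> hooks = new ArrayList<>();

    void register(RestoreHook hook) {
        hooks.add(hook);
    }

    void simulateRestore() {
        hooks.forEach(RestoreHook::afterRestore);
    }
}

class RestoreDemo {
    /** Builds an id from SecureRandom at call time instead of from a generator seeded at init. */
    static String newRequestId() {
        byte[] bytes = new byte[8];
        new SecureRandom().nextBytes(bytes);
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] store = {"password-v1"}; // pretend external secret store
        SecretCache cache = new SecretCache(() -> store[0]);
        HookRegistry registry = new HookRegistry();
        registry.register(cache);

        store[0] = "password-v2";           // secret rotated after the snapshot was taken
        System.out.println(cache.secret()); // still the value captured at init
        registry.simulateRestore();
        System.out.println(cache.secret()); // value reloaded by the hook
        System.out.println(newRequestId());
    }
}
```

The same pattern would apply to connections opened during initialization: close or validate them in the hook rather than trusting the state captured in the snapshot.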
I'm curious whether SnapStart will remain reserved for the Java runtime or whether AWS is preparing a similar trick for other runtimes. Presumably, other runtimes can't use snapshotting in quite the same way as the JVM, and for some, e.g. Golang, it wouldn't make sense to even attempt it. But I would like to see a SnapStart kind of solution for Node, which can also have hiccups on cold start, especially with a large number of dependencies.
Spring Boot vs Quarkus vs Micronaut - introduction of competitors
Before we get to the main point, a brief introduction of the competitors. Overall, all 3 frameworks are similar in terms of functionality and are suitable for building web apps and microservices, but they have different design goals and trade-offs.
BTW, all the source code, as always, can be found on my GitHub.
Spring Boot
Spring Boot is the most widely adopted and well-established of the three frameworks. This Pivotal product also has the largest and most active community, and it has been around for over a decade. It provides a wide range of features and is highly configurable, making it a good choice for large and complex applications. However, Spring Boot is more resource-intensive than Quarkus and Micronaut. Spring Boot uses a traditional Just-in-Time (JIT) compilation approach, which can result in longer startup times compared to Quarkus and Micronaut, which use Ahead-of-Time (AOT) compilation. AOT compilation pre-compiles the code at build time, resulting in faster startup times and a smaller memory footprint. Also, runtime dependency injection adds some overhead and complexity to Spring-based workloads.
Quarkus
Quarkus is a relatively new framework that aims to provide the same functionality as Spring Boot, but with a smaller footprint and faster startup time. The project, initiated by Red Hat, was designed with native compilation on GraalVM in mind. It aims to be an effective platform for serverless, cloud, and Kubernetes environments. Quarkus uses Ahead-of-Time (AOT) compilation to reduce startup time and memory usage. The community strongly appreciates the speed and convenience of development, and more and more projects boast of migrating microservice workloads from Spring Boot to Quarkus. Finally, I can add that the documentation is really good.
Micronaut
Micronaut, like Quarkus, is a relatively new framework that has been gaining popularity in recent years. It has a very Spring-inspired programming model. It also uses Reactor (instead of Vert.x, which Quarkus uses). So if you are coming from the Spring world, in Micronaut you will find many similar patterns and techniques, e.g. Mono and Flux from Reactor Core. At the same time, Micronaut aims to avoid the downsides of Spring: it minimizes the use of reflection and proxies and doesn't use runtime bytecode generation. The source I found says that performance is a tiny bit better with Quarkus, but the difference is negligible.
Measurements and charts
Let's move on to the main point: the measurements. They were the main reason for writing this article. I wanted to measure what a cold start looks like in 2023 for Java and its most popular frameworks. How much does SnapStart improve things? Does SnapStart also affect warm starts? How do resource changes (Lambda memory) affect cold and warm start performance? I hope the charts and tables below will help you answer these questions.
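For readers less familiar with the p50, p90, and p99 columns used in the tables, here is a minimal sketch of how such percentiles can be derived from raw duration samples. It assumes the nearest-rank definition. The tooling used to produce the numbers in this article may interpolate differently, and the sample durations below are made up for illustration, not taken from the measurements.

```java
import java.util.Arrays;

class Percentiles {
    /** Nearest-rank percentile: the smallest sample with at least p% of all samples at or below it. */
    static double percentile(double[] samples, double p) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (p <= 0 || p > 100) {
            throw new IllegalArgumentException("p must be in (0, 100]");
        }
        double[] sorted = samples.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[rank - 1];
    }

    public static void main(String[] args) {
        // Made-up warm start durations in milliseconds.
        double[] durations = {12.0, 3.5, 4.1, 250.0, 3.9, 4.4, 5.0, 3.7, 18.2, 4.0};
        System.out.println("p50 = " + percentile(durations, 50));
        System.out.println("p90 = " + percentile(durations, 90));
        System.out.println("p99 = " + percentile(durations, 99));
        System.out.println("max = " + percentile(durations, 100));
    }
}
```

With a handful of slow outliers, p99 and max sit far above p50, which is the same shape you will see in the warm start columns.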
Vanilla Java
Non SnapStart

memory (MB) | error rate | cold p50 (ms) | cold p90 (ms) | cold p99 (ms) | cold max (ms) | warm p50 (ms) | warm p90 (ms) | warm p99 (ms) | warm max (ms) |
---|---|---|---|---|---|---|---|---|---|
128 | 0% | 754.9 | 790.8 | 826.9 | 904 | 11.3 | 37 | 228.2 | 275.4 |
256 | 0% | 566.6 | 599.9 | 666.9 | 676.1 | 1.9 | 15.4 | 107.1 | 328.2 |
512 | 0% | 549.4 | 474.3 | 502.1 | 529.5 | 1.6 | 8.4 | 50.9 | 97.3 |
1024 | 0% | 426.1 | 445.5 | 466.1 | 489.7 | 1.6 | 3.3 | 20.6 | 25.5 |
4096 | 0% | 301.7 | 327.7 | 415.9 | 450.4 | 1.5 | 2.5 | 13.2 | 21.1 |

SnapStart

memory (MB) | error rate | cold p50 (ms) | cold p90 (ms) | cold p99 (ms) | cold max (ms) | warm p50 (ms) | warm p90 (ms) | warm p99 (ms) | warm max (ms) |
---|---|---|---|---|---|---|---|---|---|
128 | 0% | 705.3 | 773.3 | 817.8 | 896.2 | 17.4 | 52.6 | 268 | 479.7 |
256 | 0% | 401.6 | 447.9 | 473.2 | 536.6 | 7.7 | 20.3 | 120.2 | 214.3 |
512 | 0% | 231.9 | 261.9 | 311.7 | 1174.6 | 1.7 | 9.7 | 53.5 | 135.6 |
1024 | 0% | 203.9 | 231.6 | 367.3 | 399.1 | 1.6 | 3.7 | 24.8 | 53.4 |
4096 | 0% | 241.5 | 353.4 | 484.1 | 501.4 | 1.5 | 2.6 | 14 | 25.7 |
Spring Boot
Non SnapStart

memory (MB) | error rate | cold p50 (ms) | cold p90 (ms) | cold p99 (ms) | cold max (ms) | warm p50 (ms) | warm p90 (ms) | warm p99 (ms) | warm max (ms) |
---|---|---|---|---|---|---|---|---|---|
128 | 100% | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
256 | 10.5% | 5584.1 | 6867.7 | 7119.3 | 7157.4 | 28.3 | 1135.7 | 3582.5 | 3808.8 |
512 | 0% | 3515.4 | 3647.8 | 3725.2 | 3762.8 | 12.1 | 20.2 | 52.9 | 180.6 |
1024 | 0% | 3396.6 | 3512.3 | 3599.6 | 3599.6 | 3.9 | 9.2 | 18.5 | 94.7 |
4096 | 0% | 2366.4 | 2525.2 | 3127.5 | 3191.1 | 3.4 | 5 | 10.6 | 33.4 |

SnapStart

memory (MB) | error rate | cold p50 (ms) | cold p90 (ms) | cold p99 (ms) | cold max (ms) | warm p50 (ms) | warm p90 (ms) | warm p99 (ms) | warm max (ms) |
---|---|---|---|---|---|---|---|---|---|
128 | 100% | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
256 | 60.3% | 3920 | 5027.8 | 5149.8 | 5173.7 | 2399.7 | 3717.8 | 3931.8 | 4141.4 |
512 | 0% | 515.3 | 554.2 | 598.7 | 611.4 | 5.6 | 18.6 | 37.1 | 54.3 |
1024 | 0% | 347.3 | 381.1 | 451.6 | 1270 | 3.8 | 9.1 | 17.1 | 32.3 |
4096 | 0% | 350.4 | 417.3 | 604.7 | 641 | 3.6 | 5.7 | 16.9 | 65.1 |
Quarkus
Non SnapStart

memory (MB) | error rate | cold p50 (ms) | cold p90 (ms) | cold p99 (ms) | cold max (ms) | warm p50 (ms) | warm p90 (ms) | warm p99 (ms) | warm max (ms) |
---|---|---|---|---|---|---|---|---|---|
128 | 100% | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
256 | 0% | 3452.7 | 3543.6 | 3732.7 | 3757.3 | 50.5 | 73.1 | 213.8 | 317.2 |
512 | 0% | 2738.2 | 2818.7 | 2890 | 2899.4 | 16.5 | 34 | 93.1 | 189.5 |
1024 | 0% | 2305.7 | 2387.7 | 2512.6 | 4079.8 | 5.8 | 11.4 | 14.5 | 70.2 |
4096 | 0% | 1676.2 | 1823 | 2028.8 | 2046.3 | 4.4 | 7.7 | 18.4 | 42.6 |

SnapStart

memory (MB) | error rate | cold p50 (ms) | cold p90 (ms) | cold p99 (ms) | cold max (ms) | warm p50 (ms) | warm p90 (ms) | warm p99 (ms) | warm max (ms) |
---|---|---|---|---|---|---|---|---|---|
128 | 100% | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
256 | 0% | 1918.5 | 1977.8 | 2034.1 | 2063.4 | 48.8 | 63.5 | 168.2 | 264.1 |
512 | 0% | 1059.1 | 1115.8 | 1144.8 | 1148.2 | 16 | 33.7 | 94.6 | 172.5 |
1024 | 0% | 583.38 | 622 | 690.4 | 711.2 | 5.5 | 13.6 | 37.9 | 64.3 |
4096 | 0% | 455.7 | 498.6 | 556 | 566.1 | 4.3 | 7.4 | 20.2 | 57.2 |
Micronaut
Non SnapStart

memory (MB) | error rate | cold p50 (ms) | cold p90 (ms) | cold p99 (ms) | cold max (ms) | warm p50 (ms) | warm p90 (ms) | warm p99 (ms) | warm max (ms) |
---|---|---|---|---|---|---|---|---|---|
128 | 100% | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
256 | 0% | 3758.9 | 3912.2 | 4362.3 | 4262.7 | 29.1 | 46.8 | 174.1 | 315.7 |
512 | 0% | 3391.1 | 3626 | 3916.1 | 3941.4 | 9.3 | 18.9 | 55 | 151.1 |
1024 | 0% | 3146.2 | 3357.5 | 3680.2 | 3723.9 | 3.6 | 9.6 | 25.7 | 82 |
4096 | 0% | 2517.6 | 2628.2 | 2738.1 | 2881.9 | 3.2 | 4.6 | 12.7 | 45.9 |

SnapStart

memory (MB) | error rate | cold p50 (ms) | cold p90 (ms) | cold p99 (ms) | cold max (ms) | warm p50 (ms) | warm p90 (ms) | warm p99 (ms) | warm max (ms) |
---|---|---|---|---|---|---|---|---|---|
128 | 100% | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
256 | 0% | 1725.4 | 1814.6 | 2257.8 | 2289.8 | 29.1 | 44.2 | 175.2 | 231.8 |
512 | 0% | 677 | 729.7 | 798.4 | 809.1 | 11.4 | 20.5 | 65.8 | 91 |
1024 | 0% | 468.6 | 518.9 | 626.3 | 1373.1 | 3.7 | 8.6 | 25.5 | 97.7 |
4096 | 0% | 388.8 | 439.2 | 562.1 | 594.3 | 3.1 | 4.9 | 16.7 | 67.1 |
Spring Boot vs Quarkus vs Micronaut vs Java - cold start charts
General conclusions and observations
- Only the pure Java Lambda can run with the 128 MB configuration and handle traffic. The frameworks fail with `java.lang.OutOfMemoryError`.
- Spring Boot also fails part of the requests with the 256 MB configuration. This spoils the chart presentation for 256 MB.
- The function package size is the largest for Spring Boot (13.7 MB) and the smallest for Micronaut (11.8 MB), apart from pure Java, whose package is around 1 MB.
- The biggest difference in performance can be observed when upgrading memory from 256 MB to 512 MB. This applies to all frameworks and to both cold and warm starts.
- Enabling SnapStart brings the greatest benefit for Spring Boot: an almost 10x shorter cold start in a few configurations. For Quarkus it is about 4x shorter on average, and for Micronaut roughly 6x.
- For all frameworks with SnapStart enabled, the cold start comes close to the pure Java cold start, which is a fantastic result.
- Looking at the median graph, you can see that Quarkus had the shortest cold starts without SnapStart. On the other hand, with SnapStart enabled it performs worst.
- SnapStart doesn't significantly affect warm starts. It's hard to say whether it affects them at all.
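The speedup factors quoted in the observations can be reproduced as a simple ratio of median cold starts. The sketch below uses the p50 cold start values from the 1024 MB rows of the tables above. The class and method names are made up for this illustration, and picking a single memory size is my own simplification, so other rows will give somewhat different factors.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

class ColdStartSpeedup {
    /** How many times shorter the cold start is with SnapStart enabled. */
    static double speedup(double withoutSnapStartMs, double withSnapStartMs) {
        if (withSnapStartMs <= 0) {
            throw new IllegalArgumentException("duration must be positive");
        }
        return withoutSnapStartMs / withSnapStartMs;
    }

    public static void main(String[] args) {
        // p50 cold start in ms at 1024 MB: {without SnapStart, with SnapStart}.
        Map<String, double[]> p50 = new LinkedHashMap<>();
        p50.put("Vanilla Java", new double[] {426.1, 203.9});
        p50.put("Spring Boot", new double[] {3396.6, 347.3});
        p50.put("Quarkus", new double[] {2305.7, 583.38});
        p50.put("Micronaut", new double[] {3146.2, 468.6});

        p50.forEach((name, ms) -> System.out.println(
                String.format(Locale.ROOT, "%-12s x%.1f", name, speedup(ms[0], ms[1]))));
    }
}
```

At this memory size the ratios come out at roughly x2.1 for vanilla Java, x9.8 for Spring Boot, x4.0 for Quarkus, and x6.7 for Micronaut.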