"How Fast is Spring?"
is a session from SpringOne Platform 2018. I watched the video and tried it out myself, so here is what I did and the results I got.
I recommend watching the session video if you haven't yet. It's very interesting.
https://springoneplatform.io/2018/sessions/how-fast-is-spring-
Today's source code
https://github.com/bufferings/spring-boot-startup-mybench
↓I used OpenJDK 11.
❯ java --version
openjdk 11.0.1 2018-10-16
OpenJDK Runtime Environment 18.9 (build 11.0.1+13)
OpenJDK 64-Bit Server VM 18.9 (build 11.0.1+13, mixed mode)
↓You can run all the benchmarks like this. It takes a while because every case is run.
❯ ./mvnw clean package
❯ (cd benchmarks/; java -jar target/benchmarks.jar)
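The scores in this post are wall-clock startup times. As a rough illustration of the idea (this is not the code from the repository above), startup can be timed by launching the app in a fresh JVM and waiting for Spring Boot's "Started ..." log line. The class name `StartupTimer`, the marker string, and the JAR path are assumptions made up for this sketch.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.List;

final class StartupTimer {

    /** Reads lines until one contains the marker; returns false if the stream ends first. */
    static boolean waitForMarker(BufferedReader output, String marker) {
        // lines() is lazy and anyMatch() stops at the first matching line
        return output.lines().anyMatch(line -> line.contains(marker));
    }

    /** Launches the command and returns seconds until the marker appears, or -1 if it never does. */
    static double timeStartup(List<String> command, String marker) throws IOException {
        long begin = System.nanoTime();
        Process process = new ProcessBuilder(command)
                .redirectErrorStream(true) // merge stderr into stdout so one reader sees everything
                .start();
        try (BufferedReader output = new BufferedReader(
                new InputStreamReader(process.getInputStream()))) {
            boolean started = waitForMarker(output, marker);
            return started ? (System.nanoTime() - begin) / 1e9 : -1;
        } finally {
            process.destroy(); // the app is a server, so stop it once the measurement is done
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical JAR path; replace it with the application under test.
        List<String> command = List.of("java", "-jar", "target/demo.jar");
        System.out.println(timeStartup(command, "Started DemoApplication") + " s");
    }
}
```

A single run like this is noisy, which is why the benchmark repeats each case 10 times (the `Cnt` column) and reports an error range.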
1. FluxBaseline
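↓I created a project with Spring Initializr, selecting only Reactive Web. Then I wrote a tiny controller in the WebMVC annotation style.
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@SpringBootApplication
@RestController
public class DemoApplication {
    // Annotation-style handler; with the WebFlux starter it is served by Netty
    @GetMapping("/")
    public String home() {
        return "Hello";
    }

    public static void main(String[] args) {
        SpringApplication.run(DemoApplication.class, args);
    }
}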
↓The version of Spring Boot was 2.1.0.RELEASE.
<parent>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-parent</artifactId>
    <version>2.1.0.RELEASE</version>
    <relativePath/> <!-- lookup parent from repository -->
</parent>
↓It took 2.938 ± 0.287 s/op to start.
Benchmark Mode Cnt Score Error Units
MyBenchmark.case01_FluxBaseline ss 10 2.938 ± 0.287 s/op
Now I have a baseline for the startup time. Let's start from here.
2. WebMVC
↓I wondered how WebMVC compares with WebFlux, so I tried it. Maybe this just ends up being a comparison of Tomcat and Netty?
Benchmark Mode Cnt Score Error Units
MyBenchmark.case01_FluxBaseline ss 10 2.938 ± 0.287 s/op
MyBenchmark.case02_Web ss 10 3.281 ± 0.342 s/op
WebFlux is a bit faster, isn't it?
3. spring-context-indexer
Next, I tried spring-context-indexer, which generates a component index at compile time.
<dependency>
    <groupId>org.springframework</groupId>
    <artifactId>spring-context-indexer</artifactId>
    <optional>true</optional>
</dependency>
↓Um... a little slower?
Benchmark Mode Cnt Score Error Units
MyBenchmark.case01_FluxBaseline ss 10 2.938 ± 0.287 s/op
MyBenchmark.case03_WithContextIndexer ss 10 3.063 ± 0.102 s/op
↓I checked spring.components and found it contained only 1 component. I see... I should try a bigger project to see the effect.
#
#Sun Nov 04 18:42:59 JST 2018
com.example.DemoApplication=org.springframework.stereotype.Component
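The index is a plain properties file (`META-INF/spring.components`), so its size is easy to check before deciding whether the indexer is worth it. A minimal sketch using only `java.util.Properties`; the class name `ComponentIndexCount` is made up for this example, and the sample text mirrors the file shown above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class ComponentIndexCount {

    /** Counts indexed components: the file holds one "class=stereotype" entry per component. */
    static int countComponents(String indexText) {
        Properties entries = new Properties();
        try {
            entries.load(new StringReader(indexText)); // lines starting with '#' are comments
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return entries.size();
    }

    public static void main(String[] args) {
        String sample = String.join("\n",
                "#",
                "#Sun Nov 04 18:42:59 JST 2018",
                "com.example.DemoApplication=org.springframework.stereotype.Component");
        System.out.println(countComponents(sample)); // prints 1
    }
}
```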
4. Lazy Initialization
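I tried lazy initialization by marking every bean definition as lazy:
import org.springframework.beans.BeansException;
import org.springframework.beans.factory.config.BeanFactoryPostProcessor;
import org.springframework.beans.factory.config.ConfigurableListableBeanFactory;
import org.springframework.context.annotation.Configuration;

@Configuration
public class LazyInitBeanFactoryPostProcessor implements BeanFactoryPostProcessor {
    @Override
    public void postProcessBeanFactory(ConfigurableListableBeanFactory beanFactory) throws BeansException {
        // Mark every bean definition as lazy so beans are created on first use
        for (String beanName : beanFactory.getBeanDefinitionNames()) {
            beanFactory.getBeanDefinition(beanName).setLazyInit(true);
        }
    }
}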
↓Here's the result. It became just a little bit faster.
Benchmark Mode Cnt Score Error Units
MyBenchmark.case01_FluxBaseline ss 10 2.938 ± 0.287 s/op
MyBenchmark.case04_WithLazyInit ss 10 2.844 ± 0.129 s/op
5. NoVerify
Ran with -noverify:
Benchmark Mode Cnt Score Error Units
MyBenchmark.case01_FluxBaseline ss 10 2.938 ± 0.287 s/op
MyBenchmark.case05_WithNoVerifyOption ss 10 2.582 ± 0.060 s/op
It became a little faster. I don't know what this flag does yet, so I need to look into it later.
6. TieredStopAtLevel
Ran with -XX:TieredStopAtLevel=1:
Benchmark Mode Cnt Score Error Units
MyBenchmark.case01_FluxBaseline ss 10 2.938 ± 0.287 s/op
MyBenchmark.case06_WithTieredStopAtLevel1Option ss 10 1.980 ± 0.037 s/op
Much faster! It took less than 2 seconds. I don't know this flag either, so I will check it later.
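For reference, `-noverify` skips bytecode verification when classes are loaded, and `-XX:TieredStopAtLevel=1` restricts the JIT to the C1 compiler, which trades peak throughput for quicker startup. Below is a small sketch of how such options could be assembled into a launch command. The class name `LaunchCommand`, the method name, and the JAR path are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

final class LaunchCommand {

    /** Builds "java <jvmOptions...> -jar <jarPath>"; JVM options must come before -jar. */
    static List<String> build(String jarPath, String... jvmOptions) {
        List<String> command = new ArrayList<>();
        command.add("java");
        command.addAll(List.of(jvmOptions));
        command.add("-jar");
        command.add(jarPath);
        return command;
    }

    public static void main(String[] args) {
        List<String> command = build("target/demo.jar", // hypothetical JAR path
                "-noverify",                // skip bytecode verification at class load
                "-XX:TieredStopAtLevel=1"); // compile with C1 only, never C2
        System.out.println(String.join(" ", command));
    }
}
```

Both options are a better fit for development and short-lived processes than for long-running production services, since one removes a safety check and the other lowers peak performance.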
7. Specify SpringConfigLocation explicitly
Ran with -Dspring.config.location=classpath:/application.properties:
Benchmark Mode Cnt Score Error Units
MyBenchmark.case01_FluxBaseline ss 10 2.938 ± 0.287 s/op
MyBenchmark.case07_WithSpringConfigLocationOption ss 10 3.026 ± 0.139 s/op
Um, it became slower.
8. Turn off JMX
Ran with -Dspring.jmx.enabled=false:
Benchmark Mode Cnt Score Error Units
MyBenchmark.case01_FluxBaseline ss 10 2.938 ± 0.287 s/op
MyBenchmark.case08_WithJmxDisabledOption ss 10 2.877 ± 0.097 s/op
It became a little bit faster.
9. Exclude Logback
From here, I try excluding libraries. First, Logback:
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-webflux</artifactId>
    <exclusions>
        <exclusion>
            <artifactId>spring-boot-starter-logging</artifactId>
            <groupId>org.springframework.boot</groupId>
        </exclusion>
    </exclusions>
</dependency>
<dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>slf4j-jdk14</artifactId>
</dependency>
Here's the result:
Benchmark Mode Cnt Score Error Units
MyBenchmark.case01_FluxBaseline ss 10 2.938 ± 0.287 s/op
MyBenchmark.case09_WithoutLogback ss 10 2.904 ± 0.096 s/op
mm... slightly improved?
10. Exclude Jackson
Next is Jackson:
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-webflux</artifactId>
    <exclusions>
        <exclusion>
            <artifactId>spring-boot-starter-json</artifactId>
            <groupId>org.springframework.boot</groupId>
        </exclusion>
    </exclusions>
</dependency>
The result:
Benchmark Mode Cnt Score Error Units
MyBenchmark.case01_FluxBaseline ss 10 2.938 ± 0.287 s/op
MyBenchmark.case10_WithoutJackson ss 10 2.789 ± 0.093 s/op
It became a little bit faster.
11. Exclude HibernateValidator
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-webflux</artifactId>
    <exclusions>
        <exclusion>
            <artifactId>hibernate-validator</artifactId>
            <groupId>org.hibernate.validator</groupId>
        </exclusion>
    </exclusions>
</dependency>
Here's the result:
Benchmark Mode Cnt Score Error Units
MyBenchmark.case01_FluxBaseline ss 10 2.938 ± 0.287 s/op
MyBenchmark.case11_WithoutHibernateValidator ss 10 2.857 ± 0.084 s/op
Slightly improved, too.
This is the end of library exclusion.
12. AppCDS
AppCDS (Application Class-Data Sharing) used to be a commercial feature of Oracle JDK, but it has been available in OpenJDK since version 10.
AppCDS dumps the metadata of loaded classes into a shared archive that the JVM can map at startup, so startup time becomes shorter.
Benchmark Mode Cnt Score Error Units
MyBenchmark.case01_FluxBaseline ss 10 2.938 ± 0.287 s/op
MyBenchmark.case12_WithAppCds ss 10 2.957 ± 0.079 s/op
Hmm... it wasn't faster. I then read some articles about CDS and found the reason.
With a Spring Boot fat JAR, the libraries nested inside the JAR are outside the scope of CDS.
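For reference, on JDK 11 the AppCDS workflow has three steps: record the classes loaded during a trial run, dump them into an archive, then start the JVM with that archive. The sketch below only assembles the three command lines. The file names (`app.classlist`, `app.jsa`), the classpath, and the class name `AppCdsCommands` are placeholders for illustration and are not taken from the benchmark project.

```java
import java.util.List;

final class AppCdsCommands {

    /** Step 1: run the app once and record which classes get loaded. */
    static List<String> recordClassList(String classpath, String mainClass) {
        return List.of("java", "-XX:DumpLoadedClassList=app.classlist",
                "-cp", classpath, mainClass);
    }

    /** Step 2: dump the recorded classes into a shared archive (no main class is run). */
    static List<String> dumpArchive(String classpath) {
        return List.of("java", "-Xshare:dump",
                "-XX:SharedClassListFile=app.classlist",
                "-XX:SharedArchiveFile=app.jsa",
                "-cp", classpath);
    }

    /** Step 3: start the app with the archive, using the same classpath as in step 2. */
    static List<String> runWithArchive(String classpath, String mainClass) {
        return List.of("java", "-Xshare:on",
                "-XX:SharedArchiveFile=app.jsa",
                "-cp", classpath, mainClass);
    }

    public static void main(String[] args) {
        String classpath = "app.jar:lib/dep.jar";       // hypothetical plain JARs
        String mainClass = "com.example.DemoApplication";
        System.out.println(String.join(" ", recordClassList(classpath, mainClass)));
        System.out.println(String.join(" ", dumpArchive(classpath)));
        System.out.println(String.join(" ", runWithArchive(classpath, mainClass)));
    }
}
```

The classpath here lists ordinary JAR files, which is the layout CDS can work with, unlike libraries nested inside a fat JAR.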
13. Flux with Thin Launcher
Sorry, the benchmark name "Exploded" is wrong. I first tried exploding the fat JAR, but I couldn't get CDS to work with the exploded JAR after all, so I switched to Thin Launcher. Please read "Exploded" in the benchmark names as "Thin Launcher".
Before using CDS, I wanted to check the startup speed of a JAR file packaged with Thin Launcher.
<plugins>
    <plugin>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-maven-plugin</artifactId>
        <dependencies>
            <dependency>
                <groupId>org.springframework.boot.experimental</groupId>
                <artifactId>spring-boot-thin-layout</artifactId>
                <version>1.0.15.RELEASE</version>
            </dependency>
        </dependencies>
    </plugin>
</plugins>
Although I used Thin Launcher to package the app, I didn't use its launcher class. Instead, I specified the main class directly to make startup as fast as possible.
Benchmark Mode Cnt Score Error Units
MyBenchmark.case01_FluxBaseline ss 10 2.938 ± 0.287 s/op
MyBenchmark.case13_Exploded ss 10 2.476 ± 0.091 s/op
Hmm, a bit faster, isn't it?
14. Thin Launcher + CDS
Now I would like to apply AppCDS to it.
Benchmark Mode Cnt Score Error Units
MyBenchmark.case01_FluxBaseline ss 10 2.938 ± 0.287 s/op
MyBenchmark.case14_ExplodedWithAppCds ss 10 1.535 ± 0.036 s/op
Wow! It became much faster!
15. All applied
Finally, I applied everything.
Benchmark Mode Cnt Score Error Units
MyBenchmark.case01_FluxBaseline ss 10 2.938 ± 0.287 s/op
MyBenchmark.case15_AllApplied ss 10 0.801 ± 0.037 s/op
Less than 1 second! (∩´∀`)∩yay
One more step
In his session, Dave mentioned "Functional Bean Definitions" and tried improvements using Spring without Spring Boot, which made the app much faster. I need to learn more to understand them.
Result list
Benchmark                                          Mode  Cnt  Score   Error  Units
MyBenchmark.case01_FluxBaseline                    ss     10  2.938 ± 0.287  s/op
MyBenchmark.case02_Web                             ss     10  3.281 ± 0.342  s/op
MyBenchmark.case03_WithContextIndexer              ss     10  3.063 ± 0.102  s/op
MyBenchmark.case04_WithLazyInit                    ss     10  2.844 ± 0.129  s/op
MyBenchmark.case05_WithNoVerifyOption              ss     10  2.582 ± 0.060  s/op
MyBenchmark.case06_WithTieredStopAtLevel1Option    ss     10  1.980 ± 0.037  s/op
MyBenchmark.case07_WithSpringConfigLocationOption  ss     10  3.026 ± 0.139  s/op
MyBenchmark.case08_WithJmxDisabledOption           ss     10  2.877 ± 0.097  s/op
MyBenchmark.case09_WithoutLogback                  ss     10  2.904 ± 0.096  s/op
MyBenchmark.case10_WithoutJackson                  ss     10  2.789 ± 0.093  s/op
MyBenchmark.case11_WithoutHibernateValidator       ss     10  2.857 ± 0.084  s/op
MyBenchmark.case12_WithAppCds                      ss     10  2.957 ± 0.079  s/op
MyBenchmark.case13_Exploded                        ss     10  2.476 ± 0.091  s/op
MyBenchmark.case14_ExplodedWithAppCds              ss     10  1.535 ± 0.036  s/op
MyBenchmark.case15_AllApplied                      ss     10  0.801 ± 0.037  s/op
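When comparing a case with the baseline, it helps to check whether the two Score ± Error ranges overlap. If they do, the difference may just be noise. A minimal sketch of that check, with values copied from the table above (the class and method names are made up for illustration):

```java
final class ErrorRangeCheck {

    /** True when [scoreA ± errorA] and [scoreB ± errorB] share at least one point. */
    static boolean overlaps(double scoreA, double errorA, double scoreB, double errorB) {
        return scoreA - errorA <= scoreB + errorB
            && scoreB - errorB <= scoreA + errorA;
    }

    public static void main(String[] args) {
        double baseline = 2.938, baselineError = 0.287;
        // case04_WithLazyInit: ranges overlap, so no clear difference from the baseline
        System.out.println(overlaps(baseline, baselineError, 2.844, 0.129)); // true
        // case06_WithTieredStopAtLevel1Option: ranges are disjoint, a clear improvement
        System.out.println(overlaps(baseline, baselineError, 1.980, 0.037)); // false
    }
}
```

This is only a rough rule of thumb, not a proper statistical test, but it is enough to tell the small effects apart from the large ones in this list.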
It was really interesting. Thank you!
Top comments (4)
Nice comparison. Would be great to understand what those flags actually do.
And what happens if you exclude Jackson? You just return a string, but will Spring still be able to serialize an object to JSON?
BTW: If it’s about startup time, you should try GraalVM 😎
A really interesting exposition, thanks! Probably just the scientist in me, but plotting the summary in a graph would be really cool :).
(Just one comment: when two measurements have different values but widely overlap once you take the error bars into account, the correct thing is to say that there seems to be no difference between the two.)
You can also try adding GraalVM and native image.
Interesting experiment. It's amazing how much room for startup optimization there is.