Our adventures with GraalVM — The good, the bad and the ugly

Studio Stories

Java GraalVM Performance

At Valensas, we are always on the lookout for new technologies and ways to improve our software. GraalVM had been on our watchlist for quite some time, and we finally gave it a go once Spring Boot launched official support in version 3.0.0.

What is GraalVM?

GraalVM is a JVM/JDK distribution from Oracle. It promises low resource usage and fast startup times. It also lets you compile your code into architecture-specific binaries (called native images), which allows for even more optimizations and smaller package sizes (with native images, no JVM needs to be installed at runtime).

Our use-case

For a few years, we have been developing Bitronit, a feature-rich cryptocurrency exchange platform composed of 30+ micro-services, most of them using Kotlin and Spring Boot. We deploy our system on an on-prem Kubernetes cluster. One of our main problems was the resource usage of our services. Because of the overhead of the JVM, there is not much we can optimize ourselves, especially when our services are sitting idle (dev/test environments). Moreover, because of high resource usage, especially at startup, we need to allocate many more resources to our services than they require at runtime. Given the number of micro-services, the required resources quickly add up. This is the main reason we wanted to test out GraalVM. Performance improvements weren’t our main objective, but they would definitely be welcome.

The good

GraalVM actually delivered on all its promises.

Container image sizes

Using statically linked native images, we were able to build our container images FROM scratch, which gave us the smallest possible image and got rid of any operating system layers, and their security vulnerabilities too! Most of our services had image sizes around 150MB. With GraalVM statically linked native images, this dropped to around 50MB. That’s a 3x improvement!

Startup times

When we measured startup times, we also saw a huge improvement. For a relatively small service, given 2 CPU cores, a classical Jar boots in around 5.9 seconds, while a GraalVM native image boots in around 420ms. That’s a 14x improvement!

Performance

To compare resource usage between classical jar builds and GraalVM native images, we performed a small load test using Locust. Here are the results.

Jar test results:

GraalVM Native image test results:

While the CPU usage for the GraalVM native image is a bit higher, the RPS is much higher and much more stable, and memory usage is considerably lower. The large ups and downs on the Jar build are mostly related to garbage collection pauses. GraalVM manages memory much more efficiently and requires fewer and shorter garbage collection pauses.

The bad

I hear you thinking, “These are amazing improvements. There’s a catch, right? What’s the catch?” Yes, there is a catch. Multiple, if we’re being honest.

Slow and resource-hungry build times

GraalVM achieves these performance improvements using static analysis techniques. Performing all these operations is slow and very resource-heavy. On my development machine (2021 14" MacBook Pro with 16GB memory, M1 Pro chip), an average build lasts around 10–15 minutes (vs 1–2 minutes for a classical Jar build). Many builds also slow down the computer so much that it’s almost impossible to do anything else (mostly because the builds cause memory swapping).

No conditional beans at runtime

When using GraalVM native images, Spring determines which beans to create at build time. Therefore, if you use @Conditional annotations, you won’t be able to change these with environment variables or other external configuration methods.

As an example, imagine you have an email service which supports SMTP and AWS SES as backends, and you want to configure which backend to use at runtime. You would have code like this:

import org.springframework.boot.autoconfigure.condition.ConditionalOnProperty
import org.springframework.stereotype.Service

interface EmailService {
    fun sendMail(email: Email)
}

// Created only when email.backend=smtp
@Service
@ConditionalOnProperty("email.backend", havingValue = "smtp")
class SmtpEmailService : EmailService {
    override fun sendMail(email: Email) { ... }
}

// Created only when email.backend=ses
@Service
@ConditionalOnProperty("email.backend", havingValue = "ses")
class SesEmailService : EmailService {
    override fun sendMail(email: Email) { ... }
}

If you have the email.backend property set to smtp at build time, you will never be able to use AWS SES as the backend. This is one of the reasons why your application can boot so fast: Spring already knows which beans to create and can skip all the auto-configurations and conditional expressions at runtime.

However, we found an acceptable workaround that can be applied to keep this feature: extracting the services into a bean function and conditionally creating one or the other. The code above can be changed like this:

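import org.springframework.boot.context.properties.ConfigurationProperties
import org.springframework.boot.context.properties.EnableConfigurationProperties
import org.springframework.context.annotation.Bean
import org.springframework.context.annotation.Configuration

interface EmailService {
    fun sendMail(email: Email)
}

class SmtpEmailService : EmailService {
    override fun sendMail(email: Email) { ... }
}

class SesEmailService : EmailService {
    override fun sendMail(email: Email) { ... }
}

enum class EmailBackend { SMTP, SES }

@ConfigurationProperties(prefix = "email")
data class EmailConfigurationProperties(
    val backend: EmailBackend
)

@Configuration
@EnableConfigurationProperties(EmailConfigurationProperties::class)
class EmailConfiguration {
    // The backend is chosen when the bean is created at runtime, not at build time
    @Bean
    fun emailService(emailProperties: EmailConfigurationProperties): EmailService {
        return when (emailProperties.backend) {
            EmailBackend.SMTP -> SmtpEmailService()
            EmailBackend.SES -> SesEmailService()
        }
    }
}

The workaround is less idiomatic, more verbose and does not work for all use-cases, but it’s acceptable, and we were able to find similar workarounds for other use-cases.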

Limited reflection support

When compiling native images, all reflection information is stripped away unless explicitly specified. If you use reflection in your code, you must tell GraalVM which class/method/property information to keep for reflection. Otherwise, you will end up with NoClassDefFoundError, ClassNotFoundException, NoSuchMethodException etc. Fortunately, we don’t use much reflection in our code, so adding the few bits here and there was not much of an issue.
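To make the failure mode concrete, here is a minimal sketch of the kind of lookup that static analysis cannot follow. It uses only the JDK, and the names Greeter, instantiateByName and greetReflectively are hypothetical, made up for this illustration. On a regular JVM it runs fine; in a native image, a lookup like this would be expected to fail unless the class, its constructor and its method are registered for reflection.

```kotlin
// Hypothetical type, used only for illustration.
class Greeter {
    fun greet(name: String): String = "Hello, $name"
}

// The class name only arrives as a runtime string, so a build-time analysis
// has no way of knowing that Greeter and its constructor must be kept.
fun instantiateByName(className: String): Any =
    Class.forName(className).getDeclaredConstructor().newInstance()

// The method is also resolved by name at runtime.
fun greetReflectively(className: String, name: String): String {
    val instance = instantiateByName(className)
    val method = instance.javaClass.getMethod("greet", String::class.java)
    return method.invoke(instance, name) as String
}

fun main() {
    // In real code the class name would typically come from configuration
    // or from a framework, not from a class reference as it does here.
    println(greetReflectively(Greeter::class.java.name, "GraalVM"))
}
```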

The ugly

Remember when I told you we weren’t using much reflection in our code? You know what uses reflection? Your dependencies. Pretty much all dependencies in the Spring Boot ecosystem rely on reflection. It’s everywhere! You serialize and deserialize request and response models with Jackson? It uses reflection. You integrate with a SOAP API using JAXB? It uses reflection. You use Spring Data to connect to your database? You guessed it, it uses reflection. This limitation became a nightmare very quickly.

To be fair, Spring Boot does a nice job adding automatic runtime hints for the parts it can. For example, all the models used as requests and responses in your controllers will have the necessary runtime hints added automatically. However, it is not enough. You still have models used with your WebClient, database entities, and more.
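For a model that Spring cannot discover on its own, one option is Spring Framework 6’s @RegisterReflectionForBinding annotation. The sketch below is only an illustration under assumptions: PriceResponse, PriceClient, the base URL and the /prices/{symbol} path are hypothetical and not taken from our code base.

```kotlin
import org.springframework.aot.hint.annotation.RegisterReflectionForBinding
import org.springframework.stereotype.Service
import org.springframework.web.reactive.function.client.WebClient
import reactor.core.publisher.Mono

// Hypothetical response model of a remote API.
data class PriceResponse(val symbol: String, val price: String)

@Service
class PriceClient(builder: WebClient.Builder) {

    // Hypothetical base URL, for illustration only.
    private val webClient: WebClient = builder.baseUrl("https://example.invalid").build()

    // Asks Spring's AOT processing to register reflection hints for
    // PriceResponse, so it can be deserialized in a native image.
    @RegisterReflectionForBinding(PriceResponse::class)
    fun fetchPrice(symbol: String): Mono<PriceResponse> =
        webClient.get()
            .uri("/prices/{symbol}", symbol)
            .retrieve()
            .bodyToMono(PriceResponse::class.java)
}
```

The annotation only takes effect on classes that Spring processes as beans, which is why it sits on a method of the @Service here.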

Here is a simple example that took us days to figure out:

import java.time.Instant
import org.springframework.data.annotation.Id
import org.springframework.data.repository.reactive.ReactiveCrudRepository
import org.springframework.stereotype.Repository
import reactor.core.publisher.Flux

data class MyEntity(
    @Id val id: Long?,
    val createdDate: Instant
)

@Repository
interface MyEntityRepository : ReactiveCrudRepository<MyEntity, Long>

fun findAll(repository: MyEntityRepository): Flux<MyEntity> {
    return repository.findAll()
}

For this specific scenario, we were using Spring Data R2DBC with a PostgreSQL database. We also added the necessary runtime hints for MyEntity. This code seems fine, right? Well, it failed with the following error when running as a native image:

No converter found capable of converting from type [java.time.OffsetDateTime] to type [java.time.Instant]

That is weird. Why would Spring Data try to convert OffsetDateTime to Instant? There is no OffsetDateTime being used. What happened to the converters when migrating to a native image? Did they just vanish? It turns out that Spring Data R2DBC fetches all timestamptz columns into an OffsetDateTime, then tries to convert it to the type in your record, using pre-defined Converters. This specific OffsetDateTime to Instant conversion does not have a dedicated converter and ends up in ObjectToObjectConverter, which handles it, you guessed it, by heavily relying on reflection.
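As a rough illustration of what such a reflective fallback looks like, here is a minimal sketch that uses only the JDK. It is not Spring’s actual implementation. The function convertByNamingConvention is a hypothetical stand-in, written on the assumption that the fallback looks for a public no-argument method named "to" plus the target type’s simple name on the source object.

```kotlin
import java.time.Instant
import java.time.OffsetDateTime

// Simplified, hypothetical stand-in for a reflective fallback conversion:
// resolve "to" + <target simple name> on the source class and invoke it.
// getMethod throws NoSuchMethodException when no such method is visible,
// which is the situation in a native image without a matching reflection hint.
fun convertByNamingConvention(source: Any, targetType: Class<*>): Any? {
    val method = source.javaClass.getMethod("to" + targetType.simpleName)
    return method.invoke(source)
}

fun main() {
    // Roughly what a driver might hand back for a timestamptz column.
    val fetched = OffsetDateTime.parse("2023-01-01T12:00:00+03:00")
    // Resolves OffsetDateTime.toInstant() by name at runtime.
    val converted = convertByNamingConvention(fetched, Instant::class.java)
    println(converted)
}
```

Nothing in the entity or the repository mentions toInstant, which is why the missing hint is so hard to spot from the error message alone.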

The only thing to do was to add reflection hints for the OffsetDateTime.toInstant() method. However, since the error was not descriptive enough to pinpoint the issue, we had to dig deep into Spring and Spring Data to understand what it was trying to do and what the root cause was, which was very time-consuming. This was just one example of many similar issues we faced. Once we had catalogued all these scenarios, we created a small Java library to facilitate our most common use-cases, which helped but cannot cover all use-cases.
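A hint like the one described above could be registered roughly as follows. This is a sketch assuming Spring Framework 6’s RuntimeHintsRegistrar API. The class names OffsetDateTimeHints and NativeHintsConfiguration are our own placeholders, not part of any library, and this is not the code from the library we built.

```kotlin
import java.time.OffsetDateTime
import org.springframework.aot.hint.ExecutableMode
import org.springframework.aot.hint.RuntimeHints
import org.springframework.aot.hint.RuntimeHintsRegistrar
import org.springframework.context.annotation.Configuration
import org.springframework.context.annotation.ImportRuntimeHints

// Placeholder name: registers OffsetDateTime.toInstant() for reflective invocation.
class OffsetDateTimeHints : RuntimeHintsRegistrar {
    override fun registerHints(hints: RuntimeHints, classLoader: ClassLoader?) {
        val toInstant = OffsetDateTime::class.java.getMethod("toInstant")
        hints.reflection().registerMethod(toInstant, ExecutableMode.INVOKE)
    }
}

// Placeholder name: makes Spring's AOT processing pick up the registrar.
// proxyBeanMethods = false avoids the need for an open class in Kotlin.
@Configuration(proxyBeanMethods = false)
@ImportRuntimeHints(OffsetDateTimeHints::class)
class NativeHintsConfiguration
```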

Another issue is with testing. We rely heavily on Mockito Kotlin for our tests, which, you guessed correctly again, relies heavily on reflection. However, this was another level of ugly. Because we use Mockito for pretty much everything, we had to choose among the following:

Have native image tests, add reflection hints for almost everything, sacrifice on image size.
Have native image tests, add reflection hints for almost everything for tests only, potentially miss necessary reflection hints for production binary.
Test the old way, potentially miss necessary reflection hints.

We chose the last option. After all, we have end-to-end tests with Gauge that can catch any issues. However, because of the resource-hungry and long builds, developers do not properly test their code with native images before pushing it, and it quickly breaks the test environments. We have migrated over 20 services to native images to this day, and it is still a rare sight to see a native image migration go without an issue on the first try.

Verdict

GraalVM support for Spring Boot is relatively new and I’m sure it will get better over time. Many frameworks still don’t bundle their own runtime hints, which makes everyone’s life harder, but GraalVM Reachability Metadata tries to cover these. Hopefully, we will one day be able to use native images with far fewer hurdles.

Will we use GraalVM for our next project?

It is hard to tell. Even though we are glad to have had this experience and definitely improved our systems thanks to GraalVM native images, the cost in developer happiness and development speed is non-negligible. Therefore, if resource or performance constraints are not an issue relative to the project’s scale, we will most likely use classical Jar builds.

Additionally, since this adventure, we have added Rust to our technology stack, which delivers all the benefits of GraalVM native images without the drama of reflection issues. Even if it has a steep learning curve, you just don’t get surprised at runtime because of a missing runtime hint. Hence, if resource usage and performance are strict requirements, Rust will most likely be our choice over GraalVM native images.

Another option could be to use Quarkus instead of Spring Boot, which was designed from the beginning with native images in mind. However, we haven’t had the chance to try it yet.

Should I use GraalVM?

Do you have the resources to figure out all the tricks to make your application work?
Do you have the time to teach the junior developers about the limitations of native images and how to work around them?
Do you know the ins and outs of your dependencies, where and how they use reflection?

If your answer to all these questions is a yes, then go ahead without a doubt and get all the good benefits of native images. Otherwise, you should probably think multiple times before going for it.

Special thanks to Özgür Deniz Türker who performed and shared the load tests present in this story.
