This paper is available on arxiv under CC BY-SA 4.0 DEED license.
Authors:
(1) Juan Mera Menéndez;
(2) Martin Bartlett.
IV. BEST PRACTICES AND TECHNIQUES
Below are the techniques and approaches we propose to reduce the cold start of Java functions and improve their overall performance, along with performance-test measurements for each. The proposed practices are compared in Section VI, Discussion.
A. Better suited configuration
Increasing the memory allocated to a function is probably the best-known and simplest way to reduce cold starts. The memory allocated to a function determines the amount of CPU assigned to it, and more CPU generally makes initialization faster. However, more memory does not always mean better performance. A higher memory allocation also raises the price per unit of compute time, but if the added performance shortens execution enough, the overall cost can still decrease. To balance the memory configuration of a Lambda function, the Lambda Power Tuner solution [19] can help by profiling the function and displaying graphs such as the one shown in Figure 2 for the Catalog lambda of our solution.
In our case, choosing the right memory value for the function configuration yielded an improvement of about 70% in cold start, from an average of 16307 ms to 4867 ms, and a noteworthy 46% when the functions are warm, from an average of 512 ms to 216 ms, as can be seen in Table II. The other versions, which do not have a tuned memory configuration, use the default value of 512 MB.
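The trade-off between memory and cost can be made concrete with a little arithmetic. The sketch below is a minimal illustration, not the authors' tooling: it assumes that Lambda compute is billed in GB-seconds (allocated memory times duration) and ignores the per-request fee and the actual price per GB-second. The 1024 MB value is a hypothetical tuned size chosen only for the example; the text does not state which memory value was selected. The class and method names are likewise illustrative.

```java
import java.util.Locale;

final class MemoryCostSketch {

    // Billed compute in GB-seconds: allocated memory (GB) times duration (s).
    static double gbSeconds(int memoryMb, double durationMs) {
        return (memoryMb / 1024.0) * (durationMs / 1000.0);
    }

    // Relative latency improvement, as a fraction of the baseline.
    static double improvement(double baselineMs, double tunedMs) {
        return (baselineMs - tunedMs) / baselineMs;
    }

    public static void main(String[] args) {
        // Cold-start averages reported in the text (default vs. tuned memory).
        System.out.printf(Locale.ROOT, "cold start improvement: %.1f%%%n",
                100 * improvement(16307, 4867));

        // ASSUMPTION: 1024 MB is a hypothetical tuned size, for illustration only.
        double before = gbSeconds(512, 512);   // default memory, warm average
        double after = gbSeconds(1024, 216);   // hypothetical memory, warm average
        System.out.printf(Locale.ROOT, "GB-s per warm request: %.3f -> %.3f%n",
                before, after);
    }
}
```

Under this billing model, doubling the memory only pays for itself when the duration is at least halved, which is why profiling each function is preferable to simply choosing the largest size.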
B. SnapStart
SnapStart [16] is a feature offered by AWS to mitigate the cold start of Java functions. It initializes the function at the moment a version of it is created. Lambda takes a snapshot of the initialized runtime environment, encrypts it, and caches it. New instances of the function are then started from the snapshot rather than from scratch, which improves performance.
A notable drawback of this feature is that it does not support many other features offered by AWS Lambda, such as custom runtimes, the arm64 architecture, Amazon EFS, and managed runtimes other than Java 11 or Java 17.
In our case, enabling SnapStart on our functions yielded an improvement of 16% in cold start (from an average of 16307 ms to 13736 ms) and of 21% when serving requests with warm functions, a difference of over 100 ms per request (Table III).
C. Arm64 architecture
The instruction set architecture in AWS Lambda, either x86_64 or arm64, determines the processor the function will use. AWS's official documentation suggests that the arm64 architecture [18], with its Graviton processor, can offer significantly better performance than the x86_64 architecture.
For our solution, using arm64 had a positive impact on latency: an improvement of over 14% on cold starts and of 13% when the lambdas are warm, reducing the average latency by over 2000 ms and approximately 70 ms, respectively. The data for warm lambda functions can be seen in Table IV.
D. AWS SDK v2 for Java
The AWS SDK for Java 2.x [15] is the second version of the SDK provided by AWS for Java. This version has a modular design that lets you include only the service-specific JARs you need, reducing the size of your artifacts. That is key to reducing the cold start, since fewer classes are loaded when the function initializes. It also brings several other performance improvements, such as an asynchronous design, connection management, concurrency support, and resource management. For these reasons, its use is highly recommended in terms of performance.
For us, migrating the lambda functions from SDK v1 to SDK v2 yielded an improvement of about 40% for both cold starts and warm functions: the former went from 16307 ms to 9926 ms and the latter from 512 ms to 298 ms of latency, as shown in Table V.
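Since the argument rests on how many classes are loaded during initialization, it can help to measure that number directly. The following is a minimal sketch using only the standard `java.lang.management` API; it is not part of the authors' test setup. The `ClassLoadProbe` name is illustrative, and the JDK's built-in `HttpClient` is used purely as a stand-in for whichever SDK client a function would build. The count is approximate, because background JVM threads may load classes at the same time.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;
import java.net.http.HttpClient;

final class ClassLoadProbe {

    // Returns how many classes the JVM loaded while the given step ran.
    static long classesLoadedBy(Runnable step) {
        ClassLoadingMXBean bean = ManagementFactory.getClassLoadingMXBean();
        long before = bean.getTotalLoadedClassCount();
        step.run();
        return bean.getTotalLoadedClassCount() - before;
    }

    public static void main(String[] args) {
        // Stand-in for building an SDK client during function initialization.
        long loaded = classesLoadedBy(() -> HttpClient.newHttpClient());
        System.out.println("classes loaded during client init: " + loaded);
    }
}
```

Running the same probe around the initialization code before and after trimming dependencies gives a rough indication of whether the artifact change actually reduced class loading.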
E. GraalVM
GraalVM [20] is a platform that enables ahead-of-time (AOT) compilation of Java code, using the Native Image tool to produce a standalone binary. This binary offers significantly better performance than traditional Java code, which runs on the JVM and is compiled at runtime. The binary is produced for the operating system of the environment in which the code is built. Note that AWS Lambda is only compatible with GraalVM through custom runtimes. Another disadvantage is that ahead-of-time compilation can be tricky to get working properly, and may even require dealing with low-level issues caused by AOT compilation.
For us, building a native image of our code and using it produced a large improvement of 83% on cold start, from 16307 ms to 2800 ms. When the lambdas are warm, latency goes from an average of 512 ms to 231 ms, an improvement of 55%, as shown in Table VI.
F. Environment variable: JAVA_TOOL_OPTIONS
AWS Lambda provides some customization options [17] for the managed Java runtime, which can enhance overall function performance. These are set through the JAVA_TOOL_OPTIONS environment variable, which gives access to JVM features such as tiered compilation and garbage collector behavior, among others. In this case we use only tiered compilation, fixing the level to 1, which enables the C1 compiler and produces code optimized for fast start-up time. The exact value for this variable is "-XX:+TieredCompilation -XX:TieredStopAtLevel=1". The Java 17 managed runtime on AWS Lambda is set to level 1 by default, while in Java 11 you need to set it manually. Level two can be used for better overall performance, not focused only on start time. In our test system, enabling that option translates to a 40% improvement in cold start times and a 31% improvement in response times for the system with warm Lambdas (Table VII).
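Because the setting lives in an environment variable, it is easy to misspell or omit, so a quick check from inside the function can be useful. The sketch below is an illustration under stated assumptions, not part of the authors' setup: the class and helper names are hypothetical, and it assumes that the JVM reports options taken from JAVA_TOOL_OPTIONS through `RuntimeMXBean.getInputArguments()`, which should be verified on the target runtime. It requires Java 11 or later.

```java
import java.lang.management.ManagementFactory;
import java.util.Arrays;
import java.util.List;

final class TieredCompilationCheck {

    static final String STOP_AT_LEVEL_1 = "-XX:TieredStopAtLevel=1";

    // Splits a JAVA_TOOL_OPTIONS-style string on whitespace.
    static List<String> parseOptions(String raw) {
        if (raw == null || raw.isBlank()) {
            return List.of();
        }
        return Arrays.asList(raw.trim().split("\\s+"));
    }

    // True when the option list pins tiered compilation to level 1.
    static boolean stopsAtLevel1(List<String> jvmArgs) {
        return jvmArgs.contains(STOP_AT_LEVEL_1);
    }

    public static void main(String[] args) {
        String env = System.getenv("JAVA_TOOL_OPTIONS");
        System.out.println("JAVA_TOOL_OPTIONS = " + env);
        System.out.println("set in variable: " + stopsAtLevel1(parseOptions(env)));

        // ASSUMPTION: options from the environment variable appear here.
        List<String> effective =
                ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("reported by JVM: " + stopsAtLevel1(effective));
    }
}
```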
G. Others
In addition to the techniques and approaches we propose and have tested, there are many other best practices [6], [14], [21]. Some depend on the use case and others are more general; we mention some of them below:
• Reduce the size of the bundle to speed up the initialization phase. The fewer classes there are to load, the faster the function initializes.
• Initialize connections to third parties outside the handler, and save static resources in /tmp so they can be reused from one execution to the next.
• A framework, if needed, should be as light as possible. For that reason we suggest replacing Spring with smaller frameworks such as Vert.x.
• Try to avoid reflection; because it works with dynamic types, it makes some JVM optimizations impossible.
• Initialize all necessary dependencies and classes during initialization time.
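The practices about initializing outside the handler and caching in /tmp can be sketched in plain Java. This is a minimal illustration, not the authors' code: the class, the cache file name, and `loadExpensively` are hypothetical, `handleRequest` is a stand-in for a real handler method (which would normally implement the Lambda `RequestHandler` interface), and the sketch assumes the JVM temporary directory resolves to /tmp on Lambda.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.http.HttpClient;
import java.nio.file.Files;
import java.nio.file.Path;

final class WarmReuseSketch {

    // Created once per execution environment, during class initialization,
    // and reused by every invocation that lands on the same environment.
    static final HttpClient HTTP = HttpClient.newHttpClient();

    // ASSUMPTION: java.io.tmpdir resolves to /tmp on Lambda; resolving it this
    // way lets the sketch run locally too. The file name is hypothetical.
    static final Path CACHE_FILE =
            Path.of(System.getProperty("java.io.tmpdir"), "static-resource.cache");

    // Counts real loads, only to make the effect of the cache visible.
    static int expensiveLoads = 0;

    // Hypothetical stand-in for a costly download or computation.
    static String loadExpensively() {
        expensiveLoads++;
        return "static-resource-v1";
    }

    // Returns the cached copy if present; otherwise loads and stores it.
    static String cachedResource(Path cacheFile) {
        try {
            if (Files.exists(cacheFile)) {
                return Files.readString(cacheFile);
            }
            String value = loadExpensively();
            Files.writeString(cacheFile, value);
            return value;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Stand-in for the handler: it only uses resources prepared beforehand.
    // Outbound calls would go through the shared HTTP client.
    static String handleRequest(String input) {
        return input + ":" + cachedResource(CACHE_FILE);
    }

    public static void main(String[] args) {
        System.out.println(handleRequest("first"));
        System.out.println(handleRequest("second"));
        System.out.println("expensive loads in this process: " + expensiveLoads);
    }
}
```

The design choice is that anything expensive is paid for at most once per execution environment, so warm invocations only do the work that is specific to the request.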
These approaches, combined with some of the previous ones we have tested, should greatly reduce the impact of cold starts and significantly increase the performance of Java functions.