former_member197561
How to determine the Java memory requirements for customer load?
How to define optimal number of Java instances to handle the load?

***

Introduction


Often, instead of performing Java memory sizing for an application, a “guru-defined” ratio such as “4 GB per vCPU” or “4 GB memory per 1000 SAPS” is applied. Later, in production, “hacks” to JVM parameters are tried out in reaction to Java memory problems that have already occurred.

This blog suggests a pragmatic approach to calculate the required physical memory and the optimal sizes of the different java memory areas, based on the application-specific memory consumption metrics.

The Java Memory Sizing always relates to a concrete java application version, i.e., the sizing result changes if in subsequent releases there is code optimization or degradation.
It covers:

  • Optimal Heap configuration


The Young Generation size at which as few short-living objects as possible are copied (promoted) into the Old Generation, and the Old Generation size at which full garbage collections are as infrequent as possible.


  • Optimal non-Heap configuration


Sizes of the Metaspace (or the Permanent Generation, up to Java 7), the Compressed Class Space, and the native area.


  • Optimal number of JVM instances


The choice depends on requirements for clustering and failover, on the capacities offered by the IaaS provider versus cost concerns, etc., but it can also be a way to avoid an overly large heap per instance.


  • Physical memory size


The physical memory which must be available and free on the machine to run the defined number of JVMs.

Be aware that Java Memory Sizing is only applicable to applications that are free of memory leaks: the approach helps to avoid only those memory shortage situations which are not caused by memory leaks.

 

JVM Memory Overview


Typical for Java applications is that

  • Most of the created objects become unused very soon, i.e., they are  “short-living”

  • There are objects and classes, which once instantiated, remain in memory for a very long time and even until the JVM is stopped, i.e., they are  “long-living”


To reflect this reality, Java VM introduces a concept of separate memory areas with sizes that can be controlled with JVM parameters.












The Young Generation area (consisting of Eden and Survivor areas) holds the short-living objects.

The Old Generation and the non-Heap areas hold long-living objects and classes.

Note!
Although different minimum and maximum sizes can be configured for an area, practical experience shows that resizing the areas at runtime is expensive. It is therefore recommended to always configure non-resizable areas, with the sizes calculated by the Java memory sizing:

  • -Xms same value as -Xmx

  • -XX:NewSize same value as -XX:MaxNewSize

  • -XX:MetaspaceSize same value as -XX:MaxMetaspaceSize
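
As an illustration of the recommendation above, the following minimal Python sketch builds a set of JVM options in which every minimum equals its maximum. The function name `fixed_size_flags` and the sizes passed to it are placeholders invented for this example, not recommended values. Note that `-Xms` and `-Xmx` take the size directly after the option name, without an “=” sign, while the `-XX:` options use “=”.

```python
def fixed_size_flags(heap_mb: int, young_mb: int, meta_mb: int) -> list[str]:
    """Return JVM options that pin each memory area to a single size (min == max)."""
    return [
        f"-Xms{heap_mb}m",                    # initial heap size
        f"-Xmx{heap_mb}m",                    # maximum heap size, same value
        f"-XX:NewSize={young_mb}m",           # initial Young Generation size
        f"-XX:MaxNewSize={young_mb}m",        # maximum Young Generation size, same value
        f"-XX:MetaspaceSize={meta_mb}m",      # Metaspace threshold
        f"-XX:MaxMetaspaceSize={meta_mb}m",   # Metaspace limit, same value
    ]


# Placeholder sizes for illustration only; real values come from the sizing steps.
print(" ".join(fixed_size_flags(heap_mb=2048, young_mb=512, meta_mb=128)))
```

The resulting string can be appended to the `java` command line or to the JVM options of the application server.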



The Java VM manages object de-allocation automatically, i.e., it performs the so-called garbage collection (gc), using garbage collection algorithms that continuously improve in subsequent JVM versions. A JVM version typically supports more than one gc algorithm (e.g., deprecated, experimental, default), and they can be switched with JVM parameters, e.g., -XX:+UseSerialGC, -XX:+UseParallelGC, -XX:+UseParNewGC, -XX:+UseG1GC, -XX:+UseZGC.

One aspect of the garbage collection is the gc duration, which depends on

  • type of gc


A major (full) gc examines the entire heap for objects that are no longer needed, which is why it is slower than a small (minor) gc, which examines only the Young Generation.


  • gc algorithm


Use the most recent gc algorithm available with your concrete java version.


  • sizes of the memory areas


On bigger memory areas, the duration of both small gc and full gc is longer.


  • CPU speed


With slower CPUs, the duration of both small gc and full gc is longer.


  • app-specific coding


E.g. usage of “finalize()” method, “soft references” and so on prolongs the gc duration.

OS memory paging is the worst enemy of garbage collection: it makes the gc duration unacceptably long. Therefore, all memory allocated by a Java process must fit inside the physical memory at all times (no OS paging should happen!).

Another aspect of garbage collection is the frequency of the gc events. Too high a frequency causes slow performance; therefore, the goal of the Java Memory Sizing is to avoid high gc frequency.

Logical Memory Spaces of Java apps


On a logical level, there are three spaces which determine the memory consumption profile of a Java application.









The Processing Space is filled with objects that are created during the processing of a request and collected during the processing or soon after. Typically, they are garbage collected by a small gc directly from the Young Generation.

User Session Space is relevant only to stateful applications!

The User Session Space is filled with objects, referred from the user session object. Normally, a user is active for at least several minutes, therefore the user session related objects usually live long enough to be promoted to Old (Tenured) Generation and from there they are removed only by major (full) garbage collection.

Framework Space is filled with objects created during initialization of the java process, e.g., services, libraries, applications, etc. and objects of shared caches, pools, etc. Such objects typically live for the entire life time of the java virtual machine and therefore are always promoted to the Old (Tenured) Generation.

The classes which are loaded into the non-Heap area also belong to the Framework Space.


Step by Step Java Memory Sizing



I. Define target Isgc and Ifgc


The gc frequency, expressed as the interval between two successive garbage collection (GC) occurrences, is a required input parameter for the Java memory sizing.

Let’s define Isgc and Ifgc:

  • Isgc: interval of small garbage collection (in seconds)

Usually, Isgc should be in the range of the average Java server-side processing time: e.g., if the average server-side processing time is sub-second, choosing Isgc=1s is good enough. Isgc < 1s brings a performance penalty.

Even when the server-side processing time is longer than 3 seconds, there will already be enough “garbage” to collect with Isgc=3s, and a longer Isgc leads to oversizing.

Therefore, unless there are solid reasons to do otherwise, choose Isgc between 1s and 3s.

  • Ifgc: interval of full garbage collection (in seconds)

Usually, Ifgc should be in the range of the average user session duration: e.g., if the average duration of a user session is 10 minutes, then Ifgc=600s will be good enough. A shorter Ifgc leads to inefficient full gc which does not manage to collect many objects, because they are still in use, and it can even lead to OOM situations.

If the average duration of a user session is 30 minutes, Ifgc=1800s will be good enough. Do not choose Ifgc longer than 1800s, as this leads to oversizing.
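
One possible way to turn these rules of thumb into numbers is to clamp the measured averages into the recommended ranges. The sketch below is an interpretation under stated assumptions, not part of the sizing method itself: the helper names are invented for this example, and the lower bound of 300s for Ifgc is the one recommended for NrSessions in step V.

```python
def clamp(value: float, low: float, high: float) -> float:
    """Limit value to the closed range [low, high]."""
    return max(low, min(high, value))


def choose_isgc(avg_processing_time_s: float) -> float:
    """Assumed rule: follow the average processing time, but stay within 1s..3s."""
    return clamp(avg_processing_time_s, 1.0, 3.0)


def choose_ifgc(avg_session_duration_s: float) -> float:
    """Assumed rule: follow the average session duration, but stay within 300s..1800s."""
    return clamp(avg_session_duration_s, 300.0, 1800.0)


print(choose_isgc(0.4))    # sub-second processing time
print(choose_ifgc(600))    # 10-minute sessions
print(choose_ifgc(3600))   # 60-minute sessions are capped at the upper bound
```

Deviating from the clamped values remains possible when there are solid reasons, as stated above.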


II. Determine the Framework Space Objects


The Framework Space is independent from the load which is produced by the requests of active users or API calls. It depends on the implementation of the concrete Java Server, on the number and type of deployed applications.

To determine the Framework Space size, analyze a heap dump taken after warm-up: i.e., start the Java instance and run the functional scenarios of the sized application several times. To analyze the heap dump, you can use the Memory Analyzer tool. For stateful applications, take the heap dump after the users are no longer active and their sessions are destroyed (e.g., logged out, expired).

With the detailed memory insights that the Memory Analyzer tool provides, use the opportunity to identify optimization potential in the memory consumption of your application.

The number and size of objects can also be looked up using the standard JVM tools, e.g., jmap -histo <pid>.


III. Determine the Framework Space Classes


In this phase, determine the total number of loaded classes (Nr_Classes), using the standard JVM tools, e.g., jmap -clstats <pid>. Plan for an average of 20 KB per class.

(F1) Framework Space Classes [MB]= 20 * Nr_Classes / 1024
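
Formula (F1) can be sketched in a few lines of Python. The function name and the class count used in the example call are illustrative assumptions; only the 20 KB average per class and the division by 1024 come from the formula above.

```python
def framework_space_classes_mb(nr_classes: int, kb_per_class: float = 20.0) -> float:
    """(F1): estimated size of the loaded classes in MB, at an average of 20 KB per class."""
    if nr_classes < 0:
        raise ValueError("nr_classes must not be negative")
    return kb_per_class * nr_classes / 1024


# Illustrative class count; take the real number from the class statistics of your JVM.
print(framework_space_classes_mb(5120))  # 100.0
```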


IV. Calculate the Processing Space


To calculate the Processing Space, apply the formula:

(F2) Processing Space [MB] = 2 * NrRequests * ProcessingMemory[MB]

where the meaning of the parameters is as follows:

  • ProcessingMemory (origin: measurement)

The average allocated memory in MB per request.

To measure it, use the Allocation Analysis of the SAP JVM Profiler tool.

The easiest way is to record the allocation analysis for a full scenario run and to divide the total allocation by the number of requests per scenario. The resulting average memory allocation per request is the ProcessingMemory.

The Processing Space includes all objects allocated in a thread, or in multiple threads, during the processing of a request.

  • NrRequests (origin: sizing input)

The number of requests (customer-triggered load) which will be processed in the chosen Isgc.

Recommended: 1s ≤ Isgc ≤ 3s

  • Processing Space (origin: result)

The processing memory in MB, required for handling the load.
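
The two calculations described for step IV, the average allocation per request and formula (F2), can be sketched as follows. The function names and the profiler figures (2500 MB allocated over 50 requests, 5 requests per second) are assumptions chosen only to make the arithmetic easy to follow.

```python
def processing_memory_mb(total_allocated_mb: float, requests_in_scenario: int) -> float:
    """Average allocation per request: total allocation of a scenario run / number of requests."""
    if requests_in_scenario <= 0:
        raise ValueError("requests_in_scenario must be positive")
    return total_allocated_mb / requests_in_scenario


def processing_space_mb(nr_requests: float, processing_memory: float) -> float:
    """(F2): Processing Space [MB] = 2 * NrRequests * ProcessingMemory[MB]."""
    return 2 * nr_requests * processing_memory


# Assumed measurement: one scenario run allocates 2500 MB over 50 requests.
per_request = processing_memory_mb(2500, 50)

# Assumed load: 5 requests per second, with Isgc = 1s.
requests_per_second, isgc_s = 5, 1
nr_requests = requests_per_second * isgc_s

print(per_request)                                    # 50.0
print(processing_space_mb(nr_requests, per_request))  # 500.0
```

NrRequests always refers to the chosen Isgc, so doubling Isgc at the same request rate doubles the Processing Space.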


V. Calculate the User Session Space


Note: For stateless applications, the user session space is 0, therefore skip this section.

To calculate the User Session Space, apply the formula:

(F3) User Sessions Space [MB] = 1.5 * NrSessions * UserSessionMemory[MB]

where the meaning of the parameters is as follows:

  • UserSessionMemory (origin: measurement)

The average user session size in MB.

To determine it, analyze a heap dump. The heap dump should be taken while multiple users are still active (logged in) on the Java instance and are “warmed up”, i.e., they have executed the functional scenarios of the sized application at least once.

As for the Framework Space analysis, the Memory Analyzer tool can be used.

Note that this heap dump also contains the already measured Framework Space; therefore, consider as user session space only the additional memory which is allocated. Ideally, a product expert can do a deep-dive analysis to identify the objects which belong to the user session and its typical size.

The number and size of objects can also be looked up using the standard JVM tools, e.g., jmap -histo <pid>.

  • NrSessions (origin: sizing input)

The number of new user sessions (number of user login operations) created during the chosen Ifgc.

Recommended: 300s ≤ Ifgc ≤ 1800s

  • User Sessions Space (origin: result)

The user session memory in MB, required for handling the specified number of user sessions.
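
A minimal sketch of step V, assuming that the average session size is estimated as the additional heap usage (compared with the Framework Space) divided by the number of active sessions in the heap dump. The function names and the heap dump figures (400 MB with 10 active sessions, 200 MB Framework Space) are hypothetical.

```python
def user_session_memory_mb(heap_with_sessions_mb: float,
                           framework_objects_mb: float,
                           active_sessions: int) -> float:
    """Average session size: only the memory on top of the Framework Space counts."""
    if active_sessions <= 0:
        raise ValueError("active_sessions must be positive")
    return (heap_with_sessions_mb - framework_objects_mb) / active_sessions


def user_sessions_space_mb(nr_sessions: int, session_memory_mb: float) -> float:
    """(F3): User Sessions Space [MB] = 1.5 * NrSessions * UserSessionMemory[MB]."""
    return 1.5 * nr_sessions * session_memory_mb


# Assumed heap dump figures: 400 MB used with 10 active users, 200 MB Framework Space.
per_session = user_session_memory_mb(400, 200, 10)

print(per_session)                               # 20.0
print(user_sessions_space_mb(100, per_session))  # 3000.0
```

For a stateless application, the second function is simply not applied and the User Sessions Space is 0.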

Memory Calculation. Java VM configuration.


There may be different reasons for choosing to run productively with one Java instance or with multiple Java instances. For example, for stability and failover reasons, clusters with at least 2 Java instances are usually configured, but some applications do not support clustering, e.g., because distributed locking or distributed caches are not implemented.

Given that:

  • Framework Space Objects [MB] is measured

  • Framework Space Classes [MB] is calculated using formula F1

  • Processing Space [MB] is calculated using formula F2

  • User Sessions Space [MB] is calculated using formula F3


Calculate and configure the Java VM instances as follows:






















  • Heap[MB] = Framework Space Objects[MB] + (Processing Space[MB] + User Sessions Space[MB]) / Nr_JVM_Instances

Configure: -Xms<Heap>m and -Xmx<Heap>m (the size follows the option name directly, without “=”)

  • YG[MB] = Processing Space[MB] / Nr_JVM_Instances

Configure: -XX:NewSize=<YG>m and -XX:MaxNewSize=<YG>m

  • Meta[MB] = Framework Space Classes[MB]

Configure: -XX:MetaspaceSize=<Meta>m and -XX:MaxMetaspaceSize=<Meta>m or, up to Java 7, -XX:PermSize=<Meta>m and -XX:MaxPermSize=<Meta>m


Physical memory [MB]=Nr_JVM_Instances * (Heap [MB] + 1.8*Meta [MB] + 240)

The coefficient 1.8, which is applied to Meta[MB], reflects the size which the JVM calculates by default for the Compressed Class Space, based on the Metaspace size; the offset of 240 MB is the JVM standard on 64-bit.
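
The calculation rules above can be combined into one small helper. This is a sketch under the assumption that all inputs are already available in MB; the names `JvmSizing` and `size_jvm` as well as the input values in the example call are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JvmSizing:
    heap_mb: float       # value for -Xms / -Xmx, per instance
    young_mb: float      # value for -XX:NewSize / -XX:MaxNewSize, per instance
    meta_mb: float       # value for -XX:MetaspaceSize / -XX:MaxMetaspaceSize
    physical_mb: float   # physical memory for all instances together


def size_jvm(framework_objects_mb: float, framework_classes_mb: float,
             processing_space_mb: float, user_sessions_space_mb: float,
             nr_instances: int) -> JvmSizing:
    """Apply the Heap, YG, Meta and Physical memory formulas of this section."""
    if nr_instances < 1:
        raise ValueError("nr_instances must be at least 1")
    # The load-dependent spaces are split across the instances;
    # the Framework Space is needed in full by every instance.
    heap = framework_objects_mb + (processing_space_mb + user_sessions_space_mb) / nr_instances
    young = processing_space_mb / nr_instances
    meta = framework_classes_mb
    physical = nr_instances * (heap + 1.8 * meta + 240)
    return JvmSizing(heap, young, meta, physical)


# Illustrative inputs only: 300 MB framework objects, 80 MB classes,
# 600 MB Processing Space, 1200 MB User Sessions Space, 2 instances.
print(size_jvm(300, 80, 600, 1200, nr_instances=2))
```

Because the Framework Space, the Metaspace term, and the 240 MB offset are paid once per instance, the total physical memory grows with the number of instances even though each heap shrinks.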

Examples


Example 1


Application measurements:

  • Framework Space Objects [MB] = 200 MB

  • Framework Space Classes [MB] = 100 MB

  • UserSessionMemory[MB] per user = 20 MB

  • ProcessingMemory[MB] per request = 50 MB


Sizing-relevant parameter choices:

  • Isgc=1 s (average server-side processing time is sub-second)

  • Ifgc=1800 s (a typical user session duration is 30 minutes)

  • Nr_JVM_Instances=1


Assumptions on customer usage:

  • 100 new user logins in 30 minutes (Ifgc=1800 s) will trigger on average 5 requests per second (Isgc=1s)


Calculation

Applying (F2): Processing Space [MB] = 2 * 5 *50 = 500

Applying (F3): User Sessions Space [MB] = 1.5 * 100 * 20 = 3000

Heap[MB] = 200 + (500 + 3000)/1 = 3700   »»»  -Xms3700M and -Xmx3700M

YG[MB] =500/1 = 500   »»»  -XX:NewSize=500M and -XX:MaxNewSize=500M

Meta[MB] = 100   »»»  -XX:MetaspaceSize=100M and -XX:MaxMetaspaceSize=100M

Physical memory [MB] = 1 * (3700 + 1.8*100 + 240) = 4120[MB]

 

Example 2


Let this be the same application, but now we want to load balance to 3 java instances.

  • Nr_JVM_Instances=3


Calculation

Heap[MB] = 200 + (500 + 3000)/3 = ~1370   »»»   -Xms1370M and -Xmx1370M

YG[MB]=500/3 = ~170   »»»   -XX:NewSize=170M and -XX:MaxNewSize=170M

Meta[MB]= 100   »»»    -XX:MetaspaceSize=100M and -XX:MaxMetaspaceSize=100M

Physical memory [MB] = 3 * (1370 + 1.8*100 + 240) = 3 * 1790 = 5370[MB]
Note!
Comparing the sizing results of Example 1 and Example 2 shows that multiple Java instances, sized to handle the same number of users and requests, each require a smaller heap but more total physical memory to run all instances: the load requires 4120 MB with 1 Java instance and 5370 MB with 3 Java instances.
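
The comparison can be reproduced with a short Python sketch that uses the measurements of Example 1 and varies only the number of instances. Rounding the heap up to the next 10 MB is an assumption made here to match the rounded values of Example 2; the function names are hypothetical.

```python
import math


def round_up(value_mb: float, step_mb: int = 10) -> int:
    """Round a size up to the next multiple of step_mb (assumed rounding rule)."""
    return math.ceil(value_mb / step_mb) * step_mb


def physical_memory_mb(nr_instances: int) -> float:
    """Physical memory for the application of Example 1, spread over nr_instances JVMs."""
    framework_objects, meta = 200, 100            # measured, in MB
    processing_space, sessions_space = 500, 3000  # results of (F2) and (F3), in MB
    heap = round_up(framework_objects + (processing_space + sessions_space) / nr_instances)
    return nr_instances * (heap + 1.8 * meta + 240)


for n in (1, 2, 3):
    print(n, "instance(s):", round(physical_memory_mb(n)), "MB")
```

The fixed part per instance (Framework Space objects, the Metaspace term and the 240 MB offset) explains why the total rises with every additional instance.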

 

Example 3


A stateless application is accessed with http API calls. The API calls execute expensive parsing of large XML body, and the processing memory allocation is high.

The application measurements are as follows:

  • Framework Space Objects [MB] = 400 MB

  • Framework Space Classes [MB] = 50 MB

  • UserSessionMemory[MB] per user = 0 MB

  • ProcessingMemory[MB] per request = 1200 MB


Sizing-relevant parameter choices:

  • Isgc=2s (average server-side processing time is 2 seconds)

  • Nr_JVM_Instances=1


Assumptions on customer usage:

  • On average 20 API calls in 2 seconds (Isgc=2s)


Calculation

Applying (F2): Processing Space [MB] = 2 * 20 * 1200 = 48000

Applying (F3): not relevant

Heap[MB] = 400 + 48000/1 = 48400   
Stop! A heap size of ~48 GB is technically possible but hardly practical; therefore, it will be difficult to handle the expected load with only 1 Java instance.

Let’s change the preferred number of instances, for example to Nr_JVM_Instances=8.

Heap[MB] = 400 + 48000/8 = 400 + 6000 = 6400   »»»    -Xms6400M and -Xmx6400M

YG[MB]=48000/8 = 6000   »»»    -XX:NewSize=6000M and -XX:MaxNewSize=6000M

Meta[MB] = 50   »»»   -XX:MetaspaceSize=50M and -XX:MaxMetaspaceSize=50M

Physical memory [MB] = 8 * (6400 + 1.8*50 + 240) = 8 * 6730 = 53840[MB]
One conclusion from Example 3 is that the ProcessingMemory per API call should be significantly optimized.

But still, we made the sizing calculation with Isgc=2s and Nr_JVM_Instances=8. Even though the average server processing time is 2 seconds, in this concrete XML parsing functionality most objects allocated during the processing live for less than 2 seconds.

Let’s recalculate with Isgc=1s. If there are 20 API calls in 2 seconds, there are 10 API calls in 1 second.

Applying (F2): Processing Space [MB] = 2 * 10 * 1200 = 24000

Applying (F3): not relevant

Heap[MB] = 400 + 24000/8 = 400 + 3000 = 3400   »»»   -Xms3400M and -Xmx3400M

YG[MB] = 24000/8 = 3000   »»»    -XX:NewSize=3000M and -XX:MaxNewSize=3000M

Meta[MB] = 50   »»»   -XX:MetaspaceSize=50M and -XX:MaxMetaspaceSize=50M

Physical memory [MB] = 8 * (3400 + 1.8*50 + 240) = 8 *3730 = 29840[MB]

This cluster configuration and instance size look more reasonable. Thus, another conclusion from Example 3 is that Isgc and Ifgc should be set to the minimal values which are acceptable for the concrete functionality, and that increasing Nr_JVM_Instances is unavoidable in some situations.
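
To see how Isgc and Nr_JVM_Instances interact in Example 3, the calculation can be repeated for several combinations. The sketch below uses the measurements of Example 3 as defaults and assumes a constant rate of 10 API calls per second (20 calls in 2 seconds); the function name `example3_sizing` is hypothetical.

```python
def example3_sizing(isgc_s: float, nr_instances: int,
                    calls_per_second: float = 10,
                    processing_memory_mb: float = 1200,
                    framework_objects_mb: float = 400,
                    meta_mb: float = 50) -> tuple[float, float]:
    """Return (heap per instance, total physical memory) in MB for a stateless application."""
    nr_requests = calls_per_second * isgc_s                    # requests within one Isgc
    processing_space = 2 * nr_requests * processing_memory_mb  # (F2); (F3) is 0 here
    heap = framework_objects_mb + processing_space / nr_instances
    physical = nr_instances * (heap + 1.8 * meta_mb + 240)
    return heap, physical


for isgc in (2, 1):
    for instances in (1, 4, 8):
        heap, physical = example3_sizing(isgc, instances)
        print(f"Isgc={isgc}s, instances={instances}: "
              f"heap={heap:.0f} MB, physical={physical:.0f} MB")
```

Halving Isgc halves the Processing Space and therefore almost halves the heap per instance, while adding instances mainly redistributes the same Processing Space.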