So in case you missed JavaOne, and you speak German, this is your chance to get the latest information about the Memory Analyzer and to give us feedback: my colleagues will give a presentation on Friday, 9 May 2008, 11:30-12:30.
We listened to the community and now offer downloads for all the major operating systems, including 64-bit versions and Mac OS X!
One tip from my side:
The absolute "killer feature" is the "top consumers" query, which you can find under "queries -> core".
This query gives you a global overview of where in your Java packages the most memory is retained. And it is fast: it usually takes only a few seconds, even on big (multi-gigabyte) heap dumps! It makes an initial memory consumption analysis so easy that it can even be done by non-programmers, maybe even by managers ;)
Measuring and analyzing memory consumption in Java and other OOP languages is not an easy task, because all the objects form a big, highly connected graph. Only counting the flat (also called "shallow") size of the objects, as for example the NetBeans profiler does, is not always helpful. If you have to figure out how much memory is spent by the several hundred applications/services running on your application server, you really need a more powerful tool that can analyze a complete heap dump.
The question, then, is whether there is a better way to measure memory consumption.
Yes, there is!
It's called "retained size", and it was pioneered (at least as far as I know) by the YourKit profiler. It is now also the main concept that the SAP Memory Analyzer is built around.
Definition of Retained Set/Size:
The retained set for a leading set of objects (e.g. all objects of a particular class, all objects of all classes loaded by a particular class loader, or simply a bunch of arbitrary objects) is the set of objects which would be freed if all objects of that leading set became unreachable, i.e. the retained set includes those objects as well as all other objects only reachable through them. The retained size is the total heap size of all objects contained in the retained set.
So in short, the retained size for a set of objects is the amount of memory that would be freed if those objects were removed from memory and a full garbage collection cycle were run.
The retained set can also be computed for single objects. The SAP Memory Analyzer has special support for this through the dominator tree.
Definition of Dominator/Dominator Tree:
A node d dominates a node n if every path from the start node to n must go through d. The transformation of the object reference graph into a dominator tree allows us to easily identify the biggest chunks of retained memory.
The dominator tree shows you which are the biggest objects (in terms of retained size) in your heap. It also shows you why an object is big, by providing a hierarchical view of the objects held/retained by it.
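To make the retained-set definition concrete, here is a toy sketch in plain Java (the class and method names are mine, and this is not how the tool works internally; it uses the dominator tree instead of brute-force reachability as done here):

```java
import java.util.*;

public class RetainedSetDemo {
    // Objects reachable from 'roots', pretending the objects in 'excluded' do not exist.
    static Set<String> reachable(Map<String, List<String>> refs,
                                 Set<String> roots, Set<String> excluded) {
        Set<String> seen = new HashSet<>();
        Deque<String> todo = new ArrayDeque<>(roots);
        while (!todo.isEmpty()) {
            String obj = todo.pop();
            if (excluded.contains(obj) || !seen.add(obj)) continue;
            todo.addAll(refs.getOrDefault(obj, List.of()));
        }
        return seen;
    }

    // Retained set of 'leading': objects reachable now, but unreachable once 'leading' is gone.
    static Set<String> retained(Map<String, List<String>> refs,
                                Set<String> roots, Set<String> leading) {
        Set<String> result = reachable(refs, roots, Set.of());
        result.removeAll(reachable(refs, roots, leading));
        return result;
    }

    public static void main(String[] args) {
        // root -> a -> {c, d}, root -> b -> c
        Map<String, List<String>> refs = Map.of(
                "root", List.of("a", "b"),
                "a", List.of("c", "d"),
                "b", List.of("c"));
        System.out.println(retained(refs, Set.of("root"), Set.of("a")));
    }
}
```

In the example graph, "c" stays reachable through "b", so the retained set of {a} is only {a, d}.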
Last week the first preview release of the SAP NetWeaver Composition Environment (SAP NetWeaver CE) became available. It can be downloaded from the Service Marketplace. Update: the SAP Memory Analyzer is now available as a separate download from here!
At the core of SAP NetWeaver CE is the latest version of the SAP NetWeaver Application Server component. It is fully compatible with Java Platform, Enterprise Edition (Java EE) 5, including complete support for all the major Java technologies, among them Enterprise JavaBeans 3.0 (EJB 3.0) and the latest data persistence models. As an added bonus, the version of SAP NetWeaver Developer Studio integrated in SAP NetWeaver CE is based on the open-standard Eclipse 3.2 framework.
Last but not least, SAP NetWeaver CE includes a little gem called "SAP Memory Analyzer": a new tool for analyzing the memory consumption of your Java application. I might be a bit biased, because I have been working closely with the team that developed this tool, but I really believe that it will set a new standard for memory consumption analysis.
IMHO, until now memory consumption analysis using heap dumps of large Java enterprise applications was very difficult and time consuming, and often not very practical. If you ever tried to load a heap dump greater than 1 GByte into the JHAT tool, for example, you know what I'm talking about. It just would not work unless you have a "monster" machine.
The unique features are:
provides powerful functions for finding the biggest objects in the heap, such as "retained size" and dominator tree analysis
offers an easy-to-use Eclipse RCP based interface
analyzes heap dumps > 1 GByte with up to around 20 million objects on a 32-bit machine, and even bigger heap dumps on a 64-bit machine
uses high-performance algorithms and indices that speed up most operations after the heap dump has been initially parsed
comes with special SAP J2EE features, such as the ability to show the memory consumption of sessions and class loaders
works with the built-in heap dumps of the following VMs: Sun JVMs (1.4.2_12 or higher and 1.5.0_07 or higher), HP-UX VM (1.4.2_11 or higher) and SAP JVMs (since 1.5.0)
can be downloaded for free
I prepared a short Flash-based walkthrough that shows you the most important features. Go and check the URL at the top of this article.
This is only the first version of this tool. Internally we already use a newer, much more powerful version, so there's hope that you will get an even more powerful version in the near future :)
In my next blogs I will write about how memory consumption can be measured and analyzed.
Next week the JAX 2007 conference takes place in Germany. JAX is maybe the most important Java conference in Germany. For those of you who will attend, I can really recommend the session "Java EE Application Server und der Java Heap – effektive Speicheranalyse" ("Java EE application servers and the Java heap – effective memory analysis"), presented by two colleagues of mine. See also the link above.
You will learn that we have made quite a big step forward in analyzing memory consumption problems, compared to the tools that are on the market today.
Stay tuned, I will write here about the topics covered in this presentation in about two weeks.
Update: I just learned that there will also be a session at JavaOne (TS-21935). Unfortunately it seems to be fully booked already.
Brian Goetz claims that object allocation in Java is very fast and that it therefore doesn't make sense to preallocate or pool objects in Java. Brian is very well known in the Java community for his contributions to the concurrency framework, and therefore a lot of people will just believe what he has to say.
But in this case I have to disagree with him and I'm not alone.
Yes, object allocation in Java is very cheap; it's even cheaper than a malloc in C, which comes as no surprise, because years ago a "new" in Smalltalk was already faster than a malloc. But the allocation time is not the complete cost that you have to take into account.
Today's garbage collection algorithms are surprisingly efficient, and in a single-threaded test program you may never see huge delays caused by GC activity. But on a multithreaded server with hundreds of threads running, and several VMs running on more than one machine in a clustered environment, the situation is a different one.
First, with enough threads allocating objects at a high rate, more and more objects will be promoted to the so-called old space, which will eventually trigger a full GC, and that usually stops the world for several seconds. These full GCs may stop activity on the whole cluster in case someone does a broadcast to all cluster nodes and waits for the result. We also know that a full GC can have a non-linear effect on performance; see "Implications of Java Garbage Collection for Multi-User Applications".
You might also want to pool objects such as Strings because you have to optimize memory consumption. If you read a String from the DB more than once, but this String is really some constant or unique identifier, you really don't want to hold duplicates of it in memory.
So please be careful if "someone" tells you that you don't need to care much about how many objects you allocate in your Java application.
Update: Kirk has posted a very well written comment on his blog here. I fully agree with his points.
This is my first post this year. Yes, I'm late, but I moved to new responsibilities in a central NetWeaver performance team. That's the main reason why I didn't have the time to write new blogs.
Anyway, I promise you that 2007 will be an exciting year for all of you who care about the performance of your Java applications on the NetWeaver stack. We will ship some features in the very near future that will support you in finding performance problems much more easily. IMHO at least some of these features are better than anything that is on the market right now. Sorry, I can't tell you any details yet (unless you work for SAP), but please stay tuned.
It should give you higher quality (or at least different ;) ) results for searches about NetWeaver and performance than standard Google, because it searches through a lot of resources on the web that I found to be useful. It's a nice little experiment at least.
This time I will talk about a very popular pattern that can lead to a big increase in memory consumption. You will also learn the rules for computing how much memory a Java object needs.
The pattern is called "Dynamic Properties" and is described here by Martin Fowler.
The key thing about fixed properties is that you fix them at design time, and all instances at run time must follow that decision. For some problems this is an awkward restriction. Imagine we are building a sophisticated contact system. There are some things that are fixed: home address, home and work phone, email. But there are all sorts of little variations. For someone you need to record their parent's address; another has day work and evening work numbers. It's hard to predict all these things in advance, and each time you change the system you have to go through compiling, testing, and distribution. To deal with this you need to use dynamic properties.
Fixed properties are essentially the same as Java Beans. A popular variant of "Dynamic Properties" is "Flexible Dynamic Properties":
Provide an attribute parameterized with a string. To declare a property just use the string.
So let's check what the memory consumption overhead for implementing the "Flexible Dynamic Properties" pattern can be.
The general rules for computing the size of an object on the SUN/SAP VM are:
On a 32-bit VM:
Arrays of boolean, byte, char, short, int: 2 * 4 (object header) + 4 (length field) + sizeof(primitiveType) * length -> align result up to a multiple of 8
Arrays of objects: 2 * 4 (object header) + 4 (length field) + 4 * length -> align result up to a multiple of 8
Arrays of longs and doubles: 2 * 4 (object header) + 4 (length field) + 4 (dead space due to alignment restrictions) + 8 * length
java.lang.Object: 2 * 4 (object header)
Other objects: sizeofSuperClass + 8 * nrOfLongAndDoubleFields + 4 * nrOfIntFloatAndObjectFields + 2 * nrOfShortAndCharFields + 1 * nrOfByteAndBooleanFields -> align result up to a multiple of 8
On a 64-bit VM:
Arrays of boolean, byte, char, short, int: 2 * 8 (object header) + 4 (length field) + sizeof(primitiveType) * length -> align result up to a multiple of 8
Arrays of objects: 2 * 8 (object header) + 4 (length field) + 4 (dead space due to alignment restrictions) + 8 * length
Arrays of longs and doubles: 2 * 8 (object header) + 4 (length field) + 4 (dead space due to alignment restrictions) + 8 * length
java.lang.Object: 2 * 8 (object header)
Other objects: sizeofSuperClass + 8 * nrOfLongDoubleAndObjectFields + 4 * nrOfIntAndFloatFields + 2 * nrOfShortAndCharFields + 1 * nrOfByteAndBooleanFields -> align result up to a multiple of 8
Note that an object might have unused space due to alignment at every inheritance level (e.g. imagine a class A with just a byte field, and a class B that has A as its superclass and declares a byte field itself -> 14 bytes 'wasted' on a 64-bit system).
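As a sanity check, the 32-bit rules above are easy to encode in a few lines. This is just my own sketch of the arithmetic (the class is hypothetical; the numbers come straight from the rules):

```java
public class ObjectSize32 {
    // helper: round up to the next multiple of 8
    static int align8(int bytes) { return (bytes + 7) & ~7; }

    // 32-bit rule for plain objects: 8 bytes header plus the field sizes, aligned to 8
    static int instanceSize(int longDoubleFields, int intFloatObjectFields,
                            int shortCharFields, int byteBooleanFields) {
        return align8(8 + 8 * longDoubleFields + 4 * intFloatObjectFields
                        + 2 * shortCharFields + 1 * byteBooleanFields);
    }

    // 32-bit rule for primitive arrays: 8 bytes header + 4 bytes length field + elements
    static int primitiveArraySize(int bytesPerElement, int length) {
        return align8(8 + 4 + bytesPerElement * length);
    }

    public static void main(String[] args) {
        // a class with two ints and a byte: 8 + 8 + 1 = 17 -> aligned up to 24 bytes
        System.out.println(instanceSize(0, 2, 0, 1));
        // a char[10]: 8 + 4 + 2 * 10 = 32 bytes (already a multiple of 8)
        System.out.println(primitiveArraySize(2, 10));
    }
}
```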
Thanks to my colleague Ralf Sch. for letting me know these rules.
Assume a simple implementation of the "Flexible Dynamic Properties" pattern using a HashMap, where each property is stored as a key/value pair. We want to compute how much memory a simple object with nothing but 4 properties needs.
An empty HashMap (on JDK 1.4, SUN, Intel 32 bit) consumes 120 bytes. Each HashMap entry has 24 bytes of overhead. So with 4 properties the object needs 8 + 120 + 4 * 24 = 224 bytes, when keys and values are not counted. If you used fixed properties instead, the same object would only cost 8 + 4 * 4 = 24 bytes.
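For illustration, such an implementation can look like this (a minimal sketch with class and method names of my own choosing):

```java
import java.util.HashMap;
import java.util.Map;

// Minimal "Flexible Dynamic Properties": every instance carries its own HashMap,
// so each object pays the empty-map cost plus the per-entry overhead discussed above.
class DynamicContact {
    private final Map<String, Object> properties = new HashMap<>();

    public void set(String name, Object value) { properties.put(name, value); }
    public Object get(String name) { return properties.get(name); }
}
```

A fixed-property version would instead declare four plain fields and pay only 4 bytes per reference.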
Note that you almost always want at least the keys to be shared. The simple reason is that usually the number of distinct keys is small and independent of the number of objects that have properties. Keys of HashMaps are not allowed to be changed anyway, and since Strings are immutable, changing them is tricky in any case. So there's really no good reason not to share the key Strings.
You will not really scale when you don't share the keys, which can easily happen when you read the keys from the database and don't check whether you have already seen the key (using a HashMap, for example).
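A simple way to make sure keys read from the database are shared is a canonicalizing map; this is just a sketch of the idea (the class name is mine), a hand-rolled alternative to String.intern():

```java
import java.util.HashMap;
import java.util.Map;

// Returns one shared String instance per distinct key value.
class KeyPool {
    private final Map<String, String> pool = new HashMap<>();

    String canonical(String key) {
        String shared = pool.get(key);
        if (shared == null) {   // first time we see this key: remember this instance
            pool.put(key, key);
            shared = key;
        }
        return shared;
    }
}
```

Run every key through canonical() right after reading it from the DB, and all duplicate instances become eligible for garbage collection.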
Duplicating Strings not only wastes memory, it may also make access to HashMaps slower. Take a look at the sources for String.equals():
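The relevant logic boils down to the following standalone re-creation (the real JDK method works directly on String's internal char[] rather than the public API used here):

```java
public class StringEqualsDemo {
    // Same comparison order as String.equals(): identity first, then length, then char by char.
    static boolean stringEquals(String s, Object other) {
        if (s == other) return true;               // identity check: fast path for shared Strings
        if (!(other instanceof String)) return false;
        String t = (String) other;
        if (s.length() != t.length()) return false;
        for (int i = 0; i < s.length(); i++) {     // the slow path that duplicated keys always take
            if (s.charAt(i) != t.charAt(i)) return false;
        }
        return true;
    }
}
```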
equals() first compares for identity! So in the case of "Flexible Dynamic Properties", if you don't share the keys you also slow down access to your properties: if the Strings to be compared are equal but not identical, the code in the JDK will compare them character by character. Depending on the length of the String, this can be 5 or more times slower.
This is kind of a reply to the entry The First Law of Optimization by Valery Silaev, who obviously likes to read my blog. I would also like to explain in more detail why I think knowing the String API is pretty important.
"Empty String in JVM occupies 40 bytes in memory? Wow, that's cool to know!"
Sure, this is important to know. My experience is that a lot of developers do not know what the overhead of String is. This is not just a feeling; it's supported by a lot of data taken from heap dumps and of course by speaking to the developers.
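For the curious, the 40 bytes follow directly from the sizing rules in my earlier blog (JDK 1.4, 32 bit; a String holds a char[] reference plus the int fields offset, count and hash):

```java
public class EmptyStringSize {
    // round up to the next multiple of 8, as the VM does
    static int align8(int bytes) { return (bytes + 7) & ~7; }

    public static void main(String[] args) {
        // String object: 8 (header) + 4 (char[] ref) + 4 (offset) + 4 (count) + 4 (hash)
        int stringObject = align8(8 + 4 + 4 + 4 + 4);      // 24 bytes
        // its empty char[]: 8 (header) + 4 (length field) + 0 elements, aligned up
        int emptyCharArray = align8(8 + 4);                // 16 bytes
        System.out.println(stringObject + emptyCharArray); // 40 bytes total
    }
}
```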
This has nothing to do with premature optimization, which, as all of us know who read Knuth every evening as a bedtime story for our children, "is the root of all evil (or at least most of it) in programming."
It's about getting the basic things right, and the fact is that the class String tends to be overused and pretty often is used incorrectly.
"The Expensive Compiler Story"
Of course I'm aware that there are certain situations in which the compiler optimizes away the overhead of "+", and I was simplifying because I didn't want to create a big blog entry with 154 lines.
But honestly, if you tell developers to use StringBuffer on JDK 1.4 and StringBuilder on JDK 1.5, and they have to maintain code for both JDKs, what do you think most of them will do? I guess they will just go ahead with StringBuffer, at least in code that is unlikely to be performance critical. As a matter of fact, on a recent JVM the cost of uncontended synchronization is almost always practically zero.
Telling a developer "just test everything by yourself" isn't really practical or efficient, because it's just duplicated work. Of course you want to profile/measure your complete application to be sure that you don't have any unexpected bottlenecks.
Is String.intern() good or evil? String.intern() should rather not be used, at least on SUN VMs up to JDK 1.5. But maybe a JVM from a large software company in Walldorf will improve the situation in the near future ;) By the way, using "StringA == StringB" rather than "StringA.equals(StringB)" will usually not make a difference, because the first statement in String.equals does "StringA == StringB" anyway.
Yes, reflection performance improved a lot on most VMs. I'm not sure whether it's fast on all VMs. One problem with reflection, at least on SUN VMs, is that classes are generated on the fly, and they occupy perm space. Ever wondered about messages like this in your std_server0.out?
[Unloading class sun.reflect.GeneratedSerializationConstructorAccessor51]
[Unloading class sun.reflect.GeneratedSerializationConstructorAccessor143]
[Unloading class sun.reflect.GeneratedSerializationConstructorAccessor93]
[Unloading class sun.reflect.GeneratedSerializationConstructorAccessor211]
[Unloading class sun.reflect.GeneratedSerializationConstructorAccessor207]
[Unloading class sun.reflect.GeneratedSerializationConstructorAccessor47]
These messages are caused by the unloading (due to a full GC) of the automatically generated classes for reflection. As far as I remember, there was a problem with a non-SUN VM where this would cause very long GC pauses (minutes), because the VM was not optimized for this kind of stuff. And although reflection got pretty fast in the meantime, it can still be far slower than a simple call, because it defeats inlining.
"Do not optimize object construction"
I do not fully agree with this statement. If it means "don't care about how many objects you create", then I disagree. Sure, object creation in Java is freaking fast, almost always faster than a malloc in C, and the cleanup through the minor GC is very efficient. But you must be careful about how long your objects live, because if they live too long they will be promoted to old space and can only be reclaimed by a full GC, which is costly.
Predicting how long an object lives can be very difficult in a complex J2EE environment. It also doesn't make much sense to create lots of duplicates of objects which are really constants. More about this in a later blog. For more information about garbage collection in Java (1.4), check this link.
@Valery Silaev: I like to read your blog too :) In fact I agree in principle with most of your statements.
Some comments about my last blog: I'm speaking about JDK 1.4.2 here, because this is what is used today by NetWeaver 04(s). The rules for JDK 1.5 and JDK 1.6 are different, the general rule being that there's less sharing of char[]s in the newer JDKs. Finding out the exact rules for the newer JDKs is left as an exercise for the reader ;)
Another important method of String is substring(). If you call substring() on an existing String, a new String object is created (Strings are immutable) which shares the char[] of the existing String.
This is in general a good thing, because it allows you to save some memory. If you have, for example, a file path like "/netweaver/is/great.txt", then you can have one String object for the whole path and construct another String object for the file name "great.txt" that shares the existing char[].
But you can also easily shoot yourself in the foot:
String fileName = "a picture with pretty long file name.jpg";
int length = fileName.length();
String fileType = fileName.substring(length - 3, length);
return fileType;
The intent of this "beautiful" code is to get the type of the file. The problem with this code is that the original String referenced by "fileName" will not be referenced anymore after the method returns the file type. You will end up with a returned String whose backing char[] still contains "a picture with pretty long file name.jpg": 40 characters, versus the 3 characters you really need for "jpg".
There are even people out there who think that this is a bug, but as I tried to explain, this is really a feature that can help you reduce memory consumption.
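If the long original String bothers you, the usual workaround on these JDK 1.4-era Strings is to copy the few characters you actually need, so the long original can be garbage collected; a small sketch:

```java
public class FileTypeDemo {
    // new String(...) copies just the needed characters instead of keeping
    // the whole 40-character backing array alive (JDK 1.4-era behavior).
    static String fileType(String fileName) {
        int length = fileName.length();
        return new String(fileName.substring(length - 3, length));
    }

    public static void main(String[] args) {
        System.out.println(fileType("a picture with pretty long file name.jpg")); // prints "jpg"
    }
}
```

The String(String) constructor trims the backing char[] down to the characters actually used, which is exactly what we want here.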
UPDATE: as Frank pointed out (see comments below), this is even more confusing than I thought in the first place. There are two similar constructors in String: public String(char[] value, int offset, int count) and String(int offset, int count, char[] value). The latter is called by substring().