Analyze Metaspace with jcmd VM.metaspace
A post in the Metaspace series:
With JDK 11, we added a new diagnostic command to jcmd: VM.metaspace.
This command is very useful for analyzing Metaspace consumption. So let's dive in and use it to revisit our Little WildFly Server That Could from previous articles. We describe the command output and options, and how to use them to spot typical waste points.
VM.metaspace, introduced with JDK-8201572 - courtesy of SAP and Red Hat - is a new addition to the Swiss Army knife that is jcmd. Like the other diagnostic commands in that collection, you call it like this: jcmd <pid or process name> VM.metaspace
$ jcmd wildfly help VM.metaspace
17680:
VM.metaspace
Prints the statistics for the Metaspace
Impact: Medium: Depends on number of classes loaded.
Permission: java.lang.management.ManagementPermission(monitor)
Syntax : VM.metaspace [options]
Options: (options must be specified using the <key> or <key>=<value> syntax)
basic : [optional] Prints a basic summary (does not need a safepoint). (BOOLEAN, false)
show-loaders : [optional] Shows usage by class loader. (BOOLEAN, false)
show-classes : [optional] If show-loaders is set, shows loaded classes for each loader. (BOOLEAN, false)
by-chunktype : [optional] Break down numbers by chunk type. (BOOLEAN, false)
by-spacetype : [optional] Break down numbers by loader type. (BOOLEAN, false)
vslist : [optional] Shows details about the underlying virtual space. (BOOLEAN, false)
vsmap : [optional] Shows chunk composition of the underlying virtual spaces (BOOLEAN, false)
scale : [optional] Memory usage in which to scale. Valid values are: 1, KB, MB or GB (fixed scale) or "dynamic" for a dynamically choosen scale. (STRING, dynamic)
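The invocation pattern and the option syntax from the help text can also be driven from Java. The following is a minimal sketch, assuming jcmd is on the PATH; the class name JcmdMetaspace and its two helper methods are hypothetical and exist only for this illustration:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class JcmdMetaspace {

    // Builds: jcmd <pid or process name> VM.metaspace [options]
    static List<String> buildCommand(String pidOrName, String... options) {
        List<String> cmd = new ArrayList<>();
        cmd.add("jcmd");                 // assumption: jcmd is on the PATH
        cmd.add(pidOrName);
        cmd.add("VM.metaspace");
        for (String option : options) {
            cmd.add(option);             // <key> or <key>=<value>, as in the help text
        }
        return cmd;
    }

    // Runs the command and returns its combined stdout/stderr.
    static String run(List<String> cmd) throws IOException, InterruptedException {
        Process p = new ProcessBuilder(cmd).redirectErrorStream(true).start();
        String out = new String(p.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
        p.waitFor();
        return out;
    }

    public static void main(String[] args) throws Exception {
        // Default target is this JVM itself; pass a pid or main class name to override.
        String target = args.length > 0
                ? args[0]
                : Long.toString(ProcessHandle.current().pid());
        System.out.println(run(buildCommand(target, "show-loaders", "scale=MB")));
    }
}
```

Boolean options such as show-loaders can be passed as a bare key, while scale takes a value, following the syntax line in the help output.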
Basic information
Used without parameters, the command prints a brief standard summary.
Example: again our freshly started WildFly 16.0.0 standalone instance, running on SapMachine 11, with no running apps:
$ jcmd wildfly VM.metaspace
31997:
Total Usage ( 1041 loaders):
Non-Class: 2837 chunks, 58,62 MB capacity, 53,54 MB ( 91%) used, 4,90 MB ( 8%) free, 2,59 KB ( <1%) waste, 177,31 KB ( <1%) overhead, deallocated: 5065 blocks with 1,01 MB
Class: 1653 chunks, 9,93 MB capacity, 7,44 MB ( 75%) used, 2,40 MB ( 24%) free, 208 bytes ( <1%) waste, 103,31 KB ( 1%) overhead, deallocated: 653 blocks with 285,77 KB
Both: 4490 chunks, 68,55 MB capacity, 60,98 MB ( 89%) used, 7,29 MB ( 11%) free, 2,79 KB ( <1%) waste, 280,62 KB ( <1%) overhead, deallocated: 5718 blocks with 1,29 MB
Virtual space:
Non-class space: 60,00 MB reserved, 58,75 MB ( 98%) committed
Class space: 248,00 MB reserved, 10,00 MB ( 4%) committed
Both: 308,00 MB reserved, 68,75 MB ( 22%) committed
Chunk freelists:
Non-Class:
specialized chunks: 1, capacity 1,00 KB
small chunks: 11, capacity 44,00 KB
medium chunks: (none)
humongous chunks: (none)
Total: 12, capacity=45,00 KB
Class:
specialized chunks: (none)
small chunks: 2, capacity 4,00 KB
medium chunks: (none)
humongous chunks: (none)
Total: 2, capacity=4,00 KB
Waste (percentages refer to total committed size 68,75 MB):
Committed unused: 156,00 KB ( <1%)
Waste in chunks in use: 2,79 KB ( <1%)
Free in chunks in use: 7,29 MB ( 11%)
Overhead in chunks in use: 280,62 KB ( <1%)
In free chunks: 49,00 KB ( <1%)
Deallocated from chunks in use: 1,29 MB ( 2%) (5718 blocks)
-total-: 9,06 MB ( 13%)
MaxMetaspaceSize: 256,00 MB
InitialBootClassLoaderMetaspaceSize: 4,00 MB
UseCompressedClassPointers: true
CompressedClassSpaceSize: 248,00 MB
That is a lot of information. Let's walk through this, shall we?
In case you are very busy or easily bored, feel free to skip ahead to the tl;dr section where I summarize the most important aspects of this output. You will miss out on a lot of context though.
In-Use-Chunks Section
The first section shows us information about the chunks in use by active class loaders:
Total Usage ( 1041 loaders):
Non-Class: 2837 chunks, 58,62 MB capacity, 53,54 MB ( 91%) used, 4,90 MB ( 8%) free, 2,59 KB ( <1%) waste, 177,31 KB ( <1%) overhead, deallocated: 5065 blocks with 1,01 MB
Class: 1653 chunks, 9,93 MB capacity, 7,44 MB ( 75%) used, 2,40 MB ( 24%) free, 208 bytes ( <1%) waste, 103,31 KB ( 1%) overhead, deallocated: 653 blocks with 285,77 KB
Both: 4490 chunks, 68,55 MB capacity, 60,98 MB ( 89%) used, 7,29 MB ( 11%) free, 2,79 KB ( <1%) waste, 280,62 KB ( <1%) overhead, deallocated: 5718 blocks with 1,29 MB
1041 loaders are alive in total.
Non-Class: and Class: - these lines list chunk usage for non-class space and class space. Let's ignore them for now and instead check out the next line:
Both: - summarizes chunk usage for both spaces, and hence for the whole VM. Here we see that our 1041 loaders together use 4490 chunks with a total “capacity” of 68.55 MB.
Capacity is the sum of all space handed out to class loaders. That memory is bound to one class loader, but not necessarily space used for metadata, since - remember? - we give the class loader more than it needs. The next numbers shed more light on the difference:
60,98 MB ( 89%) used, 7,29 MB ( 11%) free, 2,79 KB ( <1%) waste, 280,62 KB ( <1%) overhead
To understand these numbers, remember that a class loader (ClassLoaderMetaspace) holds a list of in-use chunks. It has one current chunk, which is used to satisfy future allocations, and any number of “retired” chunks, which are (almost) fully used:
capacity = used + free + waste + overhead.
In our WildFly example, of the 68.55 MB Metaspace given to class loaders (“capacity”) only 60,98 MB (89%) are actually used (“used”). The rest is divided among:
- “free”: Unused space in the current chunk is called “free”. This space could still be used should this loader happen to load more classes. However, if the loader finished loading, this space is wasted.
- “waste”: Unused space in non-current chunks is called “waste”: When the current chunk is not large enough to satisfy a memory request, a new chunk is allocated and the current chunk is “retired”. The remaining space in the retired chunk is waste. However, the JVM goes to some pains to reuse that memory, therefore this number should be very small.
- “overhead”: Chunks have headers. These headers incur a certain overhead. It is usually very small.
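As a quick sanity check of the formula capacity = used + free + waste + overhead, the numbers from the Both: line can simply be added up. This is a small sketch; the class CapacityCheck is hypothetical, and it assumes that the output uses 1 MB = 1024 KB:

```java
import java.util.Locale;

final class CapacityCheck {
    static final double KB_PER_MB = 1024.0; // assumption: binary units in the jcmd output

    // capacity = used + free + waste + overhead, returned in MB
    static double capacityMb(double usedMb, double freeMb, double wasteKb, double overheadKb) {
        return usedMb + freeMb + (wasteKb + overheadKb) / KB_PER_MB;
    }

    public static void main(String[] args) {
        // Values taken from the "Both:" line of the WildFly example.
        double capacity = capacityMb(60.98, 7.29, 2.79, 280.62);
        System.out.println(String.format(Locale.ROOT, "capacity = %.2f MB", capacity));
    }
}
```

The sum comes out at roughly 68.55 MB, which matches the reported capacity up to rounding of the individual values.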
Furthermore we have this:
deallocated: 5718 blocks with 1,29 MB
This is Metaspace given back to the VM prematurely - before the allocating loader was unloaded. This happens rarely. It may happen when a class is redefined and parts of its old metadata are now obsolete. It may also happen when the VM encounters problems during class loading and stops loading the class, but had already allocated Metaspace for parts of it.
The VM attempts to salvage those deallocated blocks, but with limited enthusiasm. Which is perfectly fine since those cases should be quite rare and the waste caused by them small.
Virtual Space Section
The next section lists the sum of virtual spaces in use by the VM for Metaspace purposes:
Virtual space:
Non-class space: 60,00 MB reserved, 58,75 MB ( 98%) committed
Class space: 248,00 MB reserved, 10,00 MB ( 4%) committed
Both: 308,00 MB reserved, 68,75 MB ( 22%) committed
This is interesting since this is “the truth”, aka “what the OS sees”. reserved is the total memory which has been reserved for Metaspace from the OS, and committed is the portion of it which has been committed. These numbers include the space actually used for class metadata and all types of waste which have accrued.
It is normal that committed is larger than the capacity of in-use chunks, since it also contains free chunks kept in the freelists and the HWM margin - space proactively committed but not yet carved into Metachunks:
committed memory size = capacity-of-chunks-in-use + capacity-of-chunks-in-freelist + HWM margin.
More details can be found in Part 2 and Part 3 of this series.
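The committed-size formula can be checked against the WildFly numbers by solving it for the HWM margin. The sketch below uses a hypothetical class CommittedCheck and assumes 1 MB = 1024 KB; the inputs are the committed size and in-use capacity from the Both: lines and the freelist total (45 KB + 4 KB):

```java
import java.util.Locale;

final class CommittedCheck {
    // HWM margin = committed - capacity of chunks in use - capacity of chunks in freelists
    static double hwmMarginKb(double committedMb, double inUseCapacityMb, double freelistKb) {
        return (committedMb - inUseCapacityMb) * 1024.0 - freelistKb;
    }

    public static void main(String[] args) {
        // 68,75 MB committed, 68,55 MB capacity in use, 49 KB in the freelists.
        double margin = hwmMarginKb(68.75, 68.55, 49.0);
        System.out.println(String.format(Locale.ROOT, "HWM margin is about %.0f KB", margin));
    }
}
```

The result is about 156 KB, which lines up with the "Committed unused" entry of the Waste section. Because the inputs are rounded to two decimals, the match is only accurate to a few KB.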
Side Note: See how the committed size of non-class space closely tails its reserved size. This is because in non-class space we have a list of memory mappings and add to them as needed, so the size difference between reserved and committed cannot be larger than the size of one region (usually 2MB). For the class space, by contrast, we reserve the whole space (CompressedClassSpaceSize) upfront, and that shows in the reserved line for the Class space.
Chunk Freelist Section
This section shows how many chunks are waiting in the freelists to be reused:
Chunk freelists:
Non-Class:
specialized chunks: 1, capacity 1,00 KB
small chunks: 11, capacity 44,00 KB
medium chunks: (none)
humongous chunks: (none)
Total: 12, capacity=45,00 KB
Class:
specialized chunks: (none)
small chunks: 2, capacity 4,00 KB
medium chunks: (none)
humongous chunks: (none)
Total: 2, capacity=4,00 KB
This can be a significant part of Metaspace if we have high fragmentation (many class loaders alive in parallel, with a part of them dead and collected). In our case it looks completely harmless since the WildFly server did not unload any classes yet.
Waste Section
Arguably this is the most useful section in the whole output.
While developing the VM.metaspace command we wanted to be able to spot the most common problems at a glance. So the “Waste” section lists the various waste points:
Waste (percentages refer to total committed size 68,75 MB):
Committed unused: 156,00 KB ( <1%)
Waste in chunks in use: 2,79 KB ( <1%)
Free in chunks in use: 7,29 MB ( 11%)
Overhead in chunks in use: 280,62 KB ( <1%)
In free chunks: 49,00 KB ( <1%)
Deallocated from chunks in use: 1,29 MB ( 2%) (5718 blocks)
-total-: 9,06 MB ( 13%)
We usually have only two important parts:
- Free in chunks in use: This is the space which has already been allotted to class loaders but which remains unused. Note that strictly speaking this is not “waste” - the loaders could, in theory, continue loading classes and would then use this space. But if no more classes are loaded, this memory is indeed wasted.
See how this is the only point that matters for our little tame WildFly example: this waste point amounts to 7.29MB, about 11% of the total committed Metaspace size.
- In free chunks: Sum of all chunks in the free lists. As explained above, this can grow when class loaders die and there is a lot of fragmentation.
The other waste items are typically less relevant:
Committed unused: That is space which has already been committed from the current VirtualSpaceNode but not yet carved up into chunks and handed out to a loader. Should normally be quite small.
Waste in chunks in use: Sum of the “waste” numbers in the Chunks-in-use section. Should be very small.
Overhead in chunks in use: Sum of the “overhead” numbers in the Chunks-in-use section. Should be very small.
Deallocated from chunks in use: Sum of the “deallocated” numbers in the Chunks-in-use section. Should be very small. If it is not, this may mean rampant class redefinition or a lot of failed class loading.
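For monitoring scripts it can be handy to read the Waste section programmatically. The following is a sketch only: the class WasteSection and its regular expression are hypothetical, not part of any JDK API. It assumes the line layout shown above, a comma or dot as decimal separator, and 1 MB = 1024 KB. It parses each item into KB and compares the sum of the items with the reported -total- line:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class WasteSection {
    static final String EXAMPLE = String.join("\n",
        "Waste (percentages refer to total committed size 68,75 MB):",
        "Committed unused: 156,00 KB ( <1%)",
        "Waste in chunks in use: 2,79 KB ( <1%)",
        "Free in chunks in use: 7,29 MB ( 11%)",
        "Overhead in chunks in use: 280,62 KB ( <1%)",
        "In free chunks: 49,00 KB ( <1%)",
        "Deallocated from chunks in use: 1,29 MB ( 2%) (5718 blocks)",
        "-total-: 9,06 MB ( 13%)");

    // "<label>: <number> <unit>", e.g. "In free chunks: 49,00 KB"
    private static final Pattern ITEM =
        Pattern.compile("^\\s*([^:]+):\\s+(\\d+(?:[.,]\\d+)?)\\s+(bytes|KB|MB|GB)\\b");

    // Returns label -> size in KB, in the order the lines appear.
    static Map<String, Double> parseKb(String section) {
        Map<String, Double> items = new LinkedHashMap<>();
        for (String line : section.split("\\R")) {
            Matcher m = ITEM.matcher(line);
            if (!m.find()) continue;              // skips the header line
            double value = Double.parseDouble(m.group(2).replace(',', '.'));
            switch (m.group(3)) {
                case "bytes": value /= 1024; break;
                case "MB":    value *= 1024; break;
                case "GB":    value *= 1024 * 1024; break;
                default:      break;              // already in KB
            }
            items.put(m.group(1).trim(), value);
        }
        return items;
    }

    // Sums every item except the "-total-" line.
    static double sumKb(Map<String, Double> items) {
        return items.entrySet().stream()
            .filter(e -> !e.getKey().equals("-total-"))
            .mapToDouble(Map.Entry::getValue)
            .sum();
    }

    public static void main(String[] args) {
        Map<String, Double> items = parseKb(EXAMPLE);
        System.out.println("sum of items:   " + sumKb(items) / 1024 + " MB");
        System.out.println("reported total: " + items.get("-total-") / 1024 + " MB");
    }
}
```

For the WildFly example the six items add up to roughly 9.06 MB, in agreement with the -total- line; small differences come from the rounding in the printed values.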
A pathological case
So far this has not been very exciting since our WildFly server is purring smoothly like a well-behaved cat. Not much to see here in terms of memory waste. So let's look at a real pathological case:
InterleavedLoaders is a little example which demonstrates how the VM holds onto Metaspace memory even after the class loaders have been collected, if the dead loaders are interleaved with live loaders in Metaspace.
It will create a number of class loaders and use them to load classes. These loaders are sorted into four groups, or “generations”, and we will unload one generation after another until only one generation remains. Since they were created in an interleaved fashion, the remaining live loaders will prevent the space of the dead loaders from being returned to the OS, since, remember: Metaspace memory is only released to the OS if a whole VirtualSpaceNode (usually 2MB) happens to be free.
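Why interleaving is so harmful can be illustrated with a toy model. The sketch below is not the InterleavedLoaders program and not HotSpot's allocator; the class InterleavingModel is hypothetical. It assumes that chunks are laid out in address order, that every loader owns 8 chunks of 64 KB (an arbitrary choice), and that a 2 MB node can only be released when every chunk in it is free:

```java
import java.util.Arrays;

final class InterleavingModel {
    static final int CHUNK_KB = 64;                       // assumed chunk size
    static final int NODE_KB = 2 * 1024;                  // one VirtualSpaceNode, "usually 2MB"
    static final int CHUNKS_PER_NODE = NODE_KB / CHUNK_KB;

    // Lays out all chunks in address order; each entry is the generation (0-based)
    // of the loader owning that chunk.
    static int[] layout(int loaders, int generations, int chunksPerLoader, boolean interleaved) {
        int[] owner = new int[loaders * chunksPerLoader];
        int loadersPerGeneration = loaders / generations;
        for (int l = 0; l < loaders; l++) {
            int gen = interleaved ? l % generations : l / loadersPerGeneration;
            Arrays.fill(owner, l * chunksPerLoader, (l + 1) * chunksPerLoader, gen);
        }
        return owner;
    }

    // Counts nodes in which every chunk belongs to an unloaded generation
    // (generations 0 .. freedBelow-1), i.e. nodes that are completely free.
    static int releasableNodes(int[] owner, int freedBelow) {
        int releasable = 0;
        for (int start = 0; start < owner.length; start += CHUNKS_PER_NODE) {
            boolean allFree = true;
            int end = Math.min(start + CHUNKS_PER_NODE, owner.length);
            for (int i = start; i < end; i++) {
                if (owner[i] >= freedBelow) { allFree = false; break; }
            }
            if (allFree) releasable++;
        }
        return releasable;
    }

    public static void main(String[] args) {
        // 400 loaders in 4 generations, three generations unloaded.
        int[] mixed = layout(400, 4, 8, true);
        int[] grouped = layout(400, 4, 8, false);
        System.out.println("interleaved: " + releasableNodes(mixed, 3) + " of 100 nodes releasable");
        System.out.println("grouped:     " + releasableNodes(grouped, 3) + " of 100 nodes releasable");
    }
}
```

In this model, every node of the interleaved layout still contains chunks of the surviving generation, so no node can be released even though three quarters of the chunks are free. With the loaders grouped by generation, 75 of the 100 nodes would be entirely free.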
Let's start this test program and keep pressing keys until three of the four loader generations are unloaded:
$ java -cp ./repros/repros8/target/repros8-1.0.jar de.stuefe.repros.metaspace.InterleavedLoaders
Generating 100 classes...
Will load 4 generations of 100 loaders each, each loader loading 100 classes...
<press key>
After loading...
<press key>
Before freeing generation 1...
<press key>
After freeing generation 1.
<press key>
Before freeing generation 2...
<press key>
After freeing generation 2.
<press key>
Before freeing generation 3...
<press key>
After freeing generation 3.
<press key>
Now, let's take a look with jcmd (for brevity I only show the Waste section, full output here).
$ jcmd de.stuefe.repros.metaspace.InterleavedLoaders VM.metaspace
6918:
<cut>
Waste (percentages refer to total committed size 404,82 MB):
Committed unused: 116,00 KB ( <1%)
Waste in chunks in use: 2,95 KB ( <1%)
Free in chunks in use: 6,41 MB ( 2%)
Overhead in chunks in use: 219,69 KB ( <1%)
In free chunks: 275,21 MB ( 68%)
Deallocated from chunks in use: 1,29 MB ( <1%) (2227 blocks)
-total-: 283,24 MB ( 70%)
Interesting, yes? We see that of the committed size of roughly 400 MB, 275 MB - or 68% - is unused and retained in free lists. This shows clearly how Metaspace fragmentation can hurt - this memory is lost to the OS, and as long as the VM does not load new classes it stays committed but unused.
To confirm, let's look at the Freelist section:
Chunk freelists:
Non-Class:
specialized chunks: 1, capacity 1,00 KB
small chunks: 1147, capacity 4,48 MB
medium chunks: 3844, capacity 240,25 MB
humongous chunks: (none)
Total: 4992, capacity=244,73 MB
Class:
specialized chunks: (none)
small chunks: 1190, capacity 2,32 MB
medium chunks: 901, capacity 28,16 MB
humongous chunks: (none)
Total: 2091, capacity=30,48 MB
There we go. All the memory is sitting in the free lists waiting to be reused but not returned to the OS.
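The freelist numbers can be cross-checked against the "In free chunks" entry of the Waste section. The sketch below uses a hypothetical class FreelistCheck. The per-chunk sizes are not stated in the output; they are inferred here by dividing the capacities by the chunk counts (for example 3844 medium chunks with 240,25 MB gives 64 KB each), and the 1 KB specialized size for class space is an assumption, since none are listed:

```java
import java.util.Locale;

final class FreelistCheck {
    // Chunk sizes in KB, inferred from counts and capacities in the output above.
    static final int SPECIALIZED_KB = 1;
    static final int SMALL_KB = 4, MEDIUM_KB = 64;              // non-class space
    static final int CLASS_SMALL_KB = 2, CLASS_MEDIUM_KB = 32;  // class space

    static double nonClassMb(int specialized, int small, int medium) {
        return (specialized * SPECIALIZED_KB + small * SMALL_KB + medium * MEDIUM_KB) / 1024.0;
    }

    static double classMb(int specialized, int small, int medium) {
        return (specialized * SPECIALIZED_KB + small * CLASS_SMALL_KB
                + medium * CLASS_MEDIUM_KB) / 1024.0;
    }

    public static void main(String[] args) {
        double nonClass = nonClassMb(1, 1147, 3844);   // counts from the Non-Class freelist
        double clazz = classMb(0, 1190, 901);          // counts from the Class freelist
        System.out.println(String.format(Locale.ROOT,
            "non-class %.2f MB + class %.2f MB = %.2f MB", nonClass, clazz, nonClass + clazz));
    }
}
```

The two totals reproduce the 244,73 MB and 30,48 MB shown in the freelist output, and their sum of 275,21 MB is exactly the "In free chunks" value.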
Note: I am currently working on a prototype to reduce waste and memory footprint in Metaspace and to return memory more eagerly to the OS. For details, please see JDK-8221173.
tl;dr
If you did not read anything else, take away this:
- jcmd <pid> VM.metaspace displays basic Metaspace statistics.
- The Virtual Space section displays reserved and committed space used for all Metaspace purposes. This is what is used, in total, for Metaspace. It includes both space used for class metadata and overhead/waste.
- The Waste section lists all types of overhead/waste which can occur. Prominent waste points can be: Free in chunks in use - chunks only partially used by their loaders - and In free chunks - chunks parked in free lists for reuse.