http://www.dynatrace.com/en/javabook/the-three-jvms.html
Not All JVMs Are Created Equal
Chapter: Memory Management
Many developers know only a single JVM (Java Virtual Machine), the
Oracle HotSpot JVM (formerly Sun JVM), and speak of garbage collection
in general when they are referring to Oracle's HotSpot implementation
specifically. It may seem as though there is an industry default, but
such is not the case! In fact, the two most popular application servers,
IBM WebSphere and Oracle WebLogic, each come with their own JVM. In
this section, we will examine some of the enterprise-level
garbage-collection specifics of the three most prominent JVMs: the
Oracle HotSpot JVM, the IBM WebSphere JVM, and the Oracle JRockit JVM.
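Before comparing the three JVMs, it can help to check which JVM and which collectors an application is actually running on. The following is a minimal sketch using the standard java.lang.management API; the class name JvmGcInfo is made up for illustration, and the collector names it prints vary by JVM vendor, version, and configuration.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class: reports the running JVM and its collectors.
final class JvmGcInfo {

    /** Returns the names of the garbage collectors the running JVM exposes. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Standard system properties identifying the JVM implementation.
        System.out.println("JVM:    " + System.getProperty("java.vm.name"));
        System.out.println("Vendor: " + System.getProperty("java.vm.vendor"));

        // One MXBean per collector; counts and times are cumulative since JVM start.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
    }
}
```

Running the same class on different JVMs, or on the same JVM with different GC options, typically shows different collector names, which is a quick way to confirm that "garbage collection" is not one single implementation.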
Oracle HotSpot JVM (formerly known as the Sun JVM)
The Oracle HotSpot JVM uses a generational garbage-collection scheme exclusively (see Figure 2.8). (We'll discuss Oracle's plans to implement a G1, Garbage First, collector below.)
Objects are allocated in the Eden space, which is always considerably larger than the survivor spaces (by default, each survivor space is one-eighth the size of Eden, but this can be configured). The copy GC algorithm is executed either single-threaded or in parallel. It always copies surviving objects from Eden and the currently used survivor space into the second (currently unused) survivor space. Copying an object is, of course, more expensive than simply marking it, which is why the Eden space is the biggest of the three young-generation spaces. The vast majority of objects die in their infancy. A bigger Eden makes it more likely that these objects are already dead at the first GC cycle and thus are never copied at all. Once an object has survived multiple GC cycles (how many can be configured), it is tenured to the old generation (also referred to as tenured space).
It is, of course, entirely possible for objects to tenure prematurely. The size and ratio of the areas have a big influence on allocation speed, GC efficiency, and GC frequency, and the right values depend completely on the application behavior. The optimal solution can be found only by applying this knowledge and a lot of testing.
The old generation can be configured to use serial (default), parallel, or concurrent garbage collection. The parallel collector is also a compacting one, but the way it does the compaction can be configured with a wide variety of options. The concurrent collector (CMS, Concurrent Mark Sweep), on the other hand, does not compact at all. This can lead to allocation errors due to fragmentation (no large-enough blocks available). If this happens, the CMS triggers a full GC, which effectively uses the normal GC (and collects the young generation as well).
All these combinations lead to a variety of options and configuration possibilities and can make finding an optimal GC configuration seem quite complicated. In the Tuning Section later in this chapter, we will cover the most important optimization: determining the optimal size for the young and old generations. This optimization ensures that only long-lived objects get tenured, while avoiding too many young-generation GCs and thus too many suspensions.
The HotSpot JVM has an additional unique feature, called the permanent generation, to help make the garbage-collection process more efficient. Java maintains the application code and classes as objects within the heap. For the most part these objects are permanent and need not be considered for garbage collection. The HotSpot JVM therefore improves garbage-collection performance by placing class objects and constants into the permanent generation. It effectively ignores them during regular GC cycles.
For better or for worse, the proliferation of application servers, OSGi containers, and dynamically generated code has changed the game, and objects once considered permanent are not so permanent after all. To avoid out-of-memory errors, the permanent generation is garbage-collected, but only during a major, or full, GC.
In the Problem Pattern Section below, we will examine the issue of out-of-memory errors in the permanent generation, which wasn't designed to handle modern use cases. Today's application servers can load an amazing number of classes, often more than 100,000, which pretty much busts whatever default allocation has usually been set. We will also discuss the memory-leak problems caused by dynamic bytecode libraries.
Note: For further information, see the detailed description of Oracle HotSpot JVM memory management.
Garbage First (G1)
Oracle's Java 7 will implement G1 garbage collection (with backports to Java 6), using what is known as a garbage-first algorithm. The underlying principle is very simple, and it is expected to bring substantial performance improvements. Here's how it works.
The heap is divided into a number of fixed subareas. A list with references to objects in the area, called a remembered set, is kept for each subarea. Each thread then informs the GC if it changes a reference, which could cause a change in the remembered set. If a garbage collection is requested, then the areas containing the most garbage are swept first, hence garbage first. In the best case (likely the most common, as well), an area will contain no living objects and is simply defined as free, with no mark-and-sweep process and no compacting required. In addition, it is possible to define targets with the G1 collector, such as overhead or pause times. The collector then sweeps only as many areas as it can in the prescribed interval.
In this way, G1 combines the advantages of a generational GC with the flexibility of a continuous garbage collector. G1 also supports thread-local allocation and thus combines the advantages of all three garbage collection methods we've discussed, at least theoretically.
For instance, with a generational heap, finding the correct size for the young generation is often a source of problems; G1 is intended to obviate this sizing problem entirely. However, Oracle still considers G1 experimental and not yet ready for production use.
The IBM WebSphere JVM
As of Java 5, the IBM WebSphere JVM has added a generational GC configuration option to its classic mark-and-sweep algorithm. The default setup still uses a single big heap with either a parallel or a concurrent GC strategy. This is recommended for applications with a small heap size, not greater than 100 MB, but is not suitable for large or more-complex applications, which should use the generational heap.
The layout of the generational heap (see Figure 2.9) is slightly different from that of the Oracle HotSpot JVM. The WebSphere nursery is equivalent to the HotSpot young generation, and the WebSphere tenured space is equivalent to the HotSpot old generation. The nursery is divided into two parts of equal size: the allocate and survivor areas. Objects are always allocated in the nursery, and copy garbage collection is used to copy surviving objects to the survivor area. After a successful GC cycle, the former survivor area becomes the new allocate area.
The IBM WebSphere JVM omits the Eden space and does not treat infant objects specially. It does, however, differentiate between small and large objects. Large objects, usually more than 64 KB, are allocated in a specific area of the non-generational heap or directly in the tenured space of the generational heap. The rationale is simple: copying (generational GC) or moving (compacting) large objects is more expensive than considering them during the marking phase of a normal GC cycle.
Unlike the HotSpot JVM, the WebSphere JVM treats classes like any other object, placing them in the "normal" heap. There is no permanent generation, so classes are subjected to garbage collection every time. Under certain circumstances, when classes are repeatedly reloaded, this can lead to performance problems. Examples of this can be found in the Problem Pattern Section below.
Oracle JRockit
Oracle's WebLogic Application Server uses the Oracle JRockit JVM, which, like the IBM WebSphere JVM, can use a single continuous heap or generational GC. Unlike the two other JVMs we've discussed, JRockit does not use a copy garbage collection strategy within the nursery. It simply declares a block within the nursery (its size is configurable, and its placement changes after every GC) as a keep area (see Figure 2.10).
The keep area leads to the following object allocation and garbage collection semantics:
- Objects are first allocated anywhere outside the keep area. Once the nursery fills up, the keep area gets used as well.
- The keep area automatically contains the most-recently allocated objects once the GC is triggered.
- All live objects outside the keep area are promoted to the tenured space. Objects in the keep area are considered alive and left untouched.
- After the GC, the nursery is empty apart from objects within the former keep area. A new equally-sized memory block within the nursery is now declared as a keep area and the cycle starts again.
- An object is never copied more than once.
- Recently allocated objects are too young to tenure and most likely alive, and thus simply left untouched.
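The keep-area semantics listed above can be modeled in a few lines. This is only a toy simulation under stated assumptions, not JRockit's implementation: the nursery is modeled as a list in allocation order, the keep area as "the most recently allocated keepSize objects", and reachability as a simple live flag. The names KeepAreaNursery, Obj, capacity, and keepSize are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a nursery with a keep area (hypothetical names throughout).
final class KeepAreaNursery {

    /** Stand-in for a heap object; 'live' models whether it is still reachable. */
    static final class Obj {
        final String name;
        boolean live = true;
        Obj(String name) { this.name = name; }
    }

    private final int capacity;                           // max objects in the nursery
    private final int keepSize;                           // size of the keep area
    private final List<Obj> nursery = new ArrayList<>();  // kept in allocation order
    final List<Obj> tenured = new ArrayList<>();          // the old generation

    KeepAreaNursery(int capacity, int keepSize) {
        if (keepSize < 0 || keepSize >= capacity) {
            throw new IllegalArgumentException("keep area must be smaller than the nursery");
        }
        this.capacity = capacity;
        this.keepSize = keepSize;
    }

    /** Allocates in the nursery; a full nursery triggers a young collection first. */
    Obj allocate(String name) {
        if (nursery.size() == capacity) {
            collect();
        }
        Obj o = new Obj(name);
        nursery.add(o);
        return o;
    }

    /** Young collection: newest objects stay untouched, older live ones are promoted. */
    void collect() {
        int keepStart = Math.max(0, nursery.size() - keepSize);
        // The most recently allocated objects form the keep area and are left alone.
        List<Obj> kept = new ArrayList<>(nursery.subList(keepStart, nursery.size()));
        // Everything older is either promoted (if live) or discarded (if dead).
        for (Obj o : nursery.subList(0, keepStart)) {
            if (o.live) {
                tenured.add(o);
            }
        }
        nursery.clear();
        nursery.addAll(kept);
    }

    int nurserySize() { return nursery.size(); }

    public static void main(String[] args) {
        KeepAreaNursery heap = new KeepAreaNursery(4, 2);
        heap.allocate("a");
        Obj b = heap.allocate("b");
        heap.allocate("c");
        heap.allocate("d");
        b.live = false;          // "b" dies young
        heap.allocate("e");      // nursery is full, so this triggers a collection
        for (Obj o : heap.tenured) {
            System.out.println("tenured: " + o.name);
        }
        System.out.println("objects left in nursery: " + heap.nurserySize());
    }
}
```

In this model an object is moved at most once (from the nursery to the tenured list), and objects in the keep area get one more cycle to die before they are even looked at, which mirrors the two properties named in the list.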
While JRockit handles the young generation differently, the tenured space uses the same parallel or concurrent GC strategies as the two other JVMs.
The following are some important points that distinguish the JRockit JVM from others:
- Thread-local allocation (TLA) is active by default (we'll discuss it in the next section of this chapter) and is part of the nursery.
- JRockit distinguishes between small and large objects, with large objects allocated directly to the old generation.
- Classes are considered normal objects and are placed on the heap and subject to garbage collection (which is also true of the IBM WebSphere JVM).
"Special" Garbage Collection Strategies
There are some situations when standard garbage collection is not sufficient. We will examine the Remote Garbage Collector, which deals with distributed object references, and the Real Time Garbage Collector, which deals with real-time guarantees.
Remote Garbage Collector
With Remote Method Invocation (RMI), we can use a local Java object (a client-side stub) to represent another object residing on a different JVM (the server side). Obviously, RMI calls to the server-side object require that this object exists. Therefore, the server-side object must be treated as referenced by the client-side stub. Since the server's own garbage collector has no way of knowing about this reference, we need a remote garbage collection mechanism. Here's how it works:
- When a client receives a stub from a server, it acquires a lease for it. The server-side object is considered referenced by the client stub.
- A server-side object is kept alive by the RMI implementation itself until the lease expires, which is a simple timeout.
- Existing client-side stubs execute regular heartbeats (known informally as dirty calls) to renew their leases. (This is done automatically by the RMI implementation.)
- The server side checks periodically for expired leases.
- Once the lease expires (because no clients exist anymore to reference the object), the RMI implementation simply forgets the object. It can then be garbage-collected like any other object.
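The lease protocol described in the list above can be sketched as a small bookkeeping table. This is an illustrative model only, not the RMI runtime's code: the class LeaseTable, its method names, and the string object IDs are hypothetical, and time is passed in explicitly so the behavior is easy to follow.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Toy model of server-side lease bookkeeping (hypothetical names throughout).
final class LeaseTable {

    private final long leaseMillis;
    // Maps an exported object's ID to the time at which its lease expires.
    private final Map<String, Long> expiryByObjectId = new HashMap<>();

    LeaseTable(long leaseMillis) {
        this.leaseMillis = leaseMillis;
    }

    /** Called when a stub is handed out and on every heartbeat ("dirty call"). */
    void dirty(String objectId, long nowMillis) {
        expiryByObjectId.put(objectId, nowMillis + leaseMillis);
    }

    /** Periodic server-side check; forgets expired objects and returns how many. */
    int releaseExpired(long nowMillis) {
        int released = 0;
        Iterator<Map.Entry<String, Long>> it = expiryByObjectId.entrySet().iterator();
        while (it.hasNext()) {
            if (it.next().getValue() <= nowMillis) {
                it.remove();   // from now on, only ordinary local references keep it alive
                released++;
            }
        }
        return released;
    }

    /** True while some client lease still pins the object. */
    boolean isReferenced(String objectId) {
        return expiryByObjectId.containsKey(objectId);
    }

    public static void main(String[] args) {
        LeaseTable leases = new LeaseTable(600_000L);    // illustrative lease length
        leases.dirty("orderService", 0L);                // client receives the stub
        leases.dirty("orderService", 300_000L);          // heartbeat renews the lease
        System.out.println("released early: " + leases.releaseExpired(600_000L));
        System.out.println("released later: " + leases.releaseExpired(900_000L));
        System.out.println("still referenced: " + leases.isReferenced("orderService"));
    }
}
```

The point of the model is that the server never learns directly that a client dropped its stub; it only observes that heartbeats stopped and the lease ran out.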
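A client-side stub that is no longer used, but has not yet been garbage-collected on the client, keeps renewing its lease and thereby keeps the server-side object alive. To avoid this, the distributed garbage collector (RMI garbage collector) forces a major client garbage collection (with all the negative impact on performance) at regular intervals. This interval is controlled using the GCInterval system property (sun.rmi.dgc.client.gcInterval).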
The same setting exists for the server side (sun.rmi.dgc.server.gcInterval) and does the same thing. (Until Java 6, both settings defaulted to one minute. In Java 6, the server-side default changed to one hour.) With a one-minute interval, it effectively triggers a major garbage collection every minute, a performance nightmare. The setting makes sense in general on the client side (to allow the server to remove remote objects), but it's unclear why it exists on the server side. A server remote object is freed for garbage collection either when the lease is up or when the client explicitly cleans it. The explicit garbage collection has no impact on this, which is why I recommend setting this property as high as possible for the server.
I also recommend that RMI be restricted to stateless service interfaces. Since only one instance of such a server interface would exist, and it would never need to be garbage-collected (at least not while the application is running), we do not need remote garbage collection to remove it. If we restrict RMI in this way, we can also set the client-side interval very high and effectively remove the distributed garbage collector from our equation by negating its impact on application performance.
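As a sketch of how the interval might be raised: on Sun/Oracle JVMs the properties are sun.rmi.dgc.client.gcInterval and sun.rmi.dgc.server.gcInterval, with values in milliseconds. They are usually passed on the command line (for example -Dsun.rmi.dgc.server.gcInterval=86400000). The class below sets them programmatically instead; the class name and the one-day value are purely illustrative, and it is an assumption worth verifying for your JVM that the values are only picked up if they are set before the RMI runtime initializes.

```java
// Hypothetical helper: raises the RMI distributed-GC intervals.
final class RmiGcIntervalConfig {

    // Property names used by Sun/Oracle JVMs; values are in milliseconds.
    static final String CLIENT_INTERVAL = "sun.rmi.dgc.client.gcInterval";
    static final String SERVER_INTERVAL = "sun.rmi.dgc.server.gcInterval";

    /** Sets both intervals; call this before any RMI object is exported or looked up. */
    static void apply(long clientMillis, long serverMillis) {
        System.setProperty(CLIENT_INTERVAL, Long.toString(clientMillis));
        System.setProperty(SERVER_INTERVAL, Long.toString(serverMillis));
    }

    public static void main(String[] args) {
        long oneDayMillis = 24L * 60 * 60 * 1000;   // illustrative value, not a recommendation
        apply(oneDayMillis, oneDayMillis);
        System.out.println(CLIENT_INTERVAL + "=" + System.getProperty(CLIENT_INTERVAL));
        System.out.println(SERVER_INTERVAL + "=" + System.getProperty(SERVER_INTERVAL));
    }
}
```

Setting the properties with -D at JVM startup avoids the ordering question entirely and is usually the simpler choice.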
Real Time Garbage Collectors
Real-time systems guarantee nearly instantaneous (in the single-digit-milliseconds range) execution speed for every request being processed. However, the precious time used for garbage-collection runtime suspensions can pose problems, especially since the frequency and duration of GC execution are inherently unpredictable. We can optimize for low pause time, but we can't guarantee a maximum pause time. Thankfully, there are multiple solutions to these problems.
Sun originally specified the Java Real-Time System (Java RTS; see JSR-1 and JSR-282) with a specific real-time garbage collector called Henriksson GC that attempts to ensure strict thread scheduling. The algorithm is intended to make sure garbage collection does not occur while critical threads (defined by priority) are executing tasks. However, it is a best-effort algorithm, and there is no way to guarantee that no critical threads are suspended.
In addition, the Java RTS specification includes scoped and immortal memory areas. A scope is defined by marking a specific method as the start of a scoped memory area. All objects allocated during the execution of that method are considered to be part of the scoped memory area. Once the method execution has finished, and thus the scoped memory area is left, all objects allocated within it are simply considered deleted. No actual garbage collection occurs: objects allocated in a scoped memory area are freed, and all used memory is reclaimed, immediately after the defined scope has been exited.
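The lifetime rule for scoped memory can be illustrated with a toy model in plain Java. This is not the Java RTS API and it does not bypass the garbage collector; it only mimics the rule that everything allocated while a scope is active is released in one step when the scope is exited. The class ScopedArenaSketch and its methods are hypothetical names chosen for this sketch.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy model of scope lifetimes (hypothetical names; not the Java RTS API).
final class ScopedArenaSketch {

    // One list of allocations per active scope; the innermost scope is on top.
    private final Deque<List<byte[]>> scopes = new ArrayDeque<>();

    /** Runs the given logic inside a new scope and drops the whole scope on exit. */
    void enter(Runnable logic) {
        scopes.push(new ArrayList<byte[]>());
        try {
            logic.run();
        } finally {
            scopes.pop();   // every allocation of this scope is released together
        }
    }

    /** Allocates a block that belongs to the innermost active scope. */
    byte[] allocate(int size) {
        if (scopes.isEmpty()) {
            throw new IllegalStateException("no active scope");
        }
        byte[] block = new byte[size];
        scopes.peek().add(block);
        return block;
    }

    /** Number of blocks currently owned by all active scopes. */
    int liveBlocks() {
        int count = 0;
        for (List<byte[]> scope : scopes) {
            count += scope.size();
        }
        return count;
    }

    public static void main(String[] args) {
        ScopedArenaSketch arena = new ScopedArenaSketch();
        arena.enter(() -> {
            arena.allocate(64);
            arena.allocate(128);
            System.out.println("blocks inside the scope: " + arena.liveBlocks());
        });
        System.out.println("blocks after leaving the scope: " + arena.liveBlocks());
    }
}
```

The model also hints at why references from longer-lived objects into a scope are dangerous: once the scope is popped, nothing checks whether an outside object still points at one of its blocks.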
Immortal objects, objects allocated via the immortal memory area, are never garbage collected, an enormous advantage. As such, they must never reference scoped objects, which would lead to inconsistencies because the scoped object will be removed without checking for references.
These two capabilities give us a level of memory control that is otherwise not possible in Java, which allows us to minimize the unpredictable impact of the GC on our response time. The disadvantage is that this is not part of the standard JDK, so it requires a small degree of code change and an in-depth understanding of the application at hand.
The IBM WebSphere and Oracle JRockit JVMs both provide real-time garbage collectors. IBM promotes its real-time garbage collector by guaranteeing ≤1 ms pause time. Oracle JRockit provides a deterministic garbage collector, in which the maximum GC pause can be configured. Other JVMs, such as Zing, from Azul Systems, try to solve this issue by completely removing the stop-the-world event from the garbage collector. (There are a number of Real Time Java implementations available.)