Monday, October 3, 2011

JavaOne 2011: The Definitive Set of HotSpot Performance Command-line Options

Charlie Hunt (Oracle's Lead JVM Performance Engineer) presented "The Definitive Set of HotSpot Performance Command-line Options" (19820) in Yosemite A/B/C in the Hilton San Francisco on Monday morning. He introduced himself with a description of his "12+ years of Java performance experience" and "20 years of (general) performance experience." He is also the lead author of the just-published book Java Performance.

Hunt provided a slide categorizing command line options: data, memory, startup, latency, throughput, and "other." He stated that his presentation is structured around these groupings. He also said that there is no reason to use these options if the standard JVM configuration works satisfactorily for a given application. As the session's title explicitly states, this session is on HotSpot JVM Options.

Data

For collecting garbage collection data, Hunt prefers -XX:+PrintGCDetails. He says it is descriptive, but not intrusive. He stated that -XX:+PrintGCTimeStamps and -XX:+PrintGCDateStamps work well for attaching a timestamp to a garbage collection event. Because the former is measured from JVM startup and the latter is based on "wall clock" time, Hunt's general rule of thumb is that the longer running the application, the more likely that -XX:+PrintGCDateStamps will be preferred. -XX:+PrintReferenceGC is most useful when the application employs reference objects such as WeakReference and SoftReference.
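The relationship between the two kinds of timestamp can be sketched in a few lines of Java. This is only an illustration, not anything Hunt showed: it assumes a stamp expressed as seconds since JVM start and converts it to a wall-clock instant using the JVM start time reported by RuntimeMXBean. The class name, method name, and sample value are hypothetical, and java.time requires Java 8 or later.

```java
import java.lang.management.ManagementFactory;
import java.time.Instant;

final class GcStampConverter {

    /** Converts a seconds-since-JVM-start stamp into a wall-clock instant. */
    static Instant toWallClock(long jvmStartEpochMillis, double secondsSinceStart) {
        long offsetMillis = Math.round(secondsSinceStart * 1000.0);
        return Instant.ofEpochMilli(jvmStartEpochMillis + offsetMillis);
    }

    public static void main(String[] args) {
        // Start time of the running JVM, in milliseconds since the epoch.
        long jvmStart = ManagementFactory.getRuntimeMXBean().getStartTime();
        // Hypothetical elapsed-time stamp, as if copied from a GC log line.
        double sampleStamp = 12.345;
        System.out.println(toWallClock(jvmStart, sampleStamp));
    }
}
```

For a process that runs for weeks, doing this conversion by hand for every log entry gets tedious, which is one way to read Hunt's preference for wall-clock stamps in long-running applications.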

Memory

Hunt introduced the memory command-line options with the "obvious" -Xmx and -Xms. I think few of us have the luxury of not using these two, but Hunt pointed out that some applications may not even need to set these. -XX:NewSize and -XX:MaxNewSize allow specification of the initial and maximum young generation sizes respectively. -Xmn sets the initial, minimum, and maximum young generation size in a single option. Hunt stated that -XX:NewRatio's biggest disadvantage is the difficulty in understanding ratios, but the advantage is knowing that the ratio between young generation and old generation is maintained. Hunt showed -XX:PermSize and -XX:MaxPermSize for initial and maximum perm generation sizes. -Xshare:on enables class data sharing, lowers footprint and startup time, and is available in Java 7 for -client or -server in conjunction with serial garbage collection.
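The ratio arithmetic behind -XX:NewRatio is easier to follow with numbers. The sketch below assumes the documented HotSpot meaning of the option (the value is the size of the old generation relative to the young generation, so a value of 2 leaves one third of the heap for the young generation) and ignores the alignment and rounding the JVM applies in practice. The class and method names are hypothetical.

```java
final class NewRatioMath {

    /** Approximate young generation size for a heap split by NewRatio (old : young). */
    static long youngGenBytes(long heapBytes, int newRatio) {
        if (newRatio < 1) {
            throw new IllegalArgumentException("newRatio must be at least 1");
        }
        return heapBytes / (newRatio + 1L);
    }

    /** Whatever is not young generation is treated as old generation here. */
    static long oldGenBytes(long heapBytes, int newRatio) {
        return heapBytes - youngGenBytes(heapBytes, newRatio);
    }

    public static void main(String[] args) {
        long heap = 3L * 1024 * 1024 * 1024; // a 3 GB heap, perm gen not included
        long mb = 1024L * 1024L;
        for (int ratio = 1; ratio <= 3; ratio++) {
            System.out.println("NewRatio=" + ratio
                    + " young=" + youngGenBytes(heap, ratio) / mb + "MB"
                    + " old=" + oldGenBytes(heap, ratio) / mb + "MB");
        }
    }
}
```

The point of the ratio form is that the split follows the heap as it grows or shrinks between -Xms and -Xmx, whereas -XX:NewSize and -XX:MaxNewSize pin absolute sizes.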

Startup

Hunt discussed -client and -server for enabling the client and server JVM runtimes respectively. -XX:+TieredCompilation enables a "JIT compilation policy" that pairs the rapid startup commonly associated with -client with the advantages of running a -server JVM. It's the "best of both worlds." Hunt again referenced -Xshare:on in relation to startup performance.
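When it is unclear which runtime a process actually ended up with, the standard java.vm.name and java.vm.info system properties can be inspected from inside the application. The following is a minimal sketch; the string check assumes HotSpot's customary naming (for example a name containing "Server VM"), which is a convention rather than a guarantee, and the class and method names are hypothetical.

```java
final class VmKind {

    /** Heuristic: HotSpot server VMs conventionally include "Server" in java.vm.name. */
    static boolean isServerVm(String vmName) {
        return vmName != null && vmName.contains("Server");
    }

    public static void main(String[] args) {
        String name = System.getProperty("java.vm.name");
        String info = System.getProperty("java.vm.info"); // e.g. compilation mode details
        System.out.println("VM name: " + name);
        System.out.println("VM info: " + info);
        System.out.println("Looks like a server VM: " + isServerVm(name));
    }
}
```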

Latency

Hunt showed a slide on the throughput options -XX:+UseParallelOldGC/-XX:+UseParallelGC and suggested starting with these options when tuning for latency. His recommendation is to "start with ParallelOld/Parallel GC first" and then "move to CMS (or G1) if latency requirements are not met." -XX:ParallelGCThreads is used to specify the number of parallel garbage collection threads to use.

-XX:+ParallelRefProcEnabled is available for all garbage collectors and "enables multithreaded reference processing." Hunt's recommendation is to "enable this option when -XX:+PrintReferenceGC output shows high Reference reclamation time." -XX:SoftRefLRUPolicyMSPerMB is an option Hunt said he has rarely seen needed in the past, but is starting to see people need. The option is useful for specifying the survival time of a soft reference "after last strong reference to the object has been collected" and smaller values mean more aggressive collection. Hunt cautioned that this last option is only useful in very specific situations.
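The name -XX:SoftRefLRUPolicyMSPerMB spells out its unit: milliseconds of soft reference lifetime per megabyte of free heap. The sketch below works through that arithmetic as a rough model only. It assumes the behavior described in HotSpot's documentation (a default of 1000, roughly one second per free megabyte) and glosses over the fact that the client and server VMs measure free heap differently. The class and method names are hypothetical.

```java
final class SoftRefPolicyMath {

    /** Rough survival window for a softly reachable object under this policy. */
    static long survivalMillis(long freeHeapMegabytes, long msPerMegabyte) {
        return Math.multiplyExact(freeHeapMegabytes, msPerMegabyte);
    }

    public static void main(String[] args) {
        // Free memory in the currently committed heap, in megabytes.
        long freeMb = Runtime.getRuntime().freeMemory() / (1024L * 1024L);
        long[] candidateSettings = {1000L, 100L, 0L}; // 1000 is the documented default
        for (long msPerMb : candidateSettings) {
            System.out.println("SoftRefLRUPolicyMSPerMB=" + msPerMb
                    + " -> about " + survivalMillis(freeMb, msPerMb) + " ms with "
                    + freeMb + " MB free");
        }
    }
}
```

Under this model a value of 0 leaves no survival window at all, which matches the remark that smaller values mean more aggressive collection.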

-XX:+UseConcMarkSweepGC is the option for enabling CMS ("enables a mostly concurrent old generation GC"). CMS is the default garbage collector on Apple machines. -XX:+UseParNewGC is automatically enabled when -XX:+UseConcMarkSweepGC is employed and thus does not need to be enabled explicitly. Hunt stated that tuning CMS is challenging when he showed the slide on -XX:CMSInitiatingOccupancyFraction and -XX:+UseCMSInitiatingOccupancyOnly. The former is only used on the first CMS cycle unless the latter is also set, in which case it is used for all CMS cycles.
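The occupancy fraction is a percentage of old generation occupancy, so the threshold it implies is simple arithmetic. The following is a deliberately simplified model of that comparison and not HotSpot's actual triggering logic, which also weighs its own runtime statistics when -XX:+UseCMSInitiatingOccupancyOnly is not set. The class and method names are hypothetical.

```java
final class CmsOccupancyModel {

    /** True when old generation usage has reached the configured percentage of capacity. */
    static boolean reachesThreshold(long usedBytes, long capacityBytes, int fractionPercent) {
        if (capacityBytes <= 0 || fractionPercent < 0 || fractionPercent > 100) {
            throw new IllegalArgumentException("invalid capacity or percentage");
        }
        // Compare used/capacity >= fraction/100 without floating point.
        return usedBytes * 100L >= capacityBytes * (long) fractionPercent;
    }

    public static void main(String[] args) {
        long capacity = 1024L * 1024L * 1024L; // a 1 GB old generation
        long used = 800L * 1024L * 1024L;      // 800 MB currently occupied
        int fraction = 75;                     // as in -XX:CMSInitiatingOccupancyFraction=75
        System.out.println("Would a cycle be due at " + fraction + "%? "
                + reachesThreshold(used, capacity, fraction));
    }
}
```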

Hunt had a slide on -XX:+CMSClassUnloadingEnabled and -XX:+CMSPermGenSweepingEnabled. The latter is only necessary when using Java SE 6 Update 3 or earlier. Newer versions only need the first option. If the second option is used anyway in a new JDK, there is a message stating that it is not needed.

Hunt talked about -XX:+ScavengeBeforeFullGC even though it is the default setting in the HotSpot JVM and is applicable to all garbage collectors. His reasoning for including this is that he has seen many people explicitly disable this feature so that the young generation is not collected before the old generation. Hunt talked about using -XX:+DisableExplicitGC to disable hints/calls to the garbage collector via System.gc(). -XX:+ExplicitGCInvokesConcurrent and (preferred) -XX:+ExplicitGCInvokesConcurrentAndUnloadsClasses can be used when explicit garbage collection is used. -XX:ParallelCMSThreads specifies the number of parallel CMS threads.
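One way to see whether explicit System.gc() calls have any effect in a given configuration is to compare collection counts before and after the call. The sketch below uses the standard GarbageCollectorMXBean API; the class and method names are hypothetical. Since System.gc() is only a request, no particular output is promised, although running it with and without -XX:+DisableExplicitGC should make the difference visible.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class ExplicitGcProbe {

    /** Sums collection counts across all collectors; undefined counts (-1) are treated as 0. */
    static long totalCollections() {
        long total = 0L;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0L, gc.getCollectionCount());
        }
        return total;
    }

    public static void main(String[] args) {
        long before = totalCollections();
        System.gc(); // a hint only; ignored when explicit GC is disabled
        long after = totalCollections();
        System.out.println("Collections before: " + before + ", after: " + after);
    }
}
```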

-XX:SurvivorRatio "sets the ratio of survivor space size to eden space size." -XX:TargetSurvivorRatio "sets the target survivor space occupancy to target after a minor garbage collection." Hunt says developers should err on the side of setting -XX:MaxTenuringThreshold "too high rather than too low" to "avoid a full GC." -XX:+PrintTenuringDistribution is "very useful in fine-tuning young gen's survivor space size for effective object aging."
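As with NewRatio, the survivor ratio is easier to reason about with concrete numbers. This sketch assumes the meaning given in HotSpot's documentation, where the value is the size of eden relative to one survivor space and the young generation consists of eden plus two survivor spaces, so a value of 8 gives each survivor space one tenth of the young generation. Alignment and rounding are ignored, and the class and method names are hypothetical.

```java
final class SurvivorRatioMath {

    /** Approximate size of ONE survivor space; young gen = eden + 2 survivor spaces. */
    static long survivorBytes(long youngGenBytes, int survivorRatio) {
        if (survivorRatio < 1) {
            throw new IllegalArgumentException("survivorRatio must be at least 1");
        }
        return youngGenBytes / (survivorRatio + 2L);
    }

    /** Eden is what remains after both survivor spaces are carved out. */
    static long edenBytes(long youngGenBytes, int survivorRatio) {
        return youngGenBytes - 2L * survivorBytes(youngGenBytes, survivorRatio);
    }

    public static void main(String[] args) {
        long young = 1000L * 1024L * 1024L; // a 1000 MB young generation
        long mb = 1024L * 1024L;
        for (int ratio : new int[] {6, 8, 32}) {
            System.out.println("SurvivorRatio=" + ratio
                    + " eden=" + edenBytes(young, ratio) / mb + "MB"
                    + " each survivor=" + survivorBytes(young, ratio) / mb + "MB");
        }
    }
}
```

Comparing these sizes with the per-age byte counts reported by -XX:+PrintTenuringDistribution gives a rough sense of whether objects have room to age in the survivor spaces or are being promoted early.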

-XX:+UseG1GC "enables the garbage first GC" and is available with later Java SE 6 releases and Java SE 7 as an experimental feature. Hunt's opinion is that it will eventually be supported in a later Java SE 7 release. Hunt "generally likes" what he is seeing using this garbage first garbage collector. -XX:MaxGCPauseMillis "sets a maximum pause time target for G1 GC."

Hunt talked about the concept of "JVM safepoint" when discussing the options -XX:+PrintGCApplicationStoppedTime and -XX:+PrintGCApplicationConcurrentTime. These two options "are useful for tracking down latency induced into the application as a result of JVM safepoint operations." -XX:+PrintSafepointStatistics provides details on safepoint events (which happened and when they happened) and produces the report when the JVM exits. Hunt generally discourages use of the CMS incremental options.

Throughput

-XX:UseAdaptiveSizePolicy and -XX:+PrintAdaptiveSizePolicy were the first two options Hunt covered in the "throughput" section. He also covered -XX:InitialSurvivorRatio and -XX:TargetSurvivorRatio (again).

Other

-XX:+UseCompressedOops, -XX:+UseLargePages, and -XX:LargePageSizeInBytes were covered in the "Other" section and all three options are applicable across the HotSpot garbage collectors. Large pages reduce TLB misses. -XX:+UseNUMA "enables a heap space allocation policy favorable for NUMA architectures." -XX:+AggressiveOpts "enables the most recent Java HotSpot VM optimizations" and is a supported option. Hunt recommends using it over -XX:+AggressiveHeap. -XX:+UseBiasedLocking, -XX:+DoEscapeAnalysis, -XX:+AlwaysPreTouch, -XX:+PrintCommandLineFlags, and -XX:+PrintFlagsFinal were all also part of the "Other" section. -XX:+PrintCommandLineFlags shows the ergonomically selected options for a JVM and -XX:+PrintFlagsFinal shows supported options used by default.
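A related check can be made from inside a running application: RuntimeMXBean.getInputArguments() returns the arguments the JVM was launched with. The sketch below scans that list for a boolean -XX option, assuming that a later occurrence overrides an earlier one. It only sees flags that were passed explicitly, so it does not replace -XX:+PrintCommandLineFlags or -XX:+PrintFlagsFinal for finding ergonomically chosen values. The class and method names are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.Optional;

final class FlagScan {

    /** Returns the last explicit +/- setting of a boolean -XX flag, if any was passed. */
    static Optional<Boolean> lastBooleanSetting(List<String> jvmArgs, String flagName) {
        Optional<Boolean> result = Optional.empty();
        for (String arg : jvmArgs) {
            if (arg.equals("-XX:+" + flagName)) {
                result = Optional.of(Boolean.TRUE);
            } else if (arg.equals("-XX:-" + flagName)) {
                result = Optional.of(Boolean.FALSE);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        String[] flagsOfInterest = {"UseCompressedOops", "UseLargePages", "AggressiveOpts"};
        for (String flag : flagsOfInterest) {
            Optional<Boolean> setting = lastBooleanSetting(jvmArgs, flag);
            System.out.println(flag + ": "
                    + (setting.isPresent() ? setting.get().toString() : "not set explicitly"));
        }
    }
}
```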

Conclusion

Hunt stated that his book has the "gory details" and includes the options summary in an appendix. There are many options available and some of them affect each other. In many ways, their use feels like something of a black art. I can see how a book might be useful in situations where every last ounce of performance needs to be squeezed out of the HotSpot VM.

4 comments:

Unknown said...

Excellent post on JVM command line options.

@DustinMarx said...

Thanks, Jason, for taking the time to leave the compliment.

Dustin

Unknown said...

I appreciate you taking the time to do an awesome write up :).

I think a lot of us kind of look at the JVM as a black box that magically runs our code.

Srikumar said...

A very nice article. I have been tuning production systems for an year now and its good to see some of these JVM HotSpot options, which I havent used before. Please keep sharing more such articles if you can.