Wednesday, May 17, 2017

Kotlin and Android: JetBrains and Google Behind One Language

Google I/O 2017 had several major announcements, but one of the most interesting to me is "first-class support for Kotlin" on Android.

The Kotlin blog post on this announcement discusses the benefits this brings to Kotlin users:

In case you are concerned about other platforms that Kotlin supports (Kotlin/JVM for server and desktop, Kotlin/JS and Kotlin/Native), please be sure that they are as important for us as ever. Our vision here is to make Kotlin a uniform tool for end-to-end development of various applications bridging multiple platforms with the same language. ... First-class support on Android will likely bring more users to Kotlin, and we expect the community to grow significantly. This means more libraries and tools developed in/for Kotlin, more experience shared, more Kotlin job offerings, more learning materials published, and so on. We are excited to see the Kotlin ecosystem flourish!

An interesting point related to and included in this announcement is mentioned in the Kotlin on Android FAQ:

What's the future of Kotlin? JetBrains' thoughtful work on Kotlin's design is one of the reasons we're embracing the language. Google is partnering with JetBrains to ensure a wonderful overall developer story—from language, to framework, to tools. And, we are excited to be working together to move the Kotlin language into a non profit foundation.

The Kotlin blog post describes the future of Kotlin with Android in greater detail:

We will be partnering with Google to create a non-profit foundation for Kotlin. Language development will continue to be sponsored by JetBrains, and the Kotlin team (over 40 people and second largest team at the company) will operate as usual. Andrey Breslav remains the Lead Language Designer, and Kotlin will be developed under the same principles as before.

Android Studio 3.0 Canary 1 (preview) can currently be downloaded at https://developer.android.com/studio/preview/index.html and includes Kotlin support among its new features, which also include Java 8 support.

In the 2016 edition of my annual post on significant software developments of the year, I listed Kotlin-related developments in the "Honorable Mention" section. With this announcement of Kotlin being officially supported on Android, and with the language being placed in a nonprofit foundation and enjoying the support of both Google and JetBrains, I think it's very possible that Kotlin will make my 2017 top ten list of significant software developments. Besides these announcements, Kotlin has already seen JavaScript support released in Kotlin 1.1 in 2017 and a preview of Kotlin/Native.

Monday, May 15, 2017

JVM Statistics with jstat

I have written about several command-line tools provided with the Oracle and/or OpenJDK Java Development Kits (JDKs) in the past, but I've never written exclusively about the jstat tool. The Oracle JDK 9 Documentation Early Access states that jstat is used "to monitor Java Virtual Machine (JVM) statistics." There is also a warning, "This command is experimental and unsupported." Although I quoted the JDK 9 documentation, jstat has been a part of the Sun/Oracle JDK in some form (known at one time as jvmstat) in Java SE 8, Java SE 7, Java SE 6, and J2SE 5. Instrumentation for the HotSpot JVM was introduced with Java 1.4.1 (only enabled when -XX:+UsePerfData was set) and has provided "always-on instrumentation" since Java 1.4.2.

Much of the information that jstat provides can be gleaned from visual tools such as VisualVM, from JMX and platform MBeans, from garbage collection logs, or via JVM options. However, jstat provides advantages when compared to each of these alternatives. Its advantages include those common to command-line tools, such as the ability to execute from scripts and to run without developers or others needing to be present. It is also useful to be able to apply jstat to an already running Java process to start monitoring its JVM statistics rather than having to specify the monitoring options when starting the JVM.

For my examples in this post, I am using Oracle JDK 9 build 164. The next screen snapshot shows this version and also shows one of the first flags to apply when starting to use jstat: the -options flag.

As demonstrated in the screen snapshot, and as stated in the jstat documentation, jstat -options is used to "display the list of options for a particular platform installation." In my example being shown here, the following options are available:

  • -class
  • -compiler
  • -gc
  • -gccapacity
  • -gccause
  • -gcmetacapacity
  • -gcnew
  • -gcnewcapacity
  • -gcold
  • -gcoldcapacity
  • -gcutil
  • -printcompilation

I'll only be looking at a small subset of these available options in this post, but the jstat documentation provides a single sentence describing each jstat option, and the command-line usage of each option is very similar to that of the other options. In fact, once one learns a few minor things regarding use of jstat, the execution of various options becomes the easy part. The difficult part of using jstat is often interpreting the data provided by jstat.

The jstat -help option prints a simple usage as shown in the next screen snapshot.

From the jstat usage message, we learn that the jstat command-line tool is executed by running the name of the command first (jstat) with the hyphenated option name next, followed by the optional -t and/or -h flags, followed by a vmid, and concluding with an optional interval and an optional count of the number of times to execute the command on the provided interval. Examples are clearer than descriptive text and some examples are shown in this post and in the jstat documentation.
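To make that general form concrete, the following is a representative invocation of the syntax just described (the PID is illustrative; the next paragraph shows how to obtain a real one):

    jstat -gcutil 8728

This prints a single line of garbage collection utilization statistics for the JVM running as process 8728.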

For monitoring "local" JVM statistics, the vmid is simply the process ID of the JVM process. This is the same PID returned by the hip jcmd (or stodgy jps) for Java processes. The next screen snapshot demonstrates use of jcmd to identify the PID (8728 in this case) of the Java application I'm monitoring in my examples (JEdit in this case).

The "Virtual Machine Identifier" section of the jstat documentation provides significantly more details regarding the vmid because a more complicated vmid (for remote monitoring of JVM statistics) can include a protocol, the local targeted machine's vmid, host, and port. Although all of my examples in this post will use jstat and a simple Java PID (vmid), the jstat documentation does provide examples of use of a more detailed vmid for remote monitoring of JVM statistics.

For the remaining examples in this post, I wanted a Java application that was a bit more interesting from a JVM statistics monitoring perspective than JEdit is. I decided to use the "PigInThePython" sample application from Nikita Salnikov-Tarnovski's post "Garbage Collection: increasing the throughput" on the highly recommended Plumbr Blog. See that post if you're interested in seeing the source code for PigInThePython.

For my first example of using jstat, I use one of its most commonly used options: -gcutil. Besides demonstrating the -gcutil option, I'll use this first example to also demonstrate and explain the jstat output options that generally apply to other jstat options besides -gcutil.

The following screen snapshot demonstrates use of jcmd to acquire the PID of the PigInThePython application (5096 in this case) and running of the simplest form of jstat -gcutil.

In its simplest form (with no other options), jstat -gcutil displays a single line of output with no timestamp. The column headings are described in the jstat documentation section "-gcutil option" which also describes the -gcutil option as, "Summary of garbage collection statistics." This documentation explains, for example, that several of the columns indicate percentages of usage of different spaces' allocations while other columns indicate number of garbage collection events and total garbage collection time.

We often want to associate the statistics that jstat provides with the time that other events are occurring in the monitored system to identify correlations between those events and effects on the JVM. The jstat -t option will prepend a timestamp column at the beginning of each line of output. This timestamp is the number of seconds since the JVM that is being monitored was started. Although this is not as convenient for humans to read as other formats, it does make it possible to correlate the JVM statistics with timeframes in which the JVM has been running and with garbage collection logs that had time stamps included. The next screen snapshot demonstrates -t in action:

It is typically useful to monitor JVM statistics such as those presented by jstat -gcutil more than once. The next screen snapshot demonstrates use of a specified interval (100 milliseconds, specified as 100ms) to have these results captured and shown every 100 milliseconds.

The output in the last screen snapshot never repeats the header with the column acronyms after showing it the first time. If one wants to have that header repeated after a certain number of lines to make it easier to know which numbers belong with which columns much farther down in the output, the -h option can be used to specify the number of results after which the column headers are displayed again. In the next screen snapshot, -h20 is used to see the header after every 20 rows.

There may be times when it's desirable to have jstat provide its data every so often and only for a certain number of times. The interval allows one to specify the duration of time between the results and any integer following that interval specification acts as the limit on the total number of times the results will be displayed. In the next screen snapshot, the 15 at the end of the command limits the output to 15 total rows.
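Putting the pieces from the last few examples together, an invocation combining the timestamp flag, repeated headers, an interval, and a count might look like this (PID 5096 is the PigInThePython process identified earlier; the other values are illustrative):

    jstat -gcutil -t -h20 5096 100ms 15

This samples the monitored JVM every 100 milliseconds, stops after 15 samples, prefixes each row with the seconds-since-JVM-start timestamp, and repeats the column headers every 20 rows.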

The jstat -gccause option returns the same information as -gcutil, but also adds information about what caused the monitored garbage collections. The following screen snapshot demonstrates this.

In the above screen snapshot, we see that the "cause of last garbage collection" (LGCC) was "G1 Humongous Allocation" and that "cause of current garbage collection" (GCC) is "No GC" (there is no garbage collection currently underway).

The next screen snapshot demonstrates use of jstat -class to see "Class loader statistics": the number of classes loaded ("Loaded"), the number of kilobytes loaded (first "Bytes"), the number of classes unloaded ("Unloaded"), the number of kilobytes unloaded (second "Bytes"), and the "time spent performing class loading and unloading operations" ("Time").

The command jstat -printcompilation displays "Java HotSpot VM compiler method statistics" and is demonstrated in the next screen snapshot.

The -printcompilation option displays columns "Compiled" ("Number of compilation tasks performed by the most recently compiled method"), "Size" ("Number of bytes of byte code of the most recently compiled method"), "Type" ("Compilation type of the most recently compiled method"), and "Method" (name of class/method of most recently compiled method expressed in format consistent with HotSpot VM option -XX:+PrintCompilation).

The jstat -compiler command allows us to see "Java HotSpot VM Just-in-Time compiler statistics" such as number of compilation tasks performed ("Compiled"), number of failed compilation tasks ("Failed"), number of invalidated compilation tasks ("Invalid"), time spent on compilation ("Time"), type and class/method name of last failed compilation ("FailedType" and "FailedMethod"). This is demonstrated in the next screen snapshot.

There are several more options available with jstat and most of them are specific to different perspectives on garbage collection in the monitored JVM.

The jstat documentation warns that the tool is experimental and subject to change or removal in future versions of the JDK. The documentation also warns against writing scripts and tools that parse the output of jstat as the output content or format might change in the future. However, I can see how some might take the risk and write parsing code because this command-line tool is easily accessible from scripts and the parsing code needed would not be very complex.
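For those who do take that risk, the parsing itself is indeed simple. The following is a minimal sketch (not a recommendation), assuming jstat is on the PATH and assuming the JDK 9 -gcutil column layout shown in this post, in which the old generation occupancy percentage is the fourth column; the class name and the choice of column are mine:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class JstatGcUtilWatcher
{
   public static void main(final String[] arguments) throws IOException
   {
      final String pid = arguments[0];  // PID of the JVM to be monitored
      // Launch 'jstat -gcutil <pid> 1s' and merge stderr into stdout.
      final Process jstat = new ProcessBuilder("jstat", "-gcutil", pid, "1s")
         .redirectErrorStream(true)
         .start();
      try (BufferedReader reader = new BufferedReader(
              new InputStreamReader(jstat.getInputStream())))
      {
         String line;
         while ((line = reader.readLine()) != null)
         {
            final String[] columns = line.trim().split("\\s+");
            // Data rows begin with a number; header rows begin with "S0".
            if (columns.length > 3 && Character.isDigit(columns[0].charAt(0)))
            {
               System.out.println("Old generation occupancy: " + columns[3] + "%");
            }
         }
      }
   }
}

Run with the PID of the monitored JVM as its only argument, this echoes the old generation occupancy once per second until either process exits.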

This post has been an introduction to jstat, but there is much more to learn about the tool. Interpretation of the tool's results is more complex than using the tool and the analysis of the data provided by jstat is likely to be more challenging than the effort to collect the numbers. Several additional resources are listed below to provide more information on collecting and analyzing Java Virtual Machine statistics with the jstat tool.

Additional jstat Resources

Wednesday, May 10, 2017

Java Platform Module System Public Review Fails to Pass

There has been an unusual level of drama, intrigue, and politics in the world of Java over the past few weeks that culminated in this week's JSR 376 Java Platform Module System Public Review Ballot. Java modularity [including the Java Platform Module System (JPMS)] has been arguably the most significant piece of JDK 9 and so it's not surprising that it has received so much attention. In addition to the typical publicly available mailing list traffic, there have been blog posts and open letters to further advertise the contention and debate surrounding JPMS (JSR 376), described as "a central component of Project Jigsaw."

The final vote was, according to the JSR #376 Java Platform Module System Public Review Ballot page, 10 votes for and 13 votes against, so "the EC has not approved this ballot." The comments that go along with the votes in the text area at the bottom of the ballot page are telling. In particular, I thought it interesting how many reviewers voted no mainly because they were uncomfortable with the other, more vocal reviewers not approving. It's also interesting that the Public Review Ballot for JSR #379 Java SE 9 Release Contents (an "umbrella" JSR) passed overwhelmingly on the same day this one failed.

It will be interesting to see how this continues to unfold over the coming days and weeks and what impact it has on the release date for JDK 9. Rather than rehash the arguments on both sides, I reference posts from key contributors to the discussion below.

References: Executive Committee Participants/Representatives

References: Opinions/Forums

References: Other Overviews

References: Since the Vote

Monday, May 8, 2017

Java's Observer and Observable are Deprecated in JDK 9

In the blog post Applying JDK 9 @Deprecated Enhancements, I discussed additions of the optional elements (methods) forRemoval() and since() to the @Deprecated annotation in JDK 9. I stated in that post, "The application of new JDK 9 @Deprecated methods on the Java SE API can also be instructive in how they are intended to be used." In this post, I look at the application of the enhanced @Deprecated annotation to the JDK class java.util.Observable.

The class java.util.Observable has been with us almost since the beginning (since Java 1.0). As of JDK 9, however, it will be marked as deprecated. The following screen snapshot shows a portion of this class's Javadoc representation in a web browser.

This is an example of a class that falls in the category "Deprecated With No Plans for Removal" described in my previous post. The presence of since() provides information on when it was deprecated (JDK 9) and the absence of forRemoval() indicates lack of concrete plans to actually remove the class. The java.util.Observer interface has also been deprecated in a similar manner and its documentation references the documentation for the Observable class.

Not only does the Observable documentation relay when it was deprecated, but it also documents the problems with Observable that make deprecation desirable and provides important information on alternatives that might be used instead of Observable:

This class and the Observer interface have been deprecated. The event model supported by Observer and Observable is quite limited, the order of notifications delivered by Observable is unspecified, and state changes are not in one-for-one correspondence with notifications. For a richer event model, consider using the java.beans package. For reliable and ordered messaging among threads, consider using one of the concurrent data structures in the java.util.concurrent package. For reactive streams style programming, see the Flow API.

This is a good example of how Java developers can use the Javadoc tag @deprecated to provide much deeper detail related to deprecation than can be provided even with the enhanced @Deprecated annotation. JEP 277 ("Enhanced Deprecation") explicitly listed unifying of Javadoc tag @deprecated and annotation @Deprecated as a "non-goal": "It is not a goal of this project to unify the @deprecated Javadoc tag with the @Deprecated annotation."
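To illustrate that pairing, a library author deprecating one of their own classes in the same style might write something like the following (the class and its wording are hypothetical; only the mechanism mirrors what the JDK does with Observable):

/**
 * A legacy event source retained only for compatibility.
 *
 * @deprecated The event model of this class is limited and unordered;
 *             prefer the java.util.concurrent.Flow API for new code.
 */
@Deprecated(since="9")
public class LegacyEventSource
{
   // Intentionally empty; the class exists only to show how the Javadoc
   // @deprecated tag and the enhanced @Deprecated annotation complement
   // each other.
}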

Additional details justifying the deprecation of Observable and Observer can be found in JDK-8154801 ("deprecate Observer and Observable"). There is a quote in there from Josh Bloch as part of JDK-4180466 ("Why is java.util.Observable class not serializable.") dated February 1999:

This class is no longer under active development. It is largely unused in the JDK, and has, for the most part, been superseded by the 1.1 Beans/AWT event model. ... Observable has fallen into disuse and is no longer under active development.

For the most part, it seems that Observer and Observable are not used much, so deprecation shouldn't be much of an issue, especially given no definitive plans to remove these altogether.

Monday, April 24, 2017

Java and Docker: Now and the Future

Sharat Chander's blog post Official Docker Image for Oracle Java and the OpenJDK Roadmap for Containers provides a high-level overview of the "Official Docker Image for Oracle Java," an introduction to Docker and why containers like Docker are desirable, and a peek at things to come for Java on Docker. The post also provides a link to the "Docker image for Oracle Server JRE" that "is now available on Docker Store."

Chander briefly discusses attempts to set "up a consistent, reproducible environment that scales to thousands/millions of instances" for the cloud and how operating system tools and hardware virtualization have been used in this way. He then introduces Docker as another approach and provides a brief description of the benefits of Docker.

Chander's post introduces Alpine Linux and quotes their web page, "Small. Simple. Secure. Alpine Linux is a security-oriented, lightweight Linux distribution based on musl libc and busybox." Chander associates Alpine Linux's use of musl libc (described as "a new general-purpose implementation of the C library" that "is lightweight, fast, simple, free and aims to be correct in the sense of standards-conformance and safety") and busybox (a combination of "tiny versions of many common UNIX utilities into a single small executable") with Java and paraphrases the mailing list post announcing Project Portola: "The goal of OpenJDK 'Project Portola' is to provide a port of the JDK to Alpine Linux, and in-particular the 'musl' C library."

The FAQ section of Chander's post has some interesting questions and answers. For example, there is information on Oracle's "recommendations for running Oracle Java SE on Docker" available on GitHub. There is information on Java 8 Update 131 enhancements that enable "better memory and processor integration between Java and Docker" and it's interesting to note that in "JDK 9 the JVM will be container-aware" instead of "the thread and memory settings [coming] from the host OS."

The OpenJDK Docker Repository is available at https://hub.docker.com/_/openjdk/.

It's interesting to see what is being done and what's planned for using Java with Docker.

Tuesday, April 18, 2017

Oracle JDK 9 Early Access Documentation Updated

Raymond Gallardo's 4 April 2017 post Early Access documentation for Oracle JDK 9 has been updated today announces updates to the Oracle JDK 9 Documentation Early Access page. Gallardo highlights a few of the updated sections including What's New in Oracle JDK 9, Oracle JDK 9 Migration Guide, HotSpot Virtual Machine Garbage Collection Tuning Guide (including Garbage-First Garbage Collector Tuning), javapackager tool for "packaging Java and JavaFX applications," and XML Catalog API.

The Main Tools to Create and Build Applications section features information on jlink (JEP 282), jmod (create and show contents of Project Jigsaw JMOD files), and jdeprscan (static analysis tool that scans ... for uses of deprecated API elements). There is also information on jhsdb for obtaining "specific information from a hanging or crashed JVM."

I summarized some of the tools and features being removed from the JDK with Java 9 in the post JDK 9 is the End of the Road for Some Features. Some of these are spelled out with additional details in the section Removed Tools. The Changes to Garbage Collection section similarly shows changes and removals in Java 9 of options and commands related to garbage collection. The section Changes to the Installed JDK/JRE Image outlines removal of things such as rt.jar and tools.jar, the extension mechanism, and the endorsed standards override mechanism.

The section Enable Logging with the JVM Unified Logging Framework demonstrates how to "use -Xlog option to configure or enable logging with the Java Virtual Machine (JVM) unified logging framework." The Changes to GC Log Output section explains how to use this -Xlog with gc (-Xlog:gc) for logging related to garbage collection and points out that "the -XX:+PrintGCDetails and -XX:+PrintGC options have been deprecated."

I previously blogged on the future of the Concurrent-Mark-Sweep garbage collector. The Java SE 9 HotSpot Virtual Machine Garbage Collection Tuning Guide states, "The CMS collector is deprecated. Strongly consider using the Garbage-First collector instead."

The main Oracle JDK 9 Documentation Early Access page is at http://docs.oracle.com/javase/9/index.html and its Developer Guides link provides direct references to guides such as Troubleshooting Guide, HotSpot Virtual Machine Garbage Collection Tuning Guide, Java Scripting Programming Guide, Java Language Updates, and Migration Guide.

Monday, April 17, 2017

Implications of the Presence of StringBuffer

When I am working on legacy code and run across instances of StringBuffer, I typically replace them with instances of StringBuilder. Although a performance advantage can be gained from this change, I often change it in places I know will have little noticeable effect in terms of performance. I feel it's worth making the change for a variety of reasons in addition to the potential for performance benefit. There's rarely a reason to not choose StringBuilder over StringBuffer (API expectations are the most common exception) and the existence of StringBuffer in code misleads and provides a bad example to those new to Java.
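The change itself is usually mechanical, as in this contrived before-and-after sketch (the class name and strings are mine):

public class StringBuilderSwap
{
   public static void main(final String[] arguments)
   {
      // Legacy style: synchronized StringBuffer in a single-threaded method.
      final StringBuffer legacy = new StringBuffer();
      legacy.append("Hello, ").append("legacy world!");
      System.out.println(legacy);

      // Drop-in replacement: StringBuilder exposes the same API without
      // the per-call synchronization.
      final StringBuilder modern = new StringBuilder();
      modern.append("Hello, ").append("modern world!");
      System.out.println(modern);
   }
}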

In the book The Pragmatic Programmer: From Journeyman to Master, Andy Hunt and David Thomas discuss "the importance of fixing the small problems in your code, the 'broken windows'." Jeff Atwood touched on this subject in the post The Broken Window Theory and it has been more recently addressed in the posts Software Rot, Entropy and the Broken Window Theory and Don't leave broken windows. The presence of StringBuffer implies a staleness in the code. In effect, use of StringBuffer may not be a "broken window," but it's a really old, leaky single-pane window that should be replaced with a modern, energy-efficient double-pane window.

I found Peter Lawrey's recent blog post StringBuffer, and how hard it is to get rid of legacy code to be an interesting take on other implications of StringBuffer that still exist in code. Lawrey quotes the last paragraph of the StringBuffer class Javadoc documentation, "As of release JDK 5, this class has been supplemented with an equivalent class designed for use by a single thread, StringBuilder. The StringBuilder class should generally be used in preference to this one, as it supports all of the same operations but it is faster, as it performs no synchronization." Lawrey then uses simple Java methods and jmap to demonstrate that instances of StringBuffer are still used in classes and libraries delivered with the JDK even as late as Java 8.

Lawrey points out that the presence of StringBuffer in frequently used Java code more than a decade after the introduction of "drop-in replacement" StringBuilder is evidence of how difficult it is to "clean up legacy code." Lawrey's full conclusion states, "Using StringBuffer on startup doesn’t make much difference, but given it has a well known, drop in replacement, and it is still used, even in new functionality more than ten years later shows how hard it can be to clean up legacy code or to change thinking to get people to use best practice libraries."

I decided to try out one of Lawrey's simplest examples when compiled with Java 8 Update 121 and when compiled with a recent release of OpenJDK 9. I (slightly) adapted Lawrey's example to the simple "Main" class listing shown next.

Main.java

import java.io.IOException;

/**
 * (Slightly) adapted class from blog post
 * "StringBuffer, and how hard it is to get rid of legacy code" at
 * https://vanilla-java.github.io/2017/04/13/String-Buffer-and-how-hard-it-is-to-get-rid-of-legacy-code.html
 */
public class Main
{
   /**
    * Main function that instantiates this Java "application" and does nothing
    * else until "ENTER" is pressed.
    */
   public static void main(final String[] args) throws IOException
   {
      System.out.println("Waiting [press ENTER to exit] ..");
      System.in.read();
   }
}

The following screen snapshot shows the output of using jcmd with its -all option (includes unreachable objects in the inspection) to show the number of instances of StringBuffer and StringBuilder in the simple Java application when compiled and run against three different versions of Java (Java 8 Update 102, Java 8 Update 121, and OpenJDK 9.0 ea+164). The execution of jcmd is performed in PowerShell and so Select-String is used similarly to how grep is used in Linux.

Although the versions of the class compiled and executed with versions of Java 8 had instances of StringBuffer, the version compiled with and executed against Java 9 only had instances of StringBuilder. It looks like the resolutions of JDK-8041679 ("Replace uses of StringBuffer with StringBuilder within core library classes") and JDK-8043342 ("Replace uses of StringBuffer with StringBuilder within crypto code") have had their intended effect.

Wednesday, April 12, 2017

Use Cases for Java Enhanced Enums

In the message Enhanced Enums -- use cases, Brian Goetz writes, "We're hoping to get user feedback on the feature [Enhanced Enums] as it is now implemented." He states the first purpose of his message, "To get things started, here are some typical use cases where generic enums might be useful." The first of the two presented examples is refactoring the com.sun.tools.javac.code.Dynamic class, whose eight factory methods return different instances of BootstrapArgument with different instances of its nested Kind enum, into a single method that takes advantage of a generic enum.

The second use case example of a possible application of enhanced enums that Goetz provides is command line parsing in which an enum is used to represent the data types of parameters. Vicente Romero replied to Goetz's message with two more examples of where enhanced enums might be applied: "code sharing between enum constants" and "the power of sharper typing".

Goetz encourages others to provide more use cases for enhanced enums, "Please contribute others, as well as places in the JDK where code could be refactored using this feature." He concludes, "If anyone wants to experiment and offer their experience in applying (or misapplying) this feature, either to the JDK or their own codebase, that would be appreciated...."

Tuesday, April 11, 2017

Java Finalizer and Java File Input/Output Streams

I often find myself noticing topics online more after I've worked directly with them or spent time learning about them. The recent Stephen Connolly (CloudBees) post FileInputStream / FileOutputStream Considered Harmful caught my attention because of my recent issues with Java's finalizer. In that post, the author talks about potential consequences of java.io.FileInputStream and java.io.FileOutputStream implementing overridden finalize() methods FileInputStream.finalize() and FileOutputStream.finalize(). With talk of deprecating the finalizer in JDK 9, my perspective is that a subject I had not thought about in years is all of a sudden all around me.

Connolly's post references the Hadoop JIRA HDFS-8562 ("HDFS Performance is impacted by FileInputStream Finalizer"). That JIRA was opened in June 2015 and its description includes interesting background on why the finalizer of FileInputStream causes issues for those using HDFS. This JIRA is also interesting because it looks at why it's not trivial to change FileInputStream and FileOutputStream to not use the protected finalize() methods.

JDK-8080225 ("FileInputStream cleanup should be improved.") is referenced in HDFS-8562 and was written in May 2015. It states, "FileInputStream relies on finalization to perform final closes if the FIS is not already closed. This results in extra work for GC that occurs in a burst. The cleanup of FileInputStreams should happen sooner and not contribute to overhead in GC." Alan Bateman has commented on this with a work-around, "The issue can be easily worked around by using Files.newInputStream instead." Roger Riggs writes of the difficulty of adequately addressing this issue, "Since it is unknown/unknowable how many FIS/FOS subclasses might rely on overriding close or finalize the compatibility issue is severe. Only a long term (multiple release) restriction to deprecate or invalidate overriding would have possibility of eventually eliminating the compatibility problem."

Connolly ends his post with reference to Jenkins changing this via JENKINS-42934 ("Avoid using new FileInputStream / new FileOutputStream"). An example of changing new FileInputStream to Files.newInputStream is available from there.
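In the same spirit, a minimal before-and-after sketch of that change might look like the following (the file name is hypothetical, and the "before" form is shown only in a comment):

import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;

public class StreamOpening
{
   public static void main(final String[] arguments) throws IOException
   {
      // Before: new FileInputStream("data.txt") registers a finalizer for
      // every instance that is opened.

      // After: Files.newInputStream returns a stream without a finalize()
      // method and is still closed deterministically by try-with-resources.
      try (InputStream in = Files.newInputStream(Paths.get("data.txt")))
      {
         System.out.println("First byte: " + in.read());
      }
   }
}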

The fact that I've been able to use Java for so many years without worrying about the finalizer even while I used classes such as FileInputStream is evidence that, by themselves, limited use of these classes with finalize() implementations doesn't necessarily lead to garbage collection or other problems. I like how Colin P. McCabe articulates the issue in the HDFS JIRA on this: "While it's true that we use FileInputStream / FileOutputStream in many places, most of those places have short-lived objects or only use very small numbers of objects. Like I mentioned earlier, the big problem with finalizers we encounter is in the short-circuit read stream cache. If we can fix that, as this patch attempts to do, we will have eliminated most of the problem." In other words, not all uses of FileInputStream and FileOutputStream are causes for concern. Using tools to identify unusually high garbage collection related to finalizers is the best way to identify those that need to be addressed.

For many years of Java development, I did not use or care about the Java finalizer. In recent months, it has become an issue that I'm seeing more people dealing with. Deprecating the Java finalizer is a good first step toward removing it from core APIs.

Monday, April 10, 2017

Java Garbage Collectors: When Will G1GC Force CMS Out?

In JEPs proposed to target JDK 9 (2017/4/4), Mark Reinhold has written that JEP 291 ("Deprecate the Concurrent Mark Sweep (CMS) Garbage Collector") is one of two JEPs that "have been placed into the 'Proposed to Target' state by their owners after discussion and review". If things go well for JEP 291, it will be targeted for JDK 9.

Reinhold explains in this message why JEP 291 can still be targeted to JDK 9 at this relatively late date: "JEP 291 requires only a minuscule code change, to enable the proposed warning message to be issued. It's a JEP in the first place not because it's a risky change but to bring visibility to the plan to remove the CMS collector in the long term." As these sentences state, the JDK 9 targeted action is simply to mark the Concurrent Mark Sweep (CMS) collector as deprecated with the idea that it will be removed at some point "in the long term."
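For context, the collector in question is the one an application selects today with an explicit JVM flag (the jar name below is hypothetical); under JEP 291 the first of these invocations is what would begin drawing a deprecation warning in JDK 9, while the second explicitly selects G1:

    java -XX:+UseConcMarkSweepGC -jar application.jar
    java -XX:+UseG1GC -jar application.jar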

Although G1GC is the default garbage collector for JDK 9 through JEP 248, it is not always the best collector for all situations. Even the proposal to deprecate CMS acknowledges this in its "Risks and Assumptions" which states, "For some applications CMS is a very good fit and might always outperform G1."

Another recent discussion on the OpenJDK jdk9-dev mailing list, titled "JEP 291: Deprecate the Concurrent Mark Sweep (CMS) Garbage Collector", contains interesting arguments for retaining CMS. Christoph Engelbert (Hazelcast) writes, "CMS+ParNew is the most commonly deployed solution and a lot of applications are optimized to the behavior of CMS." Scott Palmer writes that in his specific application, "we have found that so far the CMS collector has significantly lower maximum pause times than G1." Roman Kennke (RedHat) adds, "I'd say it's too early to talk about removing CMS. And, to be honest, I even question the move to deprecate it." Martijn Verburg (jClarity) states, "We are now constantly asked to tune G1 for customers and have found that even with our most advanced analytics (in combination with some of the common and more esoteric tuning options), we are unable to get G1 to outperform CMS for *certain* use cases. Several customers have therefore reverted to CMS and are very interested in its future (as consumers)."

This same discussion also includes reasons for deprecating CMS. Mark Reinhold's post states that JEP 291 was "posted last summer" and requests were made for a CMS maintainer, but "so far, no one has stepped up." He concludes that post, "In any case, Oracle does intend to stop maintaining CMS at some point in the not-to-distant future, and if no one ever steps up then we'll remove the code."

Jeremy Manson (Google) explains the trickiness of the current situation with G1GC and CMS:

We decided that supporting CMS in any sort of ongoing fashion should be a last resort after we try getting G1 to do what we need it to do. Our belief is that fewer collectors is better. We spent some time over the last few months coordinating with some of the folks at Oracle and experimenting to see if there were plausible ways forward with G1. We couldn't find anything obvious.

The gist of all this seems to be that many applications still depend on CMS and these applications will have a deprecation warning displayed in JDK 9. The future of the CMS garbage collector appears to be in doubt, but it would only be deprecated in JDK 9. When the CMS collector would actually be removed seems less obvious, but I assume that JDK 10 is a potential "future major release" in which CMS support could be terminated. Quoting Manson (Google) again, "The short of it is: We are still willing to contribute work to support CMS, but we want to make sure we've done our due diligence on G1 first. Our belief has been that the JDK 10 timeframe is long enough that we don't have to make this decision hurriedly."

It looks likely that Java applications using the Concurrent Mark Sweep garbage collector in JDK 9 will see warning messages about the deprecation of the CMS garbage collector. When (or if) CMS won't be available at all is less obvious and depends on who is willing to continue supporting CMS.

Saturday, April 1, 2017

How to Effectively Sweep Problems Under the Rug in Java

Because software bugs can make us appear bad as developers and lead to others thinking less of us, it is best to avoid writing bugs, to identify and fix bugs quickly, or to cover up our bugs. There are numerous blog posts and articles that discuss avoiding bugs and identifying and fixing bugs, so, in this blog post, I look at some of the most effective tactics for sweeping problems in Java code bases under the rug.

Swallow Checked Exceptions

Exceptions are one of the things that give us away when we've accidentally introduced bugs into the code. It's also a hassle to declare a throws clause on a method or catch a checked Exception. A solution to both problems is to simply catch the exception (even if it's a RuntimeException) at the point it might be thrown and do nothing. That keeps the API concise and there is little one could do about the checked exception anyway. By not logging it or doing anything about it, no one even needs to know it ever happened.
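In the spirit of this (satirical) advice, the offending pattern looks something like the following sketch (class and file names invented for the occasion):

import java.nio.file.Files;
import java.nio.file.Paths;

public class NothingToSeeHere
{
   public static void main(final String[] arguments)
   {
      try
      {
         // A hypothetical operation that might fail loudly.
         Files.delete(Paths.get("important.cfg"));
      }
      catch (final Exception swallowed)
      {
         // Say nothing, log nothing; no one ever needs to know.
      }
   }
}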

Comment Out or Remove Unhappy Unit Tests

Failed unit tests can be distracting and make it difficult to determine when new functionality has broken the tests. They can also reveal when we've broken something with code changes. Commenting out these failing unit tests will make the unit test report cleaner and make it easier to see if anyone else's new functionality breaks the unit tests.

Use @Ignore on JUnit-based Unit Tests

It may seem distasteful to comment out failing unit tests, so another and possibly more pleasing alternative is to annotate failing JUnit-based unit test methods with the @Ignore annotation.
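With JUnit 4, that looks something like the following (the test class and its assertion are invented):

import static org.junit.Assert.assertEquals;

import org.junit.Ignore;
import org.junit.Test;

public class CalculatorTest
{
   @Ignore("Temporarily disabled; the build looks better this way.")
   @Test
   public void sumOfTwoAndTwo()
   {
      assertEquals(4, 2 + 2);
   }
}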

Remove Individual Tests Altogether

If commenting out a broken test or annotating a broken test with @Ignore are unsatisfactory because someone might still detect we've broken a unit test, we can simply remove the offending unit test altogether.

Comment Out the Offending Assertion

We don't necessarily need to comment out or remove entire tests. It can be as simple as commenting out or removing the assert statements in a unit test method. The method can execute and run successfully every time because no assertions means no way to fail. This is especially effective when the unit test methods are very long and convoluted so that lack of assertions is not easily spotted.

Distract with the Noise of Useless and Redundant Tests

Commenting out unit tests, annotating JUnit-based unit tests with @Ignore, and even removing unit tests might be too obvious of ploys for sweeping issues under the rug in Java. To make these less obvious, another effective tactic is to write numerous unnecessary and redundant test methods in the same unit test class so that it appears that comprehensive testing is being done, but in reality only a small subset of functionality (the subset we know is working) is being tested.

Write Unit Tests That 'Prove' Your Code is Correct Even When It Is Not

We can take advantage of the fact that unit tests can only test what the author of the unit test thinks is the expected behavior of the software under test to write unit tests that demonstrate our code to be correct. If our code for adding two integers together accidentally returns a sum of 5 when 2 and 2 are provided, we can simply set the expected result in the unit test to also be 5. A pretty unit test report is presented and no one has to be the wiser.

Avoid Logging Details

Logs can expose one's software problems, and effective approaches to dealing with this risk include not logging at all, logging only routine operations and results, and leaving details (especially context) out of logged messages. Excessive logging of mundane details can also obscure any more meaningful messages that might reveal our code's weaknesses.

Avoid Descriptive toString() Implementations

A descriptive toString() method can reveal far too much about the state of any given instance and reveal our mistakes. Not overriding Object.toString() can make it more difficult to identify issues and to associate issues with any given code or developer. The extra time required to track down issues gives you more time to move on to the next project before it is discovered that it's your code that's at fault. If you write a Java class that extends a class with a descriptive toString(), you can override that method in your extended class to return something uninformative (effectively removing the potentially incriminating toString() output). If you want it to appear as if toString() was never implemented at all in the inheritance hierarchy, have your extended class's toString() method return getClass().getName() + "@" + Integer.toHexString(System.identityHashCode(this)), mimicking Object's default output.

Don't Let NullPointerExceptions Betray You

The NullPointerException is probably the most common exception a Java developer deals with. These are especially dangerous because they often reveal code's weak spots. One tactic is to wrap every line of code with a try-catch and simply swallow the exception (including the NPE). Another and less obvious tactic is to avoid NPEs by never returning or passing a null. Sometimes there are obvious defaults that make sense to use instead of null (such as empty Strings or collections), but sometimes we have to be more creative to avoid null. This is where it can be useful to use a "default" non-null value in place of null. There are two schools of thought on how to approach this arbitrary non-null default. One approach is to use the most commonly seen value in the set of data as the default because, if it's common anyway, it may not be noticed when a few more of that value show up and you are more likely to have code that appears to process that common value without incident. On the other hand, if you have a value that is almost never used, that can make a good default because there may be less code (especially well tested code) affected by it than by the commonly expected value.

Conclusion

As I reviewed these tactics to sweep issues in Java code under the rug, I noticed some recurring themes. Exceptions, logging, and unit tests are particularly troublesome in terms of exposing our software's weaknesses and therefore it's not surprising that most of the ways of effectively "covering our tracks" relate to handling of exceptions, logging, and unit tests.

Happy April Fools' Day!

Saturday, March 25, 2017

Using Groovy to Quickly Analyze Terracotta HealthCheck Properties

One of the considerations when configuring Terracotta servers with tc-config.xml is the specification of health check properties between Terracotta servers (L2-L2), from clients to server (L1-L2), and from server to client (L2-L1). Terracotta checks the combination of these properties' configurations in high-availability scenarios to ensure that these combinations fall in certain ranges. This blog post demonstrates using Groovy to parse and analyze a given tc-config.xml file to determine whether Terracotta will provide a WARN-level message regarding these properties' configurations.

The "About HealthChecker" section of the Terracotta 4.3.2 BigMemory Max High-Availability Guide (PDF) describes the purpose of the HealthChecker: "HealthChecker is a connection monitor similar to TCP keep-alive. HealthChecker functions between Terracotta server instances (in High Availability environments), and between Terracotta sever instances and clients. Using HealthChecker, Terracotta nodes can determine if peer nodes are reachable, up, or in a GC operation. If a peer node is unreachable or down, a Terracotta node using HealthChecker can take corrective action."

The Terracotta 4.3.2 BigMemory Max High-Availability Guide includes a table under the section HealthChecker Properties that articulates the Terracotta properties that go into the calculations used to determine if warnings about misconfigured high availability should be logged. There are similarly named properties specified for each of the combinations (l2.healthcheck.l1.* properties for server-to-clients [L2L1], l2.healthcheck.l2.* for server-to-server [L2L2], and l1.healthcheck.l2.* for clients-to-server [L1L2]) and the properties significant to the high availability configuration checks (the * portion of the properties names just referenced) are ping.enabled, ping.idletime, ping.interval, ping.probes, socketConnect, socketConnectCount, and socketConnectTimeout. This post's associated Groovy script assumes that one has the ping.enabled and socketConnect properties for L2-L2, L1-L2, and L2-L1 all configured to true (which is the default for both properties for all L2L2, L1L2, L2L1 combinations).

The Terracotta class com.tc.l2.ha.HASettingsChecker detects two combinations of these properties that lead to WARN-level log messages starting with the phrase, "High Availability Not Configured Properly: ...". The two warning messages specifically state, "High Availability Not Configured Properly: L1L2HealthCheck should be less than L2-L2HealthCheck + ElectionTime + ClientReconnectWindow" and "High Availability Not Configured Properly: L1L2HealthCheck should be more than L2-L2HealthCheck + ElectionTime".

The Terracotta class HASettingsChecker implements the formula outlined in the "Calculating HealthChecker Maximum" section of the High Availability Guide in its method interNodeHealthCheckTime(int,int,int,int,int):

    pingIdleTime + ((socketConnectCount) * (pingInterval * pingProbes + socketConnectTimeout * pingInterval))
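For illustration only (these are not Terracotta's default values), a combination with pingIdleTime of 5000 ms, socketConnectCount of 5, pingInterval of 1000 ms, pingProbes of 3, and socketConnectTimeout of 5 (which the formula multiplies by the ping interval) works out to

    5000 + 5 * ((1000 * 3) + (5 * 1000)) = 5000 + (5 * 8000) = 45000 milliseconds

and it is this kind of per-combination maximum that the warnings described above compare against the election time and the client reconnect window.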

The following Groovy script parses an indicated tc-config.xml file and applies the same health check properties check to the relevant properties defined in that file's <tc-properties> section. The Groovy script shown here has no external dependencies other than a valid tc-config.xml file to be parsed and analyzed. The script would be shorter and require less future maintenance if it accessed the String constants defined in com.tc.properties.TCPropertiesConsts instead of defining its own hard-coded versions of these.

checkTCServerProperties.groovy

#!/usr/bin/env groovy

def cli = new CliBuilder(
   usage: 'checkTCServerProperties -f <pathToTcConfigXmlFile> [-v] [-h]',
   header: '\nAvailable options (use -h for help):\n',
   footer: '\nParses referenced tc-config.xml file and analyzes its health check parameters..\n')
import org.apache.commons.cli.Option
cli.with
{
   h(longOpt: 'help', 'Usage Information', required: false)
   f(longOpt: 'file', 'Path to tc-config.xml File', args: 1, required: true)
   v(longOpt: 'verbose', 'Specifies verbose output', args: 0, required: false)
}
def opt = cli.parse(args)

if (!opt) return
if (opt.h) cli.usage()

String tcConfigFileName = opt.f
boolean verbose = opt.v

println "Checking ${tcConfigFileName}'s properties..."
def tcConfigXml = new XmlSlurper().parse(tcConfigFileName)
TreeMap<String, String> properties = new TreeMap<>()
tcConfigXml."tc-properties".property.each
{ tcProperty ->
   String tcPropertyName = tcProperty.@name
   String tcPropertyValue = tcProperty.@value
   properties.put(tcPropertyName, tcPropertyValue)
}
if (verbose)
{
   properties.each
   { propertyName, propertyValue ->
      println "${propertyName}: ${propertyValue}"
   }
}

boolean isL2L1PingEnabled = extractBoolean(properties, "l2.healthcheck.l1.ping.enabled")
boolean isL2L2PingEnabled = extractBoolean(properties, "l2.healthcheck.l2.ping.enabled")
boolean isL1L2PingEnabled = extractBoolean(properties, "l1.healthcheck.l2.ping.enabled")
boolean isPingEnabled = isL2L1PingEnabled && isL2L2PingEnabled && isL1L2PingEnabled
println "Health Check Ping ${isPingEnabled ? 'IS' : 'is NOT'} enabled."
if (!isPingEnabled)
{
   System.exit(-1)
}

Long pingIdleTimeL2L1 = extractLong(properties, "l2.healthcheck.l1.ping.idletime")
Long pingIdleTimeL2L2 = extractLong(properties, "l2.healthcheck.l2.ping.idletime")
Long pingIdleTimeL1L2 = extractLong(properties, "l1.healthcheck.l2.ping.idletime")

Long pingIntervalL2L1 = extractLong(properties, "l2.healthcheck.l1.ping.interval")
Long pingIntervalL2L2 = extractLong(properties, "l2.healthcheck.l2.ping.interval")
Long pingIntervalL1L2 = extractLong(properties, "l1.healthcheck.l2.ping.interval")

Long pingProbesL2L1 = extractLong(properties, "l2.healthcheck.l1.ping.probes")
Long pingProbesL2L2 = extractLong(properties, "l2.healthcheck.l2.ping.probes")
Long pingProbesL1L2 = extractLong(properties, "l1.healthcheck.l2.ping.probes")

boolean socketConnectL2L1 = extractBoolean(properties, "l2.healthcheck.l1.socketConnect")
boolean socketConnectL2L2 = extractBoolean(properties, "l2.healthcheck.l2.socketConnect")
boolean socketConnectL1L2 = extractBoolean(properties, "l1.healthcheck.l2.socketConnect")

if (!socketConnectL2L1 || !socketConnectL2L2 || !socketConnectL1L2)
{
   println "Socket connect is disabled."
   System.exit(-2)
}

Long socketConnectTimeoutL2L1 = extractLong(properties, "l2.healthcheck.l1.socketConnectTimeout")
Long socketConnectTimeoutL2L2 = extractLong(properties, "l2.healthcheck.l2.socketConnectTimeout")
Long socketConnectTimeoutL1L2 = extractLong(properties, "l1.healthcheck.l2.socketConnectTimeout")

Long socketConnectCountL2L1 = extractLong(properties, "l2.healthcheck.l1.socketConnectCount")
Long socketConnectCountL2L2 = extractLong(properties, "l2.healthcheck.l2.socketConnectCount")
Long socketConnectCountL1L2 = extractLong(properties, "l1.healthcheck.l2.socketConnectCount")

Long maximumL2L1 = calculateMaximumTime(pingIdleTimeL2L1, pingIntervalL2L1, pingProbesL2L1, socketConnectCountL2L1, socketConnectTimeoutL2L1)
Long maximumL2L2 = calculateMaximumTime(pingIdleTimeL2L2, pingIntervalL2L2, pingProbesL2L2, socketConnectCountL2L2, socketConnectTimeoutL2L2)
Long maximumL1L2 = calculateMaximumTime(pingIdleTimeL1L2, pingIntervalL1L2, pingProbesL1L2, socketConnectCountL1L2, socketConnectTimeoutL1L2)

if (verbose)
{
   println "L2-L1 Maximum Time: ${maximumL2L1}"
   println "L2-L2 Maximum Time: ${maximumL2L2}"
   println "L1-L2 Maximum Time: ${maximumL1L2}"
}

long electionTime = 5000
long clientReconnectWindow = 120000

long maximumL2L2Election = maximumL2L2 + electionTime
long maximumL2L2ElectionReconnect = maximumL2L2Election + clientReconnectWindow

if (verbose)
{
   println "L2-L2 Maximum Time + ElectionTime: ${maximumL2L2Election}"
   println "L2-L2 Maximum Time + ElectionTime + Client Reconnect Window: ${maximumL2L2ElectionReconnect}"   
}

if (maximumL1L2 < maximumL2L2Election)
{
   print "WARNING: Will lead to 'High Availability Not Configured Properly: L1L2HealthCheck should be more than L2-L2HealthCheck + ElectionTime' "
   println "because ${maximumL1L2} < ${maximumL2L2Election}."
}
else if (maximumL1L2 > maximumL2L2ElectionReconnect)
{
   print "WARNING: Will lead to 'High Availability Not Configured Properly: L1L2HealthCheck should be less than L2-L2HealthCheck + ElectionTime + ClientReconnectWindow' "
   println "because ${maximumL1L2} > ${maximumL2L2ElectionReconnect}."
}

/**
 * Extract a Boolean value for the provided property name from the provided
 * properties.
 *
 * @return Boolean value associated with the provided property name.
 */
boolean extractBoolean(TreeMap<String, String> properties, String propertyName)
{
   return  properties != null && properties.containsKey(propertyName)
         ? Boolean.valueOf(properties.get(propertyName))
         : false
}

/**
 * Extract a Long value for the provided property name from the provided
 * properties.
 *
 * @return Long value associated with the provided property name.
 */
Long extractLong(TreeMap<String, String> properties, String propertyName)
{
   return  properties != null && properties.containsKey(propertyName)
         ? Long.valueOf(properties.get(propertyName))
         : 0
}

/**
 * Provides the maximum time as calculated using the following formula:
 *
 * Maximum Time =
 *      (ping.idletime) + socketConnectCount *
 *      [(ping.interval * ping.probes) + (socketConnectTimeout * ping.interval)]
 */
Long calculateMaximumTime(Long pingIdleTime, Long pingInterval, Long pingProbes,
   Long socketConnectCount, Long socketConnectTimeout)
{
   return pingIdleTime + socketConnectCount * pingInterval * (pingProbes + socketConnectTimeout)
}

This script will also be available on GitHub. At some point, I may address some of its weaknesses and limitations in that GitHub version. Specifically, as shown above, this script currently assumes the default values for "election time" and "client reconnect window", but these could be parsed from the tc-config.xml file.

The following screen snapshots demonstrate this script in action against various tc-config.xml files. The first image depicts the script's behavior when ping is not enabled. The second image depicts the script's behavior when socket checking is not enabled. The third and fourth images depict the two warnings one might encounter when properties for high availability configuration are not configured properly. The fifth image depicts a fully successful execution of the script that indicates a configuration of health check properties that are in the expected ranges.

Ping Not Enabled (not default)

Socket Not Enabled (not default)

HealthCheck Properties Warning #1

HealthCheck Properties Warning #2

HealthCheck Properties Enabled and Configured Properly

I have used a simple spreadsheet to perform these calculations and that works fairly well. However, the Groovy script discussed in this post allows for automatic parsing of a candidate tc-config.xml file rather than needing to copy and paste values into the spreadsheet. The Groovy script could be adapted to use Terracotta provided Java files as discussed earlier. There are also several other enhancements that could make the script more useful such as parsing the client reconnect window and election time from the tc-config.xml file rather than assuming the default values.

Tuesday, March 21, 2017

Project Amber: Smaller, Productivity-Oriented Java Language Features

Brian Goetz's recent message Welcome to Amber! introduces Project Amber (part of OpenJDK and proposed originally in January). Goetz opens the message with the introduction, "Welcome to Project Amber, our incubation ground for selected productivity-oriented Java language JEPs." Goetz reiterates that Project Amber is not for discussing ideas for arbitrary potential new language features, but rather is for collecting new language features for which a JDK Enhancement Proposal (JEP) already exists ("let's keep the focus on the specific features that have been adopted").

Three JEPs are already associated with Project Amber: JEP 286 ("Local-Variable Type Inference"), JEP 301 ("Enhanced Enums"), and JEP 302 ("Lambda Leftovers"). Goetz also writes that "the 'data classes' and 'pattern matching' features, already discussed publicly are intended to be adopted by Amber when we're ready to propose JEPs on them."

Work on Project Amber will proceed on the Amber repository that is "based on the jdk10 repo."

I was enthusiastic about the announcement of Project Coin with JDK 7 and have really enjoyed using its features. I feel a similar excitement about Project Amber and look forward to using its features on a regular basis. Nicolai Parlog has written that Project Amber Will Revolutionize Java.

Tuesday, March 14, 2017

Deprecating Java's Finalizer

JDK-8165641 ("Deprecate Object.finalize") has been opened to "deprecate Object.finalize()" because "finalizers are inherently problematic and their use can lead to performance issues, deadlocks, hangs, and other problematic behavior" and because "the timing of finalization is unpredictable with no guarantee that a finalizer will be called." I recently experienced and wrote about some of these nasty consequences of using Object.finalize() in the post Java's Finalizer is Still There.

In the message RFR 9: 8165641 : Deprecate Object.finalize, Roger Riggs invites review and comment on the changes associated with this issue [150 new lines that include the addition of @Deprecated to java.lang.Object.finalize() and numerous additions of @SuppressWarnings("deprecation") annotations on current JDK classes' implementations of Object.finalize() methods].

The proposed addition of Javadoc @deprecated-associated text for the Object.finalize() method restates descriptive information included in JDK-8165641 and in Roger Riggs's message. This includes the recommendations to "implement java.lang.AutoCloseable if appropriate" for "classes whose instances hold non-heap resources" and to "provide a method to enable explicit release of those resources." The descriptive information also states, "The {@link java.lang.ref.Cleaner} and {@link java.lang.ref.PhantomReference} provide more flexible and efficient ways to release resources when an object becomes unreachable." See JDK-8138696 for more background on JDK 9-introduced java.lang.ref.Cleaner. The deprecation of Object.finalize() includes the enhanced @Deprecated annotation to state since when the method has been deprecated [@Deprecated(since="9")].
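To make those recommendations concrete, here is a minimal sketch assuming JDK 9's java.lang.ref.Cleaner and the standard java.lang.AutoCloseable interface (the class name, the resource, and the messages are hypothetical): a holder that supports explicit release and registers a Cleaner action as a backstop for the case where close() is never called.

import java.lang.ref.Cleaner;

public class NativeResourceHolder implements AutoCloseable
{
   private static final Cleaner CLEANER = Cleaner.create();

   /**
    * The cleanup action must not hold a reference to the holder itself
    * (that would keep it reachable), so the releasable state lives in a
    * separate static nested class.
    */
   private static final class Releaser implements Runnable
   {
      @Override
      public void run()
      {
         // Release the non-heap resource here (for example, free native memory).
         System.out.println("Non-heap resource released.");
      }
   }

   private final Cleaner.Cleanable cleanable = CLEANER.register(this, new Releaser());

   /** Preferred path: explicit release via try-with-resources or close(). */
   @Override
   public void close()
   {
      cleanable.clean();
   }

   public static void main(final String[] arguments)
   {
      try (NativeResourceHolder holder = new NativeResourceHolder())
      {
         System.out.println("Using the resource...");
      }
   }
}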

Although the proposed deprecation of Object.finalize() won't remove the ability to use the Java finalizer or reduce any of its potential negative consequences, it will at least provide an even more obvious warning about the risks of using that approach and, as currently documented, provides better potential alternatives to be considered.

Thursday, March 2, 2017

Java's Finalizer is Still There

When I was first learning Java and transitioning from C++ to Java, I remember being told repeatedly and frequently reading that one should not treat the Java finalizer like C++ destructors and should not count on it. The frequency and insistent nature of this advice had such effect on me that I cannot recall the last time I wrote a finalize() method and I cannot recall ever having written one in all the years I've written, read, reviewed, maintained, modified, and debugged Java code. Until recently, however, the effects of finalize() were not something I thought much about, probably because I have not used finalize(). A recent experience with finalize() has moved the effects of Java finalizers from an "academic exercise" to a real issue "in the wild."

The method-level Javadoc document comment for Object.finalize() provides some interesting details on the Java finalizer. It begins by providing an overall description of the method, "Called by the garbage collector on an object when garbage collection determines that there are no more references to the object. A subclass overrides the finalize method to dispose of system resources or to perform other cleanup." Another portion of this Javadoc comment warns of a couple issues commonly associated with use of Java finalizers: "The Java programming language does not guarantee which thread will invoke the finalize method for any given object. It is guaranteed, however, that the thread that invokes finalize will not be holding any user-visible synchronization locks when finalize is invoked. If an uncaught exception is thrown by the finalize method, the exception is ignored and finalization of that object terminates."

Josh Bloch devotes an item in Effective Java to the subject of Java finalizers. Item 7 of Effective Java's Second Edition is titled simply and concisely, "Avoid finalizers." Although many of the items in Effective Java use verbs such as "Prefer" or "Consider," this item uses the stronger verb "Avoid." Bloch does outline some examples where finalizers might be used, but his description of the inherent issues that remain and the many things to consider to mitigate those issues persuade most of us to avoid them as much as possible.

Bloch starts Effective Java item "Avoid Finalizers" with the emphasized (in bold) statement, "Finalizers are unpredictable, often dangerous, and generally unnecessary." Bloch emphasizes that developers should "never do anything time-critical in a finalizer" because "there is no guarantee [Java finalizers will] be executed promptly" and he emphasizes that developers should "never depend on a finalizer to update critical persistent state" because there is "no guarantee that [Java finalizers will] get executed at all." Bloch cites that exceptions in finalizers are not caught and warns of the danger of this because "uncaught exceptions can leave objects in a corrupt state."

The negative effect of Java finalizers that I had recent experience with is also described by Bloch. His "Avoid finalizers" item emphasizes (in bold), "there is a severe performance penalty for using finalizers" because it takes considerably longer "to create and destroy objects with finalizers." In our case, we were using a third-party library that internally used Java class finalize() methods to deallocate native memory (C/C++ through JNI). Because there was a very large number of objects of these classes with finalize() methods, it appears that the system thread that handles Java finalization was getting behind and was locking on objects it was finalizing.
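The problematic pattern looked roughly like the following hypothetical sketch (the class name is invented and the native call is only hinted at in a comment): every instance of a class that overrides finalize() is registered for processing by the JVM's finalization thread, and it was that thread that fell behind under our allocation rate.

public class NativeBuffer
{
   /** Address of memory allocated outside the Java heap (via JNI). */
   private long nativeAddress;

   public NativeBuffer(final long nativeAddress)
   {
      this.nativeAddress = nativeAddress;
   }

   @Override
   protected void finalize() throws Throwable
   {
      try
      {
         // The native free(nativeAddress) call would go here.
         nativeAddress = 0;
      }
      finally
      {
         super.finalize();
      }
   }
}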

Garbage collection was also impacted adversely with the collector kicking off more frequently than we'd normally see. We realized quickly that the garbage collection logs were indicating garbage collection issues that were not easily traceable to typical heap size issues or memory leaks of our own classes. Running the highly useful jcmd against the JVM process with jcmd <pid> GC.class_histogram helped us to see the underlying culprit quickly. That class histogram showed enough instances of java.lang.ref.Finalizer to warrant it being listed third from the top. Because that class is typically quite a bit further down the class histogram, I don't even typically see it or think about it. When we realized that three more of the top eight entries depicted in the class histogram were classes from the third-party library and that they implemented finalize() methods, we were able to explain the behavior and lay blame on the finalizers (four of the top eight classes in the histogram made it a pretty safe accusation).

The Java Language Specification provides several details related to Java finalizers in Section 12.6 ("Finalization of Class Instances"). The section begins by describing Java finalizers: "The particular definition of finalize() that can be invoked for an object is called the finalizer of that object. Before the storage for an object is reclaimed by the garbage collector, the Java Virtual Machine will invoke the finalizer of that object." Some of the intentionally indeterminate characteristics of Java finalizers described in this section of the Java Language Specification are quoted here (I have added any emphasis):

  • "The Java programming language does not specify how soon a finalizer will be invoked."
  • "The Java programming language does not specify which thread will invoke the finalizer for any given object."
  • "Finalizers may be called in any order, or even concurrently."
  • "If an uncaught exception is thrown during the finalization, the exception is ignored and finalization of that object terminates."

I found myself enjoying working with the team that resolved this issue because I was able to experience in "real life" what I had only read about and knew about in an "academic" sense. It is always satisfying to apply a favorite tool (such as jcmd) and to apply previous experiences (such as recognizing what looked out of place in the jcmd class histogram) to resolve a new issue.