Monday, April 29, 2019

A New Era for Determining Equivalence in Java?

Liam Miller-Cushon has published a document simply called "Equivalence" in which he proposes "to create a library solution to help produce readable, correct, and performant implementations of equals() and hashCode()." In this post, I summarize why I believe this proposal is worth reading for most Java developers even if it never gets implemented, and why its implementation would benefit all Java developers if realized.

Miller-Cushon opens his proposal with a single-sentence paragraph: "Correctly implementing equals() and hashCode() requires too much ceremony." The proposal points out that today's powerful Java IDEs do a nice job of generating these methods, but that there is still code to be read and maintained. The proposal also mentions that "over time these methods become a place for bugs to hide." More than once I have been on the wrong end of a particularly insidious bug caused by an error in one of these methods, and such bugs can be tricky to detect.

All three editions of "Effective Java" provide detailed explanation and examples for how to write effective implementations of these methods, but it's still easy to get them wrong. The JDK 7-introduced methods Objects.equals(Object, Object) and Objects.hash(Object...) have helped considerably (especially in terms of readability and dealing with nulls properly), but errors are still made in implementations of Object.equals(Object) and Object.hashCode().
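
For context, here is a minimal sketch (using a hypothetical Person class) of the hand-written ceremony that the proposal aims to eliminate, built on the JDK 7 Objects helper methods:

    import java.util.Objects;

    // Hypothetical example of hand-written equals() and hashCode()
    // using the Objects utility methods mentioned above.
    public final class Person {
        private final String firstName;
        private final String lastName;

        public Person(final String firstName, final String lastName) {
            this.firstName = firstName;
            this.lastName = lastName;
        }

        @Override
        public boolean equals(final Object other) {
            if (this == other) {
                return true;
            }
            if (!(other instanceof Person)) {
                return false;
            }
            final Person that = (Person) other;
            return Objects.equals(this.firstName, that.firstName)
                && Objects.equals(this.lastName, that.lastName);
        }

        @Override
        public int hashCode() {
            return Objects.hash(firstName, lastName);
        }
    }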

Even if this "Equivalence" proposal never comes to fruition, there is some value in reading Miller-Cushon's document. One obvious benefit of this document is its capturing of "Examples of bugs in equals and hashCode implementations." There are currently nine bullets in this section describing the "wide array of bugs in implementations of equals and hashCode methods" that were often identified only when "static analysis to prevent these issues" was performed. These examples serve as a good reminder of the things to be careful about when writing implementations of these methods and also remind us of the value of static analysis (note that Miller-Cushon is behind the static analysis tool error-prone).

Reading the "Equivalence" document can also be enlightening for those wanting to better understand the related issues one should think about when working with equivalence in Java. Through sets of questions in the "Requirements" and "Design Questions" sections, the document considers trade-offs and implementation choices that would need to be made. These cover topics such as how to handle nulls, instanceof versus getClass(), and the relationship to Comparator. Many of these considerations should probably be made today by Java developers implementing or maintaining their own implementations of equals(Object) and hashCode().
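
To make one of those trade-offs concrete, here is a small hedged sketch (with a hypothetical Point class) contrasting an instanceof-based equals() check with a getClass()-based one:

    // Hypothetical class illustrating the instanceof-versus-getClass() choice.
    // With an instanceof check, an instance of a subclass can compare equal
    // to a Point; with a getClass() check it never can.
    class Point {
        final int x;
        final int y;

        Point(final int x, final int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(final Object other) {
            // instanceof-based check: tolerant of subclasses
            if (!(other instanceof Point)) {
                return false;
            }
            final Point that = (Point) other;
            return this.x == that.x && this.y == that.y;

            // A stricter alternative is a getClass()-based check:
            //   if (other == null || getClass() != other.getClass()) return false;
            // which makes subclass instances never equal to a Point.
        }

        @Override
        public int hashCode() {
            return 31 * x + y;
        }
    }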

The "Related reading" section of the "Equivalence" document provides links to some interesting reading that includes the 2009 classic article "How to Write an Equality Method in Java" and Rémi Forax's ObjectSupport class (which delegates to ObjectSupports in some cases).

The "Equivalence" proposal was presented on the OpenJDK amber-spec-experts mailing list in a post title "A library for implementing equals and hashCode" and some of the feedback on that mailing list has led to updates to the document. One particularly interesting sentence for me in this discussion is Brian Goetz's statement, "That people routinely implement equals/hashCode explicitly is something we would like to put in the past." That seems like a welcome change!

Saturday, April 27, 2019

Two JEPs Proposed for JDK 13: Enhancing AppCDS and ZGC

Two JDK Enhancement Proposals (JEPs) were proposed for JDK 13 this week on the OpenJDK jdk-dev mailing list. Mark Reinhold posted these proposals in messages with titles that indicate the JEP topic: "JEP proposed to target JDK 13: 350: Dynamic CDS Archives" and "JEP proposed to target JDK 13: 351: ZGC: Uncommit Unused Memory".

The "Summary" of proposed JEP 350 ["Dynamic CDS Archives"] states, "Extend application class-data sharing to allow the dynamic archiving of classes at the end of Java application execution. The archived classes will include all loaded application classes and library classes that are not present in the default, base-layer CDS archive." JEP 310 introduced "Application Class-Data Sharing" (AKA "AppCDS") via JDK-8185996 and in conjunction with JDK 10.

JEP 351 ["ZGC: Uncommit Unused Memory"]'s "Summary" section states simply, "Enhance ZGC to return unused heap memory to the operating system." The "Motivation" section adds more background details, "ZGC does not currently uncommit and return memory to the operating system, even when that memory has been unused for a long time. This behavior is not optimal for all types of applications and environments, especially those where memory footprint is a concern." "ZGC" refers to the "Z Garbage Collector" and more details regarding it can be found on the OpenJDK ZGC page and on the ZGC Wiki page. The main project page states, "The goal of this project is to create a scalable low latency garbage collector capable of handling heaps ranging from a few gigabytes to multi terabytes in size, with GC pause times not exceeding 10ms."

Both proposed JEPs will be officially targeted for JDK 13 next week if no objections are raised or if any raised objections are "satisfactorily answered."

Monday, April 22, 2019

OpenJDK on GitHub

Project Skara was created "to ... investigate alternative SCM and code review options for the JDK source code, including options based upon Git rather than Mercurial, and including options hosted by third parties." The OpenJDK skara-dev mailing list included a post from Robin Westberg last week that announced, "We have added some additional read-only mirrors of a few different OpenJDK project repositories to the https://github.com/openjdk group..."

The read-only OpenJDK repositories on GitHub will likely be more convenient for developers wanting to take advantage of the "open source" nature of OpenJDK and peek at its internals. More developers are likely to be comfortable with Git than with Mercurial, and the GitHub-hosted repositories make it even easier to clone a given repository or even fork it.

As of this writing, there are nine public repositories hosted on the OpenJDK GitHub site.

This is not the first time the OpenJDK has been mirrored on GitHub. There are 11 repositories in "Mirror of OpenJDK repositories": jdk (2017), jdk7u-jdk (2012), jdk7u (2012), openjdk-mirror-meta (2015), corba (2015), jaxp (2015), jdk7u-langtools (2012), jdk7u-jaxws (2012), jdk7u-jaxp (2012), jdk7u-hotspot (2012), and jdk7u-corba (2012). There is also a Project-Skara/jdk that was last updated in August 2018.

Project Skara is not finished and active development of OpenJDK continues on the Mercurial-based version control system. However, the availability of important OpenJDK repositories on GitHub should make it more convenient for Java developers to analyze OpenJDK source code.

Monday, April 15, 2019

April 2019 Update on Java Records

After Project Valhalla's "Value Types/Objects", the language feature I am perhaps the most excited to see come to Java is Project Amber's "Data Classes" (AKA "Records"). I wrote the post "Updates on Records (Data Classes for Java)" about this time last year, and I use this post to provide an update on my understanding of where the "records" proposal stands now.

A good starting point for the current state of the "records" design work is Brian Goetz's February 2019 version of "Data Classes and Sealed Types for Java." In addition to providing background on the usefulness of "plain data carriers" being implemented with less overhead than with traditional Java classes and summarizing design decisions related to achieving that goal, this post also introduces noted Java developer personas Algebraic Annie, Boilerplate Billy, JavaBean Jerry, POJO Patty, Tuple Tommy, and Values Victor.

Here are some key observations that Goetz makes in the "Data Classes and Sealed Types for Java" document.

  • "Java asks all classes ... to pay equally for the cost of encapsulation -- but not all classes benefit equally from it."
  • Because "the cost of establishing and defending these boundaries ... is constant across classes, but the benefit is not, the cost may sometimes be out of line with the benefit."
  • "This is what Java developers mean by too much ceremony' -- not that the ceremony has no value, but that they're forced to invoke it even when it does not offer sufficient value."
  • "The encapsulation model that Java provides -- where the representation is entirely decoupled from construction, state access, and equality -- is just more than many classes need."
  • "... we prefer to start with a semantic goal: modeling data as data."
  • "The API for a data class models the state, the whole state, and nothing but the state. One consequence of this is that data classes are transparent; they give up their data freely to all requestors."
  • "We propose to surface data classes in the form of records; like an enum, a record is a restricted form of class. It declares its representation, and commits to an API that matches that representation. We pair this with another abstraction, sealed types, which can assert control over which other types may be its subclasses."
  • "Records use the same tactic as enums for aligning the boilerplate-to-information ratio: offer a constrained version of a more general feature that enables standard members to be derived. ... For records, we make a similar trade; we give up the flexibility to decouple the classes API from its state description, in return for getting a highly streamlined declaration (and more)."
  • Restrictions on currently proposed records include: "record fields cannot be mutable; no fields other than those in the state description are permitted; and records cannot extend other types or be extended."
  • "... an approach that is focused exclusively on boilerplate reduction for arbitrary code is guaranteed to merely create a new kind of boilerplate."
  • "...records are not intended to replace JavaBeans, or other mutable aggregates..."

One section of Goetz's post provides an overview of likely use cases for records. These use cases (each described in Goetz's post) include multiple return values (something Java developers seem to frequently use custom or library-provided tuples for), Data Transfer Objects (DTOs), compound map keys, messages, and value wrappers.
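
As a concrete illustration of the "multiple return values" use case, here is a hedged sketch using the record syntax from the proposal's examples (records were not part of any released JDK at the time of writing, and the names here are hypothetical):

    public class MinMaxDemo {
        // Hypothetical record per the proposed syntax: the state description
        // (min, max) drives the constructor, accessors, equals(), hashCode(),
        // and toString().
        record MinMax(int min, int max) { }

        // Returns both values at once without a tuple or a hand-written class.
        static MinMax minMax(final int[] values) {
            int min = Integer.MAX_VALUE;
            int max = Integer.MIN_VALUE;
            for (final int value : values) {
                min = Math.min(min, value);
                max = Math.max(max, value);
            }
            return new MinMax(min, max);
        }

        public static void main(final String[] args) {
            // Prints something like: MinMax[min=1, max=9]
            System.out.println(minMax(new int[] {3, 1, 4, 1, 5, 9, 2, 6}));
        }
    }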

Goetz specifically addresses the question related to the records proposal, "Why not 'just' do tuples?" Goetz answers his own question with multiple reasons for using the data class/record concept rather than simply adding tuples to Java. I'm generally not a fan of tuples because I think they reduce the readability of Java code and, especially if the values in the tuple have the same data type, can lead to subtle errors. Goetz articulates similar thinking, "Classes and class members have meaningful names; tuples and tuple components do not. A central aspect of Java's philosophy is that names matter; a Person with properties firstName and lastName is clearer and safer than a tuple of String and String." I prefer getFirstName() and getLastName() to getLeft() and getRight() or to getX() and getY(), so this resonates with me.

Another section of the Goetz document that I want to emphasize is the section headlined "Are records the same as value types?" This section compares Project Valhalla's value types to Project Amber's data classes. Goetz writes, "Value types are primarily about enabling flat and dense layout of objects in memory. In exchange for giving up object identity ..., the runtime gains the ability to optimize the heap layout and calling conventions for values. With records, in exchange for giving up the ability to decouple a classes API from its representation, we gain a number of notational and semantic benefits. ... some values may still benefit from state encapsulation, and some records may still benefit from identity, so they are not the exact same trade."

There has been more discussion on the amber-spec-experts mailing list this week regarding what to call "data classes." Naming is important and notoriously difficult in software development, and I appreciate this discussion because the arguments for the various names have helped me understand what "data classes" are currently envisioned to be and not to be. Here are some excerpts from this enlightening thread:

  • Rémi Forax likes "named tuples" because data classes are immutable and have some commonality with "nominal tuples."
  • Brian Goetz likes starting with "records are just nominal tuples" to "avoid picking that fight" between different groups of people with "two categories of preconceived notions" of what a tuple is.
  • Kevin Bourrillion adds, "Records have semantics, which makes them 'worlds' different from tuples. ... I think it's fair to say that all a record 'holds' is a 'tuple', but it's so much more. Record is to tuple as enum is to int."
  • Guy Steele adds, "Java `record` is to C `struct` as Java `enum` is to C `enum`."

I continue looking forward to getting "data classes" in Java at some point in the future and appreciate the effort being put into ensuring their successful adoption when added. When transitioning from C++ to Java, I missed the enum greatly, but the wait was worth it when Java introduced its own (more powerful and safer) enum. I hope for a similar feeling about Java data classes/records when we get to start using them.

Saturday, April 13, 2019

Viewing TLS Configuration with JDK 13

JDK 13 Early Access Build 16 is now available and one of the interesting additions it brings is the ability to have the keytool command-line tool display the current system's TLS configuration information. This is easier than trying to find supported TLS information in separate documentation and match that information to one's JDK vendor and version.

To see the TLS configuration details with JDK 13 Early Access Build 16, one simply needs to enter keytool -showinfo -tls on the command line, but I'll describe a few more things about this command in this post.

The next screen snapshot shows that the JDK I'm using for my examples is JDK 13 Early Access Build 16 and demonstrates that the keytool usage message now includes the -showinfo command.

Simply entering keytool without any commands or options results in the usage statement shown in the screen snapshot. The description for the -showinfo command is, "Displays security related information."

The next screen snapshot demonstrates the hint that is provided when one tries to use keytool -showinfo without an option ('Try "keytool -showinfo -tls".'). The image also shows the options associated with keytool's -showinfo command that are displayed when keytool -showinfo --help is entered.

The --help option used with the -showinfo command displays a -v option, but I found on my Windows installation that this -v option does not provide any additional value over simply using the -tls option. The next screen snapshot shows the results of attempting to use the -v option alone (without the -tls option):

When we try to use -v with the keytool -showinfo command (without -tls), we get an error message and a recommendation to try keytool -showinfo -tls instead. That does indeed work better, as shown in the next screen snapshot (which shows only partial results of what's returned).

The output from running keytool -showinfo -tls lists "Enabled Protocols" and "Enabled Cipher Suites." In this case, we see that the "enabled protocols" are TLSv1.3, TLSv1.2, TLSv1.1, and TLSv1.

I found it interesting to look at the code changes required to implement this new command and option for keytool. The implementation uses the JDK's javax.net.ssl.SSLContext class's getDefault() method to acquire the "default SSL context." The returned SSLContext instance's getSocketFactory() method is invoked and the createSocket() method is called on the returned instance of javax.net.ssl.SSLSocketFactory. The returned instance of javax.net.ssl.SSLSocket has two methods getEnabledProtocols() and getEnabledCipherSuites() that return the values shown above in the output from running keytool -showinfo -tls.
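
For illustration, here is a minimal sketch (not the keytool source itself) that uses the same JDK APIs described above to print roughly the same information:

    import javax.net.ssl.SSLContext;
    import javax.net.ssl.SSLSocket;
    import javax.net.ssl.SSLSocketFactory;

    // Approximates keytool -showinfo -tls by querying the default SSL context.
    public class ShowTlsInfo {
        public static void main(final String[] args) throws Exception {
            // Acquire the default SSL context and create an (unconnected) SSL socket.
            final SSLContext context = SSLContext.getDefault();
            final SSLSocketFactory factory = context.getSocketFactory();
            try (SSLSocket socket = (SSLSocket) factory.createSocket()) {
                System.out.println("Enabled Protocols:");
                for (final String protocol : socket.getEnabledProtocols()) {
                    System.out.println("    " + protocol);
                }
                System.out.println("Enabled Cipher Suites:");
                for (final String cipherSuite : socket.getEnabledCipherSuites()) {
                    System.out.println("    " + cipherSuite);
                }
            }
        }
    }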

The addition to JDK 13's keytool command-line tool of the -showinfo command with its -tls option is available as of Early Access Build 16 and was delivered via JDK-8219861. It's also worth noting that JDK-8204636 may eventually lead to improvements for JDK's TLS 1.3 support.

Thursday, April 4, 2019

New Valhalla Developments: Forwarders and Poxes

In the post "Updated VM-bridges document" on the valhalla-spec-experts OpenJDK mailing list, Brian Goetz provides, "an updated doc on forwarding-reversing bridges in the VM." This was a topic at the late March 2019 Burlington (Massachusetts) Valhalla Offsite. A nice summary of Burlington Valhalla Offsite is provided by Karen Kinnear, in which she records that Goetz discussed "migration" using "field bridges" and "method bridges" as part of L20 phase of the new Valhalla phasing approach. Goetz's post adds significant background details. In this post, I reference a few of the more interesting points made in these posts, but both posts provide significantly more details than I'm providing here and are worth reading for anyone interested in the direction of Project Valhalla.

Forwarders

Although Goetz spends the majority of his post discussing the "migration aspects" of "forwarding+reversing bridges" and of "forwarders," I found the historical background he provided in the introductory paragraphs to be a clean, concise summary of how we got to where we are now. Here are some brief statements from this historical section of the post that stood out to me (quotes here are very slightly modified for emphasis and to provide links):

  • "In the Java 1.0 days, `javac` was little more than an 'assembler' for the classfile format, translating source code to bytecode in a mostly 1:1 manner."
  • "Over time, we've seen small divergences between the language model and the classfile model, and each of these is a source of sharp edges."
  • "In Java 1.1 the addition of inner classes, and the mismatch between the accessibility model in the language and the JVM (the language treated a nest as a single entity; the JVM treat nest members as separate classes) required "access bridges" (`access$000` methods), which have been the source of various issues over the years."
  • "Twenty years later, these methods were obviated by Nest-based Access Control [jep181] -- which represents the choice to align the VM model to the language model"
  • "In Java 5, while we were able to keep the translation largely stable and transparent through the use of erasure..."
  • "... there was one point of misalignment; several situations ... could give rise to the situation where two or more method descriptors -- which the JVM treats as distinct methods -- are treated by the language as if they correspond to the same method."
  • "To fool the VM, the compiler emits "bridge methods" which forward invocations from one signature to another. And, as often happens when we try to fool the VM, it ultimately has its revenge."
  • "Java 5 introduced the ability to override a method but to provide a more specific return type. (Java 8 later extended this to bridges in interfaces as well.)"

After providing the historical background and summarizing the issues associated with these historical developments, Goetz moves on to provide significantly more details about "bridge methods." He discusses the "anatomy of a bridge method" (forwarding/invokevirtual) and points out that "bridges are brittle" because "separate compilation can move bridges from where already-compiled code expects them to be to places it does not expect them." I particularly like Goetz's concise summary of the fundamental brittleness problem with bridge methods:

The basic problem with bridge methods is that the language views the two method descriptors as two faces of the same actual method, whereas the JVM sees them as distinct methods. (And, reflection also has to participate in the charade.)
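
To make that concrete, here is a small hedged example (the class names are hypothetical) of a covariant override, one of the situations in which javac emits a bridge method:

    // Child.copy() overrides Parent.copy() with a more specific return type.
    // Because the JVM treats the two method descriptors as distinct methods,
    // javac emits a synthetic bridge method "Parent copy()" in Child that
    // simply forwards to "Child copy()".
    class Parent {
        Parent copy() {
            return new Parent();
        }
    }

    class Child extends Parent {
        @Override
        Child copy() {   // covariant return type triggers a bridge method
            return new Child();
        }
    }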

Goetz discusses in his post the "limits of bridges" and points out that their limits are mostly encountered when attempting to migrate in a binary compatible fashion:

The problem of migration arises both from language evolution (Valhalla aims to enable compatible migrating from value-based classes to value types, and from erased generics to specialized), as well as from the ordinary evolution of libraries.

After providing some specific examples of where bridge methods fail migration attempts for fields and overridden methods, Goetz introduces "forwarders":

In this document, we attempt to learn from the history of bridges, and create a new mechanism -- "forwarders" -- that work with the JVM instead of against it. This raises the level of expressivity of classfiles and opens the possibility of greater laziness. It is possible that traditional bridging scenarios can eventually be handled by forwarders too...

Goetz devotes the remainder of the post to specifics of forwarders such as their invocation, using forwarders with fields, overriding forwarders, adaptation of forwarders, and type checking and corner cases associated with forwarders.

Poxes: Value Type Wrappers for Primitives

There was more discussed in the Valhalla Offsite than forwarding. One of Kinnear's notes that stood out to me stated, "Poxes: value type wrappers for primitives." Since the introduction of "autoboxing and unboxing" in JDK 1.5, Java developers have become comfortable with the conceptual relationship of "boxed"/"wrapper" reference types to their primitive counterparts. Therefore, it makes some sense to me to use similar terminology and call value type "wrappers" of primitives "poxes." Goetz describes "poxing" in the post "Finding the spirit of L-World," where he refers to "creating a value box for primitives" (a "pox") and discusses the possibility of "adjust[ing] the compiler's boxing behavior (when boxing to `Object` or an interface)" to "prefer the pox to the box."

Updating Valhalla Value Types Phases

In her post "Updated phasing of Valhalla Value Types," Kinnear outlines Valhalla value types "Phases/AIs for Proposals" discussed at the Burlington Valhalla offsite meeting mentioned earlier. She calls out one major timing proposal in its own paragraph (I added the emphasis):

One important phasing is moving null-default value types and migrating value-based-classes to null-default value types was moved out to L20, which is after the initial preview.

Kinnear provides a list of "phases/AIs" categorized under one of three general milestone allocations: L10, L20, and L100. The two specific topics covered in my post (forwarding and poxes) appear to be currently proposed for L20.

Conclusion

Valhalla's value types are still a ways out before we'll see them in a General Availability JDK, but I appreciate the significant effort being invested in these value types and look forward to benefiting from them at some point in the future.