Tuesday, May 14, 2019

Java Text Blocks

In the 13 May 2019 post "RFR: Multi-line String Literal (Preview) JEP [EG Draft]" on the OpenJDK amber-spec-experts mailing list, Jim Laskey announced a draft feature JEP named "Text Blocks (Preview)" (JDK-8222530).

Laskey's post opens with (I've added the links), "After some significant tweaks, reopening the JEP for review" and he is referring to the draft JEP that was started after the closing/withdrawing of JEP 326 ["Raw String Literals (Preview)"] (JDK-8196004). Laskey explains the most recent change to the draft JEP, "The most significant change is the renaming to Text Blocks (I'm sure it will devolve over time Text Literals or just Texts.) This is primarily to reflect the two-dimensionality of the new literal, whereas String literals are one-dimensional." This post-"raw string literals" draft JEP previously referred to "multi-line string literals" and now refers to "text blocks."

The draft JEP "Text Blocks (Preview)" provides a detailed overview of the proposed preview feature. Its "Summary" section states:

Add text blocks to the Java language. A text block is a multi-line string literal that avoids the need for most escape sequences, automatically formats the string in predictable ways, and gives the developer control over format when desired. This will be a preview language feature.

This is a follow-on effort to explorations begun in JEP 326, Raw String Literals (Preview).

The draft JEP currently lists three "Goals" of the JEP and I've reproduced the first two here:

  1. "Simplify the task of writing Java programs by making it easy to express strings that span several lines of source code, while avoiding escape sequences in common cases."
  2. "Enhance the readability of strings in Java programs that denote code written in non-Java languages."

The "Non-Goals" of this draft JEP are also interesting and the two current non-goals are reproduced here:

  1. "It is not a goal to define a new reference type (distinct from java.lang.String) for the strings expressed by any new construct."
  2. "It is not a goal to define new operators (distinct from +) that take String operands."

The current "Description" of the draft JEP states:

A text block is a new kind of literal in the Java language. It may be used to denote a string anywhere that a string literal may be used, but offers greater expressiveness and less accidental complexity.

A text block consists of zero or more content characters, enclosed by opening and closing delimiters.

The draft JEP describes use of "fat delimiters" ("three double quote characters": """) in the opening delimiter and closing delimiter that mark the beginning and ending of a "text block." As currently proposed, the content of the text block actually begins on the line following the line terminator of the line with the opening delimiter (which might include spaces). The content of the text block ends with the final character before the closing delimiter.

The draft JEP describes "text block" treatment of some special characters. It states, 'The content may include " characters directly, unlike the characters in a string literal.' It also states that \" and \n are "permitted, but not necessary or recommended" in a text block. There is a section of this draft JEP that shows examples of "ill-formed text blocks."
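To make the delimiter and quoting rules more concrete, here is a minimal sketch comparing a traditional string literal with a text block. It assumes the three-double-quote syntax as it was eventually finalized in later JDK releases, so it needs a JDK where text blocks are available; the class and method names are made up for illustration.

```java
class TextBlockBasics {

    // Traditional string literal: explicit \n and escaped double quotes.
    static String traditional() {
        return "<html>\n"
             + "    <p class=\"greeting\">Hello</p>\n"
             + "</html>\n";
    }

    // Text block: the content starts on the line after the opening delimiter,
    // and double quotes can appear in the content without escaping.
    static String textBlock() {
        return """
            <html>
                <p class="greeting">Hello</p>
            </html>
            """;
    }

    public static void main(String[] args) {
        // Both forms are intended to denote the same java.lang.String value.
        System.out.println(traditional().equals(textBlock()));
    }
}
```

Because a text block is just another way to write a String, it can be passed anywhere a string literal can.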

There are numerous implementation details covered in the draft JEP. These include "compile-time processing" of line terminators ("normalized" to "LF (\u000A)"), incidental white space (differentiation of "incidental white space from essential white space" and use of String::indent for custom indentation management), and escape sequences ("any escape sequences in the content are interpreted" per Java Language Specification and use of String::translateEscapes for custom escape processing).
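The compile-time processing described above can be sketched as follows. This assumes a JDK in which text blocks, String::indent, and String::translateEscapes are all available (the feature was only a draft when this post was written), and the class and method names are hypothetical.

```java
class TextBlockProcessing {

    // The 12 spaces shared by every line (including the closing delimiter)
    // are incidental and are stripped; the extra 2 spaces before "from"
    // are essential and are kept.
    static String stripped() {
        return """
            select *
              from users
            """;
    }

    // String::indent adds the requested number of spaces to every line.
    static String reindented() {
        return stripped().indent(4);
    }

    // Escape sequences in the content are interpreted, as in a string literal.
    static String withEscape() {
        return """
            a\tb
            """;
    }

    // String::translateEscapes interprets escapes in an ordinary String at run time.
    static String translated() {
        return "a\\tb".translateEscapes();
    }

    public static void main(String[] args) {
        System.out.print(stripped());
        System.out.print(reindented());
    }
}
```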

Newly named "Java Text Blocks" look well-suited for the stated goals and the current proposal is the result of significant engineering effort. The draft JEP is approachable and worth reading for many details I did not cover here. Because this is still a draft JEP, it has not been proposed as a candidate JEP yet and has not been targeted to any specific Java release.

Monday, April 29, 2019

A New Era for Determining Equivalence in Java?

Liam Miller-Cushon has published a document simply called "Equivalence" in which he proposes "to create a library solution to help produce readable, correct, and performant implementations of equals() and hashCode()." In this post, I summarize some reasons why I believe this proposal is worth reading for most Java developers even if the proposal never gets implemented and why the proposal's implementation would benefit all Java developers if realized.

Miller-Cushon opens his proposal with a single-sentence paragraph: "Correctly implementing equals() and hashCode() requires too much ceremony." The proposal points out that today's powerful Java IDEs do a nice job of generating these methods, but that there is still code to be read and maintained. The proposal also mentions that "over time these methods become a place for bugs to hide." I have been on the wrong end more than once of particularly insidious bugs caused by an error in one of these methods and these can be tricky to detect.

All three editions of "Effective Java" provide detailed explanation and examples for how to write effective implementations of these methods, but it's still easy to get them wrong. The JDK 7 (Project Coin)-introduced methods Objects.equals(Object, Object) and Objects.hash(Object...) have helped considerably (especially in terms of readability and dealing with nulls properly), but there are still errors made in implementations of Object.equals(Object) and Object.hashCode().
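As a reminder of what the Objects helper methods buy us, here is a minimal sketch of a hand-written equals/hashCode pair. The Employee class and its fields are hypothetical and are not taken from the "Equivalence" document.

```java
import java.util.Objects;

final class Employee {
    private final String name;      // may be null
    private final int badgeNumber;

    Employee(String name, int badgeNumber) {
        this.name = name;
        this.badgeNumber = badgeNumber;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof Employee)) {
            return false;
        }
        Employee that = (Employee) other;
        // Objects.equals is null-safe, so no explicit null checks are needed.
        return badgeNumber == that.badgeNumber && Objects.equals(name, that.name);
    }

    @Override
    public int hashCode() {
        // Must use the same fields as equals so equal objects share a hash code.
        return Objects.hash(name, badgeNumber);
    }
}
```

Even in this small example there are several places to slip, such as adding a field to equals but forgetting it in hashCode, which is the kind of "ceremony" the proposal wants to remove.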

Even if this "Equivalence" proposal never comes to fruition, there is some value in reading Miller-Cushon's document. One obvious benefit of this document is its capturing of "Examples of bugs in equals and hashCode implementations." There are currently nine bullets in this section describing the "wide array of bugs in implementations of equals and hashCode methods" that were often identified only when "static analysis to prevent these issues" was performed. These examples serve as a good reminder of the things to be careful about when writing implementations of these methods and also remind us of the value of static analysis (note that Miller-Cushon is behind the static analysis tool error-prone).

Reading of the "Equivalence" document can also be enlightening for those wanting to better understand the related issues one should think about when developing the equivalence concept in Java. Through sets of questions in the "Requirements" and "Design Questions" sections, the document considers trade-offs and implementation choices that would need to be made. These cover topics such as how to handle nulls, instanceof versus getClass(), and the relationship to Comparator. Many of these considerations should probably be made today by Java developers implementing or maintaining their own implementations of equals(Object) and hashCode().
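The instanceof versus getClass() trade-off is easier to appreciate with a concrete case. The following sketch uses the well-known point/colored-point scenario (the classes are illustrative and do not come from the "Equivalence" document) to show how instanceof-based equals in a class hierarchy can break symmetry.

```java
import java.util.Objects;

class Point {
    final int x;
    final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        // instanceof accepts subclasses, including ColoredPoint.
        if (!(o instanceof Point)) {
            return false;
        }
        Point p = (Point) o;
        return x == p.x && y == p.y;
    }

    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }
}

class ColoredPoint extends Point {
    final String color;

    ColoredPoint(int x, int y, String color) {
        super(x, y);
        this.color = color;
    }

    @Override
    public boolean equals(Object o) {
        // Only another ColoredPoint can be equal, so a plain Point never is.
        if (!(o instanceof ColoredPoint)) {
            return false;
        }
        ColoredPoint c = (ColoredPoint) o;
        return super.equals(c) && color.equals(c.color);
    }
}

class SymmetryDemo {
    public static void main(String[] args) {
        Point p = new Point(1, 2);
        ColoredPoint cp = new ColoredPoint(1, 2, "red");
        System.out.println(p.equals(cp));  // true
        System.out.println(cp.equals(p));  // false: symmetry is violated
    }
}
```

Comparing getClass() values instead would restore symmetry, at the cost of never treating a subclass instance as equal to a superclass instance, which is exactly the kind of design question the document raises.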

The "Related reading" section of the "Equivalence" document provides links to some interesting reading that includes the 2009 classic article "How to Write an Equality Method in Java" and Rémi Forax's ObjectSupport class (which delegates to ObjectSupports in some cases).

The "Equivalence" proposal was presented on the OpenJDK amber-spec-experts mailing list in a post titled "A library for implementing equals and hashCode" and some of the feedback on that mailing list has led to updates to the document. One particularly interesting sentence for me in this discussion is Brian Goetz's statement, "That people routinely implement equals/hashCode explicitly is something we would like to put in the past." That seems like a welcome change!

Saturday, April 27, 2019

Two JEPs Proposed for JDK 13: Enhancing AppCDS and ZGC

Two JDK Enhancement Proposals (JEPs) were proposed for JDK 13 this week on the OpenJDK jdk-dev mailing list. Mark Reinhold posted these proposals in messages with titles that indicate the JEP topic: "JEP proposed to target JDK 13: 350: Dynamic CDS Archives" and "JEP proposed to target JDK 13: 351: ZGC: Uncommit Unused Memory".

The "Summary" of proposed JEP 350 ["Dynamic CDS Archives"] states, "Extend application class-data sharing to allow the dynamic archiving of classes at the end of Java application execution. The archived classes will include all loaded application classes and library classes that are not present in the default, base-layer CDS archive." JEP 310 introduced "Application Class-Data Sharing" (AKA "AppCDS") via JDK-8185996 and in conjunction with JDK 10.

JEP 351 ["ZGC: Uncommit Unused Memory"]'s "Summary" section states simply, "Enhance ZGC to return unused heap memory to the operating system." The "Motivation" section adds more background details, "ZGC does not currently uncommit and return memory to the operating system, even when that memory has been unused for a long time. This behavior is not optimal for all types of applications and environments, especially those where memory footprint is a concern." "ZGC" refers to the "Z Garbage Collector" and more details regarding it can be found on the OpenJDK ZGC page and on the ZGC Wiki page. The main project page states, "The goal of this project is to create a scalable low latency garbage collector capable of handling heaps ranging from a few gigabytes to multi terabytes in size, with GC pause times not exceeding 10ms."

Both proposed JEPs will be officially targeted for JDK 13 next week if no objections are raised or if any raised objections are "satisfactorily answered."

Monday, April 22, 2019

OpenJDK on GitHub

Project Skara was created "to ... investigate alternative SCM and code review options for the JDK source code, including options based upon Git rather than Mercurial, and including options hosted by third parties." The OpenJDK skara-dev mailing list included a post from Robin Westberg last week that announced, "We have added some additional read-only mirrors of a few different OpenJDK project repositories to the https://github.com/openjdk group..."

The read-only OpenJDK repositories on GitHub will likely be more convenient for developers wanting to take advantage of the "open source" nature of OpenJDK to take a peek at its internals. More developers are likely to be comfortable with Git than with Mercurial. The GitHub-hosted repositories make it even easier to clone a given repository or even to fork it.

As of this writing, there are nine public repositories hosted on the OpenJDK GitHub site.

This is not the first time the OpenJDK has been mirrored on GitHub. There are 11 repositories in "Mirror of OpenJDK repositories": jdk (2017), jdk7u-jdk (2012), jdk7u (2012), openjdk-mirror-meta (2015), corba (2015), jaxp (2015), jdk7u-langtools (2012), jdk7u-jaxws (2012), jdk7u-jaxp (2012), jdk7u-hotspot (2012), and jdk7u-corba (2012). There is also a Project-Skara/jdk that was last updated in August 2018.

Project Skara is not finished and active development of OpenJDK continues on the Mercurial-based version control system. However, the availability of important OpenJDK repositories on GitHub should make it more convenient for Java developers to analyze OpenJDK source code.

Monday, April 15, 2019

April 2019 Update on Java Records

After Project Valhalla's "Value Types/Objects", the language feature I am perhaps the most excited to see come to Java is Project Amber's "Data Classes" (AKA "Records"). I wrote the post "Updates on Records (Data Classes for Java)" about this time last year and use this post to provide an update on my understanding of where the "records" proposal is now.

A good starting point for the current state of the "records" design work is Brian Goetz's February 2019 version of "Data Classes and Sealed Types for Java." In addition to providing background on the usefulness of "plain data carriers" being implemented with less overhead than with traditional Java classes and summarizing design decisions related to achieving that goal, this post also introduces noted Java developer personas Algebraic Annie, Boilerplate Billy, JavaBean Jerry, POJO Patty, Tuple Tommy, and Values Victor.

Here are some key observations that Goetz makes in the "Data Classes and Sealed Types for Java" document.

  • "Java asks all classes ... to pay equally for the cost of encapsulation -- but not all classes benefit equally from it."
  • Because "the cost of establishing and defending these boundaries ... is constant across classes, but the benefit is not, the cost may sometimes be out of line with the benefit."
  • "This is what Java developers mean by 'too much ceremony' -- not that the ceremony has no value, but that they're forced to invoke it even when it does not offer sufficient value."
  • "The encapsulation model that Java provides -- where the representation is entirely decoupled from construction, state access, and equality -- is just more than many classes need."
  • "... we prefer to start with a semantic goal: modeling data as data."
  • "The API for a data class models the state, the whole state, and nothing but the state. One consequence of this is that data classes are transparent; they give up their data freely to all requestors."
  • "We propose to surface data classes in the form of records; like an enum, a record is a restricted form of class. It declares its representation, and commits to an API that matches that representation. We pair this with another abstraction, sealed types, which can assert control over which other types may be its subclasses."
  • "Records use the same tactic as enums for aligning the boilerplate-to-information ratio: offer a constrained version of a more general feature that enables standard members to be derived. ... For records, we make a similar trade; we give up the flexibility to decouple the classes API from its state description, in return for getting a highly streamlined declaration (and more)."
  • Restrictions on currently proposed records include: "record fields cannot be mutable; no fields other than those in the state description are permitted; and records cannot extend other types or be extended."
  • "... an approach that is focused exclusively on boilerplate reduction for arbitrary code is guaranteed to merely create a new kind of boilerplate."
  • "...records are not intended to replace JavaBeans, or other mutable aggregates..."
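To give a feel for the "streamlined declaration" Goetz describes, here is a minimal sketch that assumes the record syntax as it eventually shipped in later JDK releases (the design was still being discussed when this post was written). The Range record is a made-up example.

```java
// The state description (low, high) is the whole declaration; the
// constructor, accessors, equals, hashCode, and toString are derived from it.
record Range(int low, int high) {

    // Compact constructor: validates the state without repeating assignments.
    Range {
        if (low > high) {
            throw new IllegalArgumentException("low must not exceed high");
        }
    }

    // Additional behavior can still be added on top of the derived members.
    int length() {
        return high - low;
    }
}

class RecordDemo {
    public static void main(String[] args) {
        Range a = new Range(1, 5);
        Range b = new Range(1, 5);
        System.out.println(a.equals(b));   // state-based equality
        System.out.println(a);             // Range[low=1, high=5]
        System.out.println(a.length());    // 4
    }
}
```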

One section of Goetz's post provides an overview of likely use cases for records. These use cases (which include descriptions in the Goetz post) include multiple return values (something that Java developers seem to frequently use custom or library-provided tuples for), Data Transfer Objects (DTOs), compound map keys, messages, and value wrappers.

Goetz specifically addresses the question related to the records proposal, "Why not 'just' do tuples?" Goetz answers his own question with multiple reasons for using the data class/record concept rather than simply adding tuples to Java. I'm generally not a fan of tuples because I think they reduce the readability of Java code and, especially if the values in the tuple have the same data type, can lead to subtle errors. Goetz articulates similar thinking, "Classes and class members have meaningful names; tuples and tuple components do not. A central aspect of Java's philosophy is that names matter; a Person with properties firstName and lastName is clearer and safer than a tuple of String and String." I prefer getFirstName() and getLastName() to getLeft() and getRight() or to getX() and getY(), so this resonates with me.
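The "names matter" argument can be illustrated with a small sketch. It again assumes the record syntax that eventually shipped in later JDK releases; Pair is a hypothetical stand-in for a tuple and is not part of the JDK.

```java
// Hypothetical tuple stand-in: the component names say nothing about meaning.
record Pair<A, B>(A left, B right) {}

// Nominal alternative: the type and its components carry meaning.
record Person(String firstName, String lastName) {}

class NamesMatter {

    // Nothing tells the reader (or the compiler) which String is which.
    static String greetTuple(Pair<String, String> name) {
        return "Hello, " + name.left() + " " + name.right();
    }

    static String greetPerson(Person person) {
        return "Hello, " + person.firstName() + " " + person.lastName();
    }

    public static void main(String[] args) {
        // Swapping the two Strings compiles in both cases, but only the
        // Person version makes the intent visible at the use site.
        System.out.println(greetTuple(new Pair<>("Lovelace", "Ada")));
        System.out.println(greetPerson(new Person("Ada", "Lovelace")));
    }
}
```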

Another section of the Goetz document that I want to emphasize is the section headlined "Are records the same as value types?" This section compares Project Valhalla's value types to Project Amber's data classes. Goetz writes, "Value types are primarily about enabling flat and dense layout of objects in memory. In exchange for giving up object identity ..., the runtime gains the ability to optimize the heap layout and calling conventions for values. With records, in exchange for giving up the ability to decouple a classes API from its representation, we gain a number of notational and semantic benefits. ... some values may still benefit from state encapsulation, and some records may still benefit from identity, so they are not the exact same trade."

There has been more discussion on the amber-spec-experts mailing list this week regarding what to call "data classes." Naming is important and notoriously difficult in software development and I appreciate this discussion because the arguments for various names have helped me to understand what the current thinking is about what "data classes" are currently envisioned to be and not to be. Here are some excerpts from this enlightening thread:

  • Rémi Forax likes "named tuples" because data classes are immutable and have some commonality with "nominal tuples."
  • Brian Goetz likes starting with "records are just nominal tuples" to "avoid picking that fight" between different groups of people with "two categories of preconceived notions" of what a tuple is.
  • Kevin Bourrillion adds, "Records have semantics, which makes them 'worlds' different from tuples. ... I think it's fair to say that all a record 'holds' is a 'tuple', but it's so much more. Record is to tuple as enum is to int."
  • Guy Steele adds, "Java `record` is to C `struct` as Java `enum` is to C `enum`."

I continue looking forward to getting "data classes" in Java at some point in the future and appreciate the effort being put into ensuring their successful adoption when added. When transitioning from C++ to Java, I missed the enum greatly, but the wait was worth it when Java introduced its own (more powerful and safer) enum. I hope for a similar feeling about Java data classes/records when we get to start using them.

Saturday, April 13, 2019

Viewing TLS Configuration with JDK 13

JDK 13 Early Access Build 16 is now available and one of the interesting additions it brings is the ability to have the keytool command-line tool display the current system's TLS configuration information. This is easier than trying to find supported TLS information in separate documentation and match that information to one's JDK vendor and version.

To see the TLS configuration details with JDK 13 Early Access Build 16, one simply needs to enter keytool -showinfo -tls on the command line, but I'll describe a few more things about this command in this post.

The next screen snapshot shows that the JDK I'm using for my examples is the JDK 13 Early Access Build 16 and demonstrates that the keytool usage now shows the tool including the -showinfo command.

Simply entering keytool without any commands or options results in the usage statement shown in the screen snapshot. The description for the -showinfo command is, "Displays security related information."

The next screen snapshot demonstrates the hint that is provided when one tries to use keytool -showinfo without an option ('Try "keytool -showinfo -tls".'). The image also shows the options associated with the keytool command -showinfo that are displayed when keytool -showinfo --help is entered.

The --help option used with the -showinfo command displays a -v option, but I found on my Windows installation that this -v option does not provide any additional value over simply using the -tls option. The next screen snapshot shows the results of attempting to use the -v option alone (without the -tls option):

When trying to use -v along with the keytool command -showinfo, we get an error message and a recommendation to try keytool -showinfo -tls instead. That does indeed work better as shown in the next screen snapshot that only shows partial results of what's returned.

The output from running keytool -showinfo -tls lists "Enabled Protocols" and "Enabled Cipher Suites." In this case, we see that the "enabled protocols" are TLSv1.3, TLSv1.2, TLSv1.1, and TLSv1.

I found it interesting to look at the code changes required to implement this new command and option for keytool. The implementation uses the JDK's javax.net.ssl.SSLContext class's getDefault() method to acquire the "default SSL context." The returned SSLContext instance's getSocketFactory() method is invoked and the createSocket() method is called on the returned instance of javax.net.ssl.SSLSocketFactory. The returned instance of javax.net.ssl.SSLSocket has two methods getEnabledProtocols() and getEnabledCipherSuites() that return the values shown above in the output from running keytool -showinfo -tls.
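The same sequence of calls can be tried outside of keytool. The following is a minimal sketch of that approach and not the actual keytool source; the class and method names are mine, and the output depends on the JDK and its security configuration.

```java
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;

class TlsInfo {

    // Creates an unconnected socket from the default SSL context so its
    // enabled settings can be inspected without any network traffic.
    private static SSLSocket defaultSocket() throws Exception {
        SSLSocketFactory factory = SSLContext.getDefault().getSocketFactory();
        return (SSLSocket) factory.createSocket();
    }

    static String[] enabledProtocols() throws Exception {
        try (SSLSocket socket = defaultSocket()) {
            return socket.getEnabledProtocols();
        }
    }

    static String[] enabledCipherSuites() throws Exception {
        try (SSLSocket socket = defaultSocket()) {
            return socket.getEnabledCipherSuites();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Enabled Protocols");
        for (String protocol : enabledProtocols()) {
            System.out.println("  " + protocol);
        }
        System.out.println("Enabled Cipher Suites");
        for (String suite : enabledCipherSuites()) {
            System.out.println("  " + suite);
        }
    }
}
```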

The addition to JDK 13's keytool command-line tool of the -showinfo command with its -tls option is available as of Early Access Build 16 and was delivered via JDK-8219861. It's also worth noting that JDK-8204636 may eventually lead to improvements for JDK's TLS 1.3 support.

Thursday, April 4, 2019

New Valhalla Developments: Forwarders and Poxes

In the post "Updated VM-bridges document" on the valhalla-spec-experts OpenJDK mailing list, Brian Goetz provides, "an updated doc on forwarding-reversing bridges in the VM." This was a topic at the late March 2019 Burlington (Massachusetts) Valhalla Offsite. A nice summary of the Burlington Valhalla Offsite is provided by Karen Kinnear, in which she records that Goetz discussed "migration" using "field bridges" and "method bridges" as part of the L20 phase of the new Valhalla phasing approach. Goetz's post adds significant background details. In this post, I reference a few of the more interesting points made in these posts, but both posts provide significantly more details than I'm providing here and are worth reading for anyone interested in the direction of Project Valhalla.

Forwarders

Although Goetz spends the majority of his post discussing the "migration aspects" of "forwarding+reversing bridges" and of "forwarders," I found the historical background he provided in the introductory paragraphs to be a clean, concise summary of how we got to where we are now. Here are some brief statements from this historical section of the post that stood out to me (quotes here are very slightly modified for emphasis and to provide links):

  • "In the Java 1.0 days, `javac` was little more than an 'assembler' for the classfile format, translating source code to bytecode in a mostly 1:1 manner."
  • "Over time, we've seen small divergences between the language model and the classfile model, and each of these is a source of sharp edges."
  • "In Java 1.1 the addition of inner classes, and the mismatch between the accessibility model in the language and the JVM (the language treated a nest as a single entity; the JVM treat nest members as separate classes) required "access bridges" (`access$000` methods), which have been the source of various issues over the years."
  • "Twenty years later, these methods were obviated by Nest-based Access Control [jep181] -- which represents the choice to align the VM model to the language model"
  • "In Java 5, while we were able to keep the translation largely stable and transparent through the use of erasure..."
  • "... there was one point of misalignment; several situations ... could give rise to the situation where two or more method descriptors -- which the JVM treats as distinct methods -- are treated by the language as if they correspond to the same method."
  • "To fool the VM, the compiler emits "bridge methods" which forward invocations from one signature to another. And, as often happens when we try to fool the VM, it ultimately has its revenge."
  • "Java 5 introduced the ability to override a method but to provide a more specific return type. (Java 8 later extended this to bridges in interfaces as well.)"

After providing the historical background and summarizing the issues associated with these historical developments, Goetz moves on to providing significantly more details about "bridge methods." He discusses the "anatomy of a bridge method" (forwarding/invokevirtual) and points out that "bridges are brittle" because "separate compilation can move bridges from where already-compiled code expects them to be to places it does not expect them." I particularly like Goetz's concise summary of the fundamental brittleness problem with bridge methods:

The basic problem with bridge methods is that the language views the two method descriptors as two faces of the same actual method, whereas the JVM sees them as distinct methods. (And, reflection also has to participate in the charade.)
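The compiler-generated bridge methods discussed above can be observed with reflection. This is a minimal sketch using today's javac behavior (not the proposed forwarders); the nested types and the helper method are made up for illustration.

```java
import java.lang.reflect.Method;

class BridgeDemo {

    interface Shape {
        Shape copy();
    }

    // Covariant return type: javac also emits a bridge "Shape copy()".
    static class Circle implements Shape {
        @Override
        public Circle copy() {
            return new Circle();
        }
    }

    // Erased generics: javac also emits a bridge "int compareTo(Object)".
    static class Name implements Comparable<Name> {
        final String value;

        Name(String value) {
            this.value = value;
        }

        @Override
        public int compareTo(Name other) {
            return value.compareTo(other.value);
        }
    }

    // Counts declared methods with the given name that are (or are not) bridges.
    static int countMethods(Class<?> type, String name, boolean bridge) {
        int count = 0;
        for (Method method : type.getDeclaredMethods()) {
            if (method.getName().equals(name) && method.isBridge() == bridge) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        for (Method method : Circle.class.getDeclaredMethods()) {
            System.out.println(method + " bridge=" + method.isBridge());
        }
        for (Method method : Name.class.getDeclaredMethods()) {
            System.out.println(method + " bridge=" + method.isBridge());
        }
    }
}
```

The source declares one copy method and one compareTo method, yet each class file contains two: the language sees one method where the JVM sees two.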

Goetz discusses in his post the "limits of bridges" and points out that their limits are mostly encountered when attempting to migrate in a binary compatible fashion:

The problem of migration arises both from language evolution (Valhalla aims to enable compatible migrating from value-based classes to value types, and from erased generics to specialized), as well as from the ordinary evolution of libraries.

After providing some specific examples of where bridge methods fail migration attempts for fields and overridden methods, Goetz introduces "forwarders":

In this document, we attempt to learn from the history of bridges, and create a new mechanism -- "forwarders" -- that work with the JVM instead of against it. This raises the level of expressivity of classfiles and opens the possibility of greater laziness. It is possible that traditional bridging scenarios can eventually be handled by forwarders too...

Goetz devotes the remainder of the post to specifics of forwarders such as their invocation, using forwarders with fields, overriding forwarders, adaptation of forwarders, and type checking and corner cases associated with forwarders.

Poxes: Value Type Wrappers for Primitives

There was more discussed in the Valhalla Offsite than forwarding. One of Kinnear's notes that stood out to me stated, "Poxes: value type wrappers for primitives." Since the introduction of "autoboxing and unboxing" in JDK 1.5, Java developers have become comfortable with the conceptual relationship of "boxed"/"wrapper" reference types to their primitive counterparts. Therefore, it does make some sense to me to use similar terminology for the concept of value type "wrappers" of primitive counterparts being called "poxes." Goetz describes "poxing" in the post "Finding the spirit of L-World" and calls "creating a value box for primitives" "a 'pox'" and discusses the possibility of "adjust[ing] the compiler's boxing behavior (when boxing to `Object` or an interface)" to "prefer the pox to the box."

Updating Valhalla Value Types Phases

In her post "Updated phasing of Valhalla Value Types," Kinnear outlines Valhalla value types "Phases/AIs for Proposals" discussed at the Burlington Valhalla offsite meeting mentioned earlier. She calls out one major timing proposal in its own paragraph (I added the emphasis):

One important phasing is moving null-default value types and migrating value-based-classes to null-default value types was moved out to L20, which is after the initial preview.

Kinnear provides a list of "phases/AIs" categorized under one of three general milestone allocations: L10, L20, and L100. The two specific topics covered in my post (forwarding and poxes) appear to be currently proposed for L20.

Conclusion

Valhalla's value types are still a ways out before we'll see them in a General Availability JDK, but I appreciate the significant effort being invested in these value types and look forward to benefiting from them at some point in the future.