Thursday, January 17, 2019

Using Minimum Fractional Digits with JDK 12 Compact Number Formatting

The post "Compact Number Formatting Comes to JDK 12" demonstrated the compact number formatting support added to NumberFormat in JDK 12. The examples in that post used only the instances of NumberFormat returned by NumberFormat's new overloaded getCompactNumberInstance(-) methods and therefore did not specify characteristics such as minimum and maximum fractional digits. The results, in some cases, are less than desirable. Fortunately, NumberFormat allows minimum and maximum fractional digits to be specified, and this post demonstrates how that can improve the output of the compact number formatting available with JDK 12.

The code listing introduced in the original "Compact Number Formatting Comes to JDK 12" post (and which is available on GitHub) has been updated to demonstrate use of NumberFormat.setMinimumFractionDigits(int). An excerpt of that code is shown next and is followed by the accompanying output.

/**
 * Generates standardized map of labels to Compact Number Format
 * instances described by the labels. The instances of {@code NumberFormat}
 * are created with Locale and Style only and with the provided number
 * of minimum fractional digits.
 *
 * @param minimumNumberFractionDigits Minimum number of fractional digits
 *    to be applied to each instance of {@code NumberFormat}.
 * @return Mapping of label to an instance of a Compact Number Format
 *    consisting of a Locale, Style, and specified minimum number of fractional
 *    digits that is described by the label.
 */
private static Map<String, NumberFormat> generateCompactNumberFormats(
   final int minimumNumberFractionDigits)
{
   var numberFormats = generateCompactNumberFormats();
   // Apply the same minimum number of fractional digits to every format.
   numberFormats.forEach((label, numberFormat) ->
      numberFormat.setMinimumFractionDigits(minimumNumberFractionDigits));
   return numberFormats;
}

/**
 * Demonstrates compact number formatting in a variety of locales
 * and number formats against the provided {@code long} value and
 * with a minimum fractional digits of 1 specified.
 *
 * @param numberToFormat Value of type {@code long} that is to be
 *    formatted using compact number formatting and a variety of
 *    locales and number formats and with a single minimal fractional
 *    digit.
 */
private static void demonstrateCompactNumberFormattingOneFractionalDigitMinimum(
   final long numberToFormat)
{
   final Map<String, NumberFormat> numberFormats = generateCompactNumberFormats(1);
   out.println(
      "Demonstrating Compact Number Formatting on long '" + numberToFormat
         + "' with 1 minimum fraction digit:");
   numberFormats.forEach((label, numberFormat) ->
      out.println("\t" +  label + ": " + numberFormat.format(numberToFormat))
   );
}

Demonstrating Compact Number Formatting on long '15' with 1 minimum fraction digit:
 Default: 15
 US/Long: 15
 UK/Short: 15
 UK/Long: 15
 FR/Short: 15
 FR/Long: 15
 DE/Short: 15
 DE/Long: 15
 IT/Short: 15
 IT/Long: 15
Demonstrating Compact Number Formatting on long '150' with 1 minimum fraction digit:
 Default: 150
 US/Long: 150
 UK/Short: 150
 UK/Long: 150
 FR/Short: 150
 FR/Long: 150
 DE/Short: 150
 DE/Long: 150
 IT/Short: 150
 IT/Long: 150
Demonstrating Compact Number Formatting on long '1500' with 1 minimum fraction digit:
 Default: 1.5K
 US/Long: 1.5 thousand
 UK/Short: 1.5K
 UK/Long: 1.5 thousand
 FR/Short: 1,5 k
 FR/Long: 1,5 millier
 DE/Short: 1.500
 DE/Long: 1,5 Tausend
 IT/Short: 1.500
 IT/Long: 1,5 mille
Demonstrating Compact Number Formatting on long '15000' with 1 minimum fraction digit:
 Default: 15.0K
 US/Long: 15.0 thousand
 UK/Short: 15.0K
 UK/Long: 15.0 thousand
 FR/Short: 15,0 k
 FR/Long: 15,0 mille
 DE/Short: 15.000
 DE/Long: 15,0 Tausend
 IT/Short: 15.000
 IT/Long: 15,0 mila
Demonstrating Compact Number Formatting on long '150000' with 1 minimum fraction digit:
 Default: 150.0K
 US/Long: 150.0 thousand
 UK/Short: 150.0K
 UK/Long: 150.0 thousand
 FR/Short: 150,0 k
 FR/Long: 150,0 mille
 DE/Short: 150.000
 DE/Long: 150,0 Tausend
 IT/Short: 150.000
 IT/Long: 150,0 mila
Demonstrating Compact Number Formatting on long '1500000' with 1 minimum fraction digit:
 Default: 1.5M
 US/Long: 1.5 million
 UK/Short: 1.5M
 UK/Long: 1.5 million
 FR/Short: 1,5 M
 FR/Long: 1,5 million
 DE/Short: 1,5 Mio.
 DE/Long: 1,5 Million
 IT/Short: 1,5 Mln
 IT/Long: 1,5 milione
Demonstrating Compact Number Formatting on long '15000000' with 1 minimum fraction digit:
 Default: 15.0M
 US/Long: 15.0 million
 UK/Short: 15.0M
 UK/Long: 15.0 million
 FR/Short: 15,0 M
 FR/Long: 15,0 million
 DE/Short: 15,0 Mio.
 DE/Long: 15,0 Millionen
 IT/Short: 15,0 Mln
 IT/Long: 15,0 milioni

As the example and output shown above demonstrate, use of NumberFormat.setMinimumFractionDigits(int) leads to compact number formatted output that is likely to be more aesthetically pleasing in many cases. There is a recent discussion "Compact Number Formatting and Fraction Digits" on the OpenJDK core-libs-dev mailing list that also discusses this ability to customize the compact number formatting output.
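
The effect can also be reproduced in isolation. The following is a minimal sketch rather than the code from the GitHub listing: the class name CompactFractionDigitsSketch and its format method are hypothetical, and the sketch assumes JDK 12 or later and uses only the US locale with the SHORT style.

```java
import java.text.NumberFormat;
import java.util.Locale;

public class CompactFractionDigitsSketch
{
   /**
    * Formats the provided value with a US/SHORT compact number format
    * that has the provided minimum number of fraction digits applied.
    */
   static String format(final long value, final int minimumFractionDigits)
   {
      final NumberFormat numberFormat =
         NumberFormat.getCompactNumberInstance(Locale.US, NumberFormat.Style.SHORT);
      numberFormat.setMinimumFractionDigits(minimumFractionDigits);
      return numberFormat.format(value);
   }

   public static void main(final String[] arguments)
   {
      // Same kinds of values as in the output above, one fraction digit minimum.
      for (final long value : new long[] {1_500L, 15_000L, 1_500_000L})
      {
         System.out.println(value + " -> " + format(value, 1));
      }
   }
}
```

For 1500, 15000, and 1500000 this should produce 1.5K, 15.0K, and 1.5M, matching the "Default" rows of the output shown above.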

Wednesday, January 9, 2019

Proposal for Wider Range of Available Java Keywords

In the recent posts titled "We need more keywords, captain!" on the OpenJDK amber-spec-experts mailing list, Brian Goetz "proposes a possible move that will buy us some breathing room in the perpetual problem where the keyword-management tail wags the programming-model dog." His proposal is to "allow _hyphenated_ keywords where one or more of the terms are already keywords or reserved identifiers."

Goetz points out in the original post that Section 3.9 ("Keywords") of the Java Language Specification spells out the current keywords in Java and that these keywords have been stable since Java's inception, with the only changes being the addition of three of the keywords shown below (assert in JDK 1.4, enum in JDK 1.5, and _ in JDK 9):

abstract   continue   for          new         switch
assert     default    if           package     synchronized
boolean    do         goto         private     this
break      double     implements   protected   throw
byte       else       import       public      throws
case       enum       instanceof   return      transient
catch      extends    int          short       try
char       final      interface    static      void
class      finally    long         strictfp    volatile
const      float      native       super       while
_ (underscore)

Goetz's post describes "several tools at our disposal" that have been or could be used when the set of pre-established keywords is not "suitable for expressing all the things we might ever want our language to express." Goetz emphasizes this point: "The lack of reasonable options for extending the syntax of the language threatens to become a significant impediment to language evolution."

Goetz provides significant background explanation regarding the downsides of the "several tools" he had mentioned earlier in the post and states, "We need a new source of keyword candidates." Goetz then proposes to "allow _hyphenated_ keywords where one or more of the terms are already keywords or reserved identifiers."

The Goetz post provides several examples to illustrate hyphenated keywords and how they might be used. Goetz emphasizes that this list is not in any way proposing these specific keywords for now, but is only providing them as illustrative examples. See the original post for brief descriptions of how some of these illustrative examples might be used.

  • non-null
  • non-final
  • package-private
  • public-read
  • null-checked
  • type-static
  • default-value
  • eventually-final
  • semi-final
  • exhaustive-switch
  • enum-class
  • annotation-class
  • record-class
  • this-class
  • this-return

Two other posts on the mailing list have already built on this original message. In a reply to his own message, Goetz writes that switch expressions could have supported (and, because switch expressions are currently a Preview Feature, still might support) a keyword such as break-with rather than requiring developers to "disambiguate whether [a given break is a] labeled break or a value break." Guy Steele replied that he prefers break-return to break-with and that this particular single illustrative example "has [maybe] won me over to the idea of hyphenated keywords."

The Goetz post "We need more keywords, captain!" has seen some attention on Twitter as well. Sander Mak tweeted, "Another example of real-world programming language design by @BrianGoetz: ... Love the diligence and forward-looking nature of these #Java language design discussions." Bruno Borges tweeted a quote from that same post: "We are convinced that @Java has a long life ahead of it, and developers are excited about new features that enable to them to write more expressive and reliable code."

The hyphenated keyword approach seems promising. Regardless of what happens, the Goetz post makes for an interesting read regarding the difficulty of adding or reapplying keywords to a long-lived programming language concerned about backwards compatibility. The post also offers a peek into the types of issues, constraints, and trade-offs that language designers must weigh.

Tuesday, January 8, 2019

Explicitly Naming Automatic Java Modules

Nicolas Fränkel recently published the surprising post "A hard look at the state of Java modularization." In that post, Fränkel provides the results of his investigation into the support for modules (introduced with JDK 9) available in the 29 libraries referenced in the blog post "20 popular Java libraries." Fränkel's investigation aimed to identify which of these popular Java libraries were "modularized" (a fully implemented module defined in a module-info file) or at least provided an "automatic module name" via MANIFEST.MF even if the library isn't modularized.

Of the 29 popular Java libraries investigated, Fränkel identified only two (SLF4J 1.8.0-beta2 and JAXB 2.3.1) that are fully modularized. Of the remaining 27 libraries that are not fully modularized, 12 do have an automatic module name defined. That means, of course, that 15 of the 29 libraries (just over 50%) do not have any explicit support for modularity introduced with JDK 9!

In the post "Automatic-Module-Name: Calling all Java Library Maintainers" a little over one year ago, Sander Mak (one of the authors of Java 9 Modularity) describes "what needs to be done to move the Java library ecosystem toward modules." Mak explains that "support for the Java module system can be incrementally added to libraries" and uses this post to "explain the first step ... to becoming a Java module." Mak writes:

This first step boils down to picking a module name, and adding it as Automatic-Module-Name: <module name> entry to the library's MANIFEST.MF. That's it. With this first step you make your library usable as Java module without moving the library itself to Java 9 or creating a module descriptor for the library, yet.

In the "Automatic-Module-Name: Calling all Java Library Maintainers" post, Mak also provides guidance for providing an automatic name for a module. He recommends picking an explicit name for the module rather than relying on the ModuleFinder-based name derivation algorithm (module named based on JAR filename). Mak references Stephen Colebourne's post "Java SE 9 - JPMS automatic modules," in which Colebourne concludes, "Community members must at all costs avoid publishing modular jar files that depend on filenames." Incidentally, Colebourne's post "Java SE 9 - JPMS module naming" provides additional guidance on module naming.

The naming of the automatic module is significant because later changes to that name will cause backwards incompatibilities for the library. It is also important that the module name not collide with other libraries' module names. The recommended way of doing this is to use the root package name contained within the module, assuming that package uses the typical Java package naming convention to ensure uniqueness.

Mak also outlines in his post some "potential issues you need to verify" before adding the Automatic-Module-Name entry to the MANIFEST.MF file to avoid "false expectations." See the "Sanity-Check Your Library" section of Mak's post for the full list and detailed description of these issues which include not using internal JDK types and not having any classes in the unnamed package.

Before concluding my post, I am going to briefly present the difference between an explicitly named automatic module and an implicitly named automatic module. For this, I'll be using a JAR file generated from the code example in my previous post "Parsing Value from StreamCorruptedException: invalid stream header Message." The Java source is not that important for my purposes here other than to point out that the main class used in my post has the package name

The generated JAR file, which I've called io-examples.jar, is not modularized (it does not have a module-info file). The jar tool's option --describe-module can be used to quickly determine the automatic module name of the JAR file.

The next two screen snapshots show the results of running jar with the --describe-module option against the JAR. The first screen snapshot indicates the results when the JAR has nothing in its MANIFEST.MF file to indicate an automatic module name. The result is an automatic module named after the JAR name. The second screen snapshot shows the results from running jar --describe-module against the almost identical JAR except for the addition of the attribute Automatic-Module-Name: to the JAR's MANIFEST.MF file. In that case, the automatic module takes the name explicitly provided in the manifest file.
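
The same difference can be observed programmatically with java.lang.module.ModuleFinder, which applies the name derivation discussed earlier. What follows is a minimal sketch under stated assumptions, not the approach used for the screen snapshots: the class AutomaticModuleNameSketch and its methods are hypothetical, the JARs it writes contain only a manifest, and the explicit name com.example.io is a made-up placeholder rather than the name used in my JAR.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.lang.module.ModuleFinder;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.Attributes;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

public class AutomaticModuleNameSketch
{
   /**
    * Writes a JAR containing only a manifest into a new temporary directory.
    * When explicitName is not null, it is stored as Automatic-Module-Name.
    */
   static Path writeJar(final String fileName, final String explicitName)
   {
      final Manifest manifest = new Manifest();
      final Attributes attributes = manifest.getMainAttributes();
      attributes.put(Attributes.Name.MANIFEST_VERSION, "1.0");
      if (explicitName != null)
      {
         attributes.put(new Attributes.Name("Automatic-Module-Name"), explicitName);
      }
      try
      {
         final Path jarPath =
            Files.createTempDirectory("automatic-module").resolve(fileName);
         try (OutputStream output = Files.newOutputStream(jarPath);
              JarOutputStream jar = new JarOutputStream(output, manifest))
         {
            // No entries are added; the manifest alone is enough for this sketch.
         }
         return jarPath;
      }
      catch (IOException exception)
      {
         throw new UncheckedIOException(exception);
      }
   }

   /** Returns the module name that ModuleFinder reports for the provided JAR. */
   static String moduleNameOf(final Path jarPath)
   {
      return ModuleFinder.of(jarPath).findAll().iterator().next().descriptor().name();
   }

   public static void main(final String[] arguments)
   {
      // Implicit name derived from the JAR file name.
      System.out.println(moduleNameOf(writeJar("io-examples.jar", null)));
      // Explicit (hypothetical) name taken from the manifest.
      System.out.println(moduleNameOf(writeJar("io-examples.jar", "com.example.io")));
   }
}
```

Because the derivation replaces non-alphanumeric characters in the file name with dots, the first call should report io.examples, while the second should report whatever name the manifest supplies.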

Fränkel opens his post with the assertion, "With the coming of Java 11, the latest Long-Term Support, I think it's a good time to take a snapshot of the state of modularization." I too think that recent and pending changes make it more important for all of us in the Java community to start understanding Java's built-in modularity and its implications and to take steps toward more complete modularity support.

Monday, January 7, 2019

The JDK 13 Train Has Left the Station

JDK 12 [Java SE 12 Platform (JSR 386)] is still in Rampdown Phase 1, but initial work on JDK 13 [Java SE 13 Platform (JSR 388)] has already begun. Draft 26 of the Java SE 12 specification was announced approximately 10 hours before Draft 2 of the Java SE 13 specification was announced.

There are already early access builds of JDK 13 available for Linux, macOS, Windows, and Alpine Linux. As of this writing, the current JDK 13 early access build is #2 (3 January 2019). The JDK 13 Early-Access Release Notes do not yet contain anything of significance.

There are ten change sets associated with Build 2 and another ten change sets associated with JDK 13 Early Access Build 1. There are 33 issues addressed with that build (688 total bugs associated with JDK 13 for "build fix" as of this writing).

The temporary home of the Javadoc-based API documentation for JDK 13 is also available at

The OpenJDK JDK 13 project page describes the project's status: "The development repositories are open for bug fixes, small enhancements, and JEPs as proposed and tracked via the JEP Process."

Thursday, January 3, 2019

Parsing Value from StreamCorruptedException: invalid stream header Message

It is a relatively common occurrence to see StreamCorruptedExceptions thrown with a "reason" that states, "invalid stream header" and then provides the first part of that invalid stream header. Frequently, a helpful clue for identifying the cause of that exception is to understand what the invalid stream header is because that explains what is unexpected and causing the issue.

The StreamCorruptedException only has two constructors, one that accepts no arguments and one that accepts a single String describing the exception's "reason". This tells us that the "invalid stream header: XXXXXXXX" messages (where XXXXXXXX represents various invalid header details) are provided by the code that instantiates (and presumably throws) these StreamCorruptedExceptions rather than by that exception class itself. This means that it won't always necessarily be the same formatted message encountered with one of these exceptions, but in most common cases, the format is the same with "invalid stream header: " followed by the first portion of that invalid stream header.

This exception is commonly thrown by an ObjectInputStream. The Javadoc for that class has some useful details that help explain why the "StreamCorruptedException: invalid stream header" is encountered. The class-level Javadoc states, "Only objects that support the java.io.Serializable or java.io.Externalizable interface can be read from streams." The Javadoc for the ObjectInputStream(InputStream) constructor states (I added the emphasis), "Creates an ObjectInputStream that reads from the specified InputStream. A serialization stream header is read from the stream and verified."

As the quoted Javadoc explains, ObjectInputStream should be used with serialized data. Many of the cases of the "StreamCorruptedException: invalid stream header" message occur when a text file (such as HTML, XML, JSON, etc.) is passed to this constructor rather than a Java serialized file.
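
This situation is easy to reproduce. The following minimal sketch (the class InvalidStreamHeaderSketch and its headerErrorFor method are hypothetical names for illustration) passes the bytes of ordinary text to the ObjectInputStream constructor and captures the resulting exception message.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.StreamCorruptedException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class InvalidStreamHeaderSketch
{
   /**
    * Passes the text's bytes to the ObjectInputStream constructor and returns
    * the message of the resulting StreamCorruptedException, or null if the
    * constructor accepted the stream header.
    */
   static String headerErrorFor(final String text)
   {
      final byte[] bytes = text.getBytes(StandardCharsets.US_ASCII);
      try (ObjectInputStream input =
              new ObjectInputStream(new ByteArrayInputStream(bytes)))
      {
         return null;
      }
      catch (StreamCorruptedException exception)
      {
         return exception.getMessage();
      }
      catch (IOException exception)
      {
         // For example, EOFException when fewer than four bytes are supplied.
         throw new UncheckedIOException(exception);
      }
   }

   public static void main(final String[] arguments)
   {
      System.out.println(headerErrorFor("Test data"));
      System.out.println(headerErrorFor("<!DOCTYPE html>"));
   }
}
```

The reported header is simply the hexadecimal form of the first four bytes of the input, so text beginning with "Test" yields "invalid stream header: 54657374".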

The following are examples of "ASCII" values derived from "invalid stream header" messages associated with StreamCorruptedExceptions and reported online.

Invalid Stream Header Value (HEX)  Corresponding Integers  Corresponding "ASCII" Value  Online References / Examples
00000000                           000 000 000 000
0A0A0A0A                           010 010 010 010
0A0A3C68                           010 010 060 104
20646520                           032 100 101 032         de
30313031                           048 049 048 049         0101
32303138                           050 048 049 056         2018
3C21444F                           060 033 068 079         <!DO
3c48544d                           060 072 084 077         <HTM
3C6F626A                           060 111 098 106         <obj
3C787364                           060 120 115 100         <xsd
41434544                           065 067 069 068         ACED
48656C6C                           072 101 108 108         Hell
4920616D                           073 032 097 109         I am
54656D70                           084 101 109 112         Temp
54657374                           084 101 115 116         Test                         invalid stream header: 54657374
54686973                           084 104 105 115         This
64617364                           100 097 115 100         dasd
70707070                           112 112 112 112         pppp
72657175                           114 101 113 117         requ
7371007E                           115 113 000 126         sq ~
77617161                           119 097 113 097         waqa
7B227061                           123 034 112 097         {"pa

The above examples show the "StreamCorruptedException: invalid stream header" message occurring for cases where input streams representing text were passed to the constructor that expects Java serialized format. The row for 41434544 is especially interesting. That entry ("ACED" in "ASCII" character representation) looks like what is expected in all files serialized by Java's default serialization, but it's not quite correct.

The "Terminal Symbols and Constants" section of the Java Object Serialization Specification tells us that ObjectStreamConstants defines a constant STREAM_MAGIC that is the "Magic number that is written to the stream header." The specification further explains that ObjectStreamConstants.STREAM_MAGIC is defined as (short)0xaced and this can be verified in Java code if desired. The reason that particular entry led to an error is that the header should contain the bytes whose hexadecimal representation is "ACED" rather than the "ASCII" characters "ACED". In other words, for that particular case, it was actually literal text "ACED" that was in the first bytes rather than bytes represented by the hexadecimal "ACED" representation.
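
One way to perform that verification is sketched below. The class StreamMagicSketch and its method names are hypothetical; the sketch prints the constant in hexadecimal and compares it with the first two bytes that ObjectOutputStream actually writes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamConstants;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class StreamMagicSketch
{
   /** Returns STREAM_MAGIC as four uppercase hexadecimal digits. */
   static String streamMagicHex()
   {
      // Mask to 16 bits because STREAM_MAGIC is a (negative) short.
      return String.format("%04X", ObjectStreamConstants.STREAM_MAGIC & 0xFFFF);
   }

   /** Serializes the object and returns its first two bytes as hexadecimal. */
   static String firstTwoBytesHex(final Serializable object)
   {
      final ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream output = new ObjectOutputStream(bytes))
      {
         output.writeObject(object);
      }
      catch (IOException exception)
      {
         throw new UncheckedIOException(exception);
      }
      final byte[] serialized = bytes.toByteArray();
      return String.format("%02X%02X", serialized[0] & 0xFF, serialized[1] & 0xFF);
   }

   public static void main(final String[] arguments)
   {
      System.out.println("STREAM_MAGIC:    " + streamMagicHex());
      System.out.println("First two bytes: " + firstTwoBytesHex("Test"));
   }
}
```

Both values are ACED, which is two bytes (0xAC and 0xED) and not the four-character text "ACED" (0x41 0x43 0x45 0x44) seen in the table entry.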

There are many ways to translate the hexadecimal representation provided in the "StreamCorruptedException: invalid stream header" message to see if it translates to text that means something. If it is text, one knows that he or she is already off to a bad start as a binary serialized file should be used instead of text. The characters in that text can provide a further clue as to what type of text file was being accidentally provided. Here is one way to translate the provided hexadecimal representation to "ASCII" text using Java (available on GitHub):

private static String toAscii(final String hexInput)
{
   final int length = hexInput.length();
   final StringBuilder ascii = new StringBuilder();
   final StringBuilder integers = new StringBuilder();
   // Each pair of hexadecimal digits represents one byte of the stream header.
   for (int i = 0; i < length; i+=2)
   {
      final String twoDigitHex = hexInput.substring(i, i+2);
      final int integer = Integer.parseInt(twoDigitHex, 16);
      ascii.append((char)integer);
      integers.append(String.format("%03d", integer)).append(" ");
   }
   return hexInput + " ==> " + integers.deleteCharAt(integers.length()-1).toString() + " ==> " + ascii.toString();
}

Streams of text inadvertently passed to ObjectInputStream's constructor are not the only cause of "StreamCorruptedException: invalid stream header". In fact, any InputStream (text or binary) that doesn't begin with the expected "stream magic" bytes (0xaced) will lead to this exception.

Wednesday, January 2, 2019

Restarting Java's Raw String Literals Discussion

It was announced in December 2018 that raw string literals would be dropped from JDK 12. Now, in the new year, discussion related to the design of raw string literals in Java has begun again.

In the post "Raw string literals -- restarting the discussion" on the amber-spec-experts OpenJDK mailing list, Brian Goetz references the explanation for dropping the raw string literals preview feature from JDK 12 and suggests "restart[ing] the design discussion." Goetz summarizes the previous design discussions and decisions and the lessons learned from the first take on raw string literals, discusses some design questions and trade-offs to be made, and then calls for input on three specific types of observation data:

  • "Data that supports or refutes the claim that our primary use cases are embedded JSON, HTML, XML, and SQL."
  • "Use cases we've left out..."
  • "Data (either Java or non-Java) on the use of various flavors of strings (raw, multi-line, etc) in real codebases..."

Jim Laskey posted two messages with the title "Enhancing Java String Literals Round 2" to the same amber-spec-experts mailing list that reference an HTML version and a PDF version of an "RTL2" document that aids in the discussion of "Take Two" of raw string literals. Laskey outlines a "series of critical decision points that should be given thought, if not answers, before we propose a new design."

A few of the major decisions to be made as raw string literals for Java are reconsidered are listed here; many more are discussed in the aforementioned posts:

  • Which is really more important to developers: "raw text" or "multi-line strings"?
  • Which character makes for the best delimiter for most Java developers and Java use cases?
  • How should incidental spacing be handled?

There has already been some feedback on the amber-dev OpenJDK mailing list. Stephen Colebourne provides "Extended string literals feedback" and Bruno Borges recommends "special assignment rather [than] special delimiters."

I often see developers complaining about certain language and API decisions after the decisions have been implemented. For anyone with strong feelings about the subject of raw string literals and multi-line strings in Java, now is an opportunity to make one's voice heard and to possibly influence the final design that will come to Java at some point in the future. Discussion has also started on the Java subreddit in two threads: "Raw string literals -- restarting the discussion" and "New RSL Proposal".