Thursday, January 31, 2019

JDK 9/JEP 280: String Concatenations Will Never Be the Same

JEP 280 ("Indify String Concatenation") was implemented in conjunction with JDK 9 and, according to its "Summary" section, "Change[s] the static String-concatenation bytecode sequence generated by javac to use invokedynamic calls to JDK library functions." The impact this has on string concatenation in Java is most easily seen by looking at the javap output of classes using string concatenation that are compiled in pre-JDK 9 and post-JDK 9 JDKs.

The following simple Java class named "HelloWorldStringConcat" will be used for the first demonstration.

Contrasting of the differences in -verbose output from javap for the HelloWorldStringConcat class's main(String) method when compiled with JDK 8 (AdoptOpenJDK) and JDK 11 (Oracle OpenJDK) is shown next. I have highlighted some key differences.

JDK 8 javap Output

Classfile /C:/java/examples/helloWorld/classes/dustin/examples/HelloWorldStringConcat.class
  Last modified Jan 28, 2019; size 625 bytes
  MD5 checksum 3e270bafc795b47dbc2d42a41c8956af
  Compiled from "HelloWorldStringConcat.java"
public class dustin.examples.HelloWorldStringConcat
  minor version: 0
  major version: 52
     . . .
  public static void main(java.lang.String[]);
    descriptor: ([Ljava/lang/String;)V
    flags: ACC_PUBLIC, ACC_STATIC
    Code:
      stack=4, locals=1, args_size=1
         0: getstatic     #2                  // Field java/lang/System.out:Ljava/io/PrintStream;
         3: new           #3                  // class java/lang/StringBuilder
         6: dup
         7: invokespecial #4                  // Method java/lang/StringBuilder."":()V
        10: ldc           #5                  // String Hello,
        12: invokevirtual #6                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
        15: aload_0
        16: iconst_0
        17: aaload
        18: invokevirtual #6                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
        21: invokevirtual #7                  // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
        24: invokevirtual #8                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
        27: return

JDK 11 javap Output

Classfile /C:/java/examples/helloWorld/classes/dustin/examples/HelloWorldStringConcat.class
  Last modified Jan 28, 2019; size 908 bytes
  MD5 checksum 0e20fe09f6967ba96124abca10d3e36d
  Compiled from "HelloWorldStringConcat.java"
public class dustin.examples.HelloWorldStringConcat
  minor version: 0
  major version: 55
     . . .
  public static void main(java.lang.String[]);
    descriptor: ([Ljava/lang/String;)V
    flags: (0x0009) ACC_PUBLIC, ACC_STATIC
    Code:
      stack=3, locals=1, args_size=1
         0: getstatic     #2                  // Field java/lang/System.out:Ljava/io/PrintStream;
         3: aload_0
         4: iconst_0
         5: aaload
         6: invokedynamic #3,  0              // InvokeDynamic #0:makeConcatWithConstants:(Ljava/lang/String;)Ljava/lang/String;
        11: invokevirtual #4                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
        14: return

The "Description" section of JEP 280 describes this difference: "The idea is to replace the entire StringBuilder append dance with a simple invokedynamic call to java.lang.invoke.StringConcatFactory, that will accept the values in the need of concatenation." This same section shows a similar comparison of compiled output for a similar string concatenation example.

The compiled output from JDK 11 for the simple string concatenation is not just fewer lines than its JDK 8 counterpart; it also has fewer "expensive" operations. Potential performance improvement can be gained from not needing to wrap primitive types, and not needing to instantiate a bunch of extra objects. One of the primary motivations for this change was to "lay the groundwork for building optimized String concatenation handlers, implementable without the need to change the Java-to-bytecode compiler" and to "enable future optimizations of String concatenation without requiring further changes to the bytecode emitted by javac."

JEP 280 Does Not Affect StringBuilder or StringBuffer

There's an interesting implication of this in terms of using StringBuffer (which I have a difficult time finding a good use for anyway) and StringBuilder. It was a stated "Non-Goal" of JEP 280 to not "introduce any new String and/or StringBuilder APIs that might help to build better translation strategies." Related to this, for simple string concatenations like that shown in the code example at the beginning of this post, explicit use of StringBuilder and StringBuffer will actually preclude the ability for the compiler to make use of the JEP 280-introduced feature discussed in this post.

The next two code listings show similar implementations to the simple application shown above, but these use StringBuilder and StringBuffer respectively instead of string concatenation. When javap -verbose is executed against these classes after they are compiled with JDK 8 and with JDK 11, there are no significant differences in the main(String[]) methods.

Explicit StringBuilder Use in JDK 8 and JDK 11 Are The Same

JDK 8 javap Output for HelloWorldStringBuilder.main(String[])

Classfile /C:/java/examples/helloWorld/classes/dustin/examples/HelloWorldStringBuilder.class
  Last modified Jan 28, 2019; size 627 bytes
  MD5 checksum e7acc3bf0ff5220ba5142aed7a34070f
  Compiled from "HelloWorldStringBuilder.java"
public class dustin.examples.HelloWorldStringBuilder
  minor version: 0
  major version: 52
     . . .
  public static void main(java.lang.String[]);
    descriptor: ([Ljava/lang/String;)V
    flags: ACC_PUBLIC, ACC_STATIC
    Code:
      stack=4, locals=1, args_size=1
         0: getstatic     #2                  // Field java/lang/System.out:Ljava/io/PrintStream;
         3: new           #3                  // class java/lang/StringBuilder
         6: dup
         7: invokespecial #4                  // Method java/lang/StringBuilder."":()V
        10: ldc           #5                  // String Hello,
        12: invokevirtual #6                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
        15: aload_0
        16: iconst_0
        17: aaload
        18: invokevirtual #6                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
        21: invokevirtual #7                  // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
        24: invokevirtual #8                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
        27: return

JDK 11 javap Output for HelloWorldStringBuilder.main(String[])

Classfile /C:/java/examples/helloWorld/classes/dustin/examples/HelloWorldStringBuilder.class
  Last modified Jan 28, 2019; size 627 bytes
  MD5 checksum d04ee3735ce98eb6237885fac86620b4
  Compiled from "HelloWorldStringBuilder.java"
public class dustin.examples.HelloWorldStringBuilder
  minor version: 0
  major version: 55
     . . .
  public static void main(java.lang.String[]);
    descriptor: ([Ljava/lang/String;)V
    flags: (0x0009) ACC_PUBLIC, ACC_STATIC
    Code:
      stack=4, locals=1, args_size=1
         0: getstatic     #2                  // Field java/lang/System.out:Ljava/io/PrintStream;
         3: new           #3                  // class java/lang/StringBuilder
         6: dup
         7: invokespecial #4                  // Method java/lang/StringBuilder."":()V
        10: ldc           #5                  // String Hello,
        12: invokevirtual #6                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
        15: aload_0
        16: iconst_0
        17: aaload
        18: invokevirtual #6                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
        21: invokevirtual #7                  // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
        24: invokevirtual #8                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
        27: return

Explicit StringBuffer Use in JDK 8 and JDK 11 Are The Same

JDK 8 javap Output for HelloWorldStringBuffer.main(String[])

Classfile /C:/java/examples/helloWorld/classes/dustin/examples/HelloWorldStringBuffer.class
  Last modified Jan 28, 2019; size 623 bytes
  MD5 checksum fdfb90497db6a3494289f2866b9a3a8b
  Compiled from "HelloWorldStringBuffer.java"
public class dustin.examples.HelloWorldStringBuffer
  minor version: 0
  major version: 52
     . . .
  public static void main(java.lang.String[]);
    descriptor: ([Ljava/lang/String;)V
    flags: ACC_PUBLIC, ACC_STATIC
    Code:
      stack=4, locals=1, args_size=1
         0: getstatic     #2                  // Field java/lang/System.out:Ljava/io/PrintStream;
         3: new           #3                  // class java/lang/StringBuffer
         6: dup
         7: invokespecial #4                  // Method java/lang/StringBuffer."":()V
        10: ldc           #5                  // String Hello,
        12: invokevirtual #6                  // Method java/lang/StringBuffer.append:(Ljava/lang/String;)Ljava/lang/StringBuffer;
        15: aload_0
        16: iconst_0
        17: aaload
        18: invokevirtual #6                  // Method java/lang/StringBuffer.append:(Ljava/lang/String;)Ljava/lang/StringBuffer;
        21: invokevirtual #7                  // Method java/lang/StringBuffer.toString:()Ljava/lang/String;
        24: invokevirtual #8                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
        27: return

JDK 11 javap Output for HelloWorldStringBuffer.main(String[])

Classfile /C:/java/examples/helloWorld/classes/dustin/examples/HelloWorldStringBuffer.class
  Last modified Jan 28, 2019; size 623 bytes
  MD5 checksum e4a83b6bb799fd5478a65bc43e9af437
  Compiled from "HelloWorldStringBuffer.java"
public class dustin.examples.HelloWorldStringBuffer
  minor version: 0
  major version: 55
     . . .
  public static void main(java.lang.String[]);
    descriptor: ([Ljava/lang/String;)V
    flags: (0x0009) ACC_PUBLIC, ACC_STATIC
    Code:
      stack=4, locals=1, args_size=1
         0: getstatic     #2                  // Field java/lang/System.out:Ljava/io/PrintStream;
         3: new           #3                  // class java/lang/StringBuffer
         6: dup
         7: invokespecial #4                  // Method java/lang/StringBuffer."":()V
        10: ldc           #5                  // String Hello,
        12: invokevirtual #6                  // Method java/lang/StringBuffer.append:(Ljava/lang/String;)Ljava/lang/StringBuffer;
        15: aload_0
        16: iconst_0
        17: aaload
        18: invokevirtual #6                  // Method java/lang/StringBuffer.append:(Ljava/lang/String;)Ljava/lang/StringBuffer;
        21: invokevirtual #7                  // Method java/lang/StringBuffer.toString:()Ljava/lang/String;
        24: invokevirtual #8                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
        27: return

JDK 8 and JDK 11 Handling of Looped String Concatenation

For my last example of JEP 280 changes in action, I use a code sample that may offend some Java developers' sensibilities and perform a string concatenation within a loop. Keep in mind this is just an illustrative example and all will be okay, but don't try this at home.

JDK 8 javap Output for HelloWorldStringConcatComplex.main(String[])

Classfile /C:/java/examples/helloWorld/classes/dustin/examples/HelloWorldStringConcatComplex.class
  Last modified Jan 30, 2019; size 766 bytes
  MD5 checksum 772c4a283c812d49451b5b756aef55f1
  Compiled from "HelloWorldStringConcatComplex.java"
public class dustin.examples.HelloWorldStringConcatComplex
  minor version: 0
  major version: 52
     . . .
  public static void main(java.lang.String[]);
    descriptor: ([Ljava/lang/String;)V
    flags: ACC_PUBLIC, ACC_STATIC
    Code:
      stack=2, locals=3, args_size=1
         0: ldc           #2                  // String Hello
         2: astore_1
         3: iconst_0
         4: istore_2
         5: iload_2
         6: bipush        25
         8: if_icmpge     36
        11: new           #3                  // class java/lang/StringBuilder
        14: dup
        15: invokespecial #4                  // Method java/lang/StringBuilder."":()V
        18: aload_1
        19: invokevirtual #5                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
        22: iload_2
        23: invokevirtual #6                  // Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
        26: invokevirtual #7                  // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
        29: astore_1
        30: iinc          2, 1
        33: goto          5
        36: getstatic     #8                  // Field java/lang/System.out:Ljava/io/PrintStream;
        39: aload_1
        40: invokevirtual #9                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
        43: return

JDK 11 javap Output for HelloWorldStringConcatComplex.main(String[])

Classfile /C:/java/examples/helloWorld/classes/dustin/examples/HelloWorldStringConcatComplex.class
  Last modified Jan 30, 2019; size 1018 bytes
  MD5 checksum 967fef3e7625965ef060a831edb2a874
  Compiled from "HelloWorldStringConcatComplex.java"
public class dustin.examples.HelloWorldStringConcatComplex
  minor version: 0
  major version: 55
     . . .
  public static void main(java.lang.String[]);
    descriptor: ([Ljava/lang/String;)V
    flags: (0x0009) ACC_PUBLIC, ACC_STATIC
    Code:
      stack=2, locals=3, args_size=1
         0: ldc           #2                  // String Hello
         2: astore_1
         3: iconst_0
         4: istore_2
         5: iload_2
         6: bipush        25
         8: if_icmpge     25
        11: aload_1
        12: iload_2
        13: invokedynamic #3,  0              // InvokeDynamic #0:makeConcatWithConstants:(Ljava/lang/String;I)Ljava/lang/String;
        18: astore_1
        19: iinc          2, 1
        22: goto          5
        25: getstatic     #4                  // Field java/lang/System.out:Ljava/io/PrintStream;
        28: aload_1
        29: invokevirtual #5                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
        32: return

In the presentation "Enough java.lang.String to Hang Ourselves ...," Dr. Heinz M. Kabutz and Dmitry Vyazelenko discuss the JEP 280-introduced changes to Java string concatenation and summarize it succinctly, "+ is no longer compiled to StringBuilder." In their "Lessons from Today" slide, they state, "Use + instead of StringBuilder where possible" and "recompile classes for Java 9+."

The changes implemented in JDK 9 for JEP 280 "will enable future optimizations of String concatenation without requiring further changes to the bytecode emitted by javac." Interestingly, it was recently announced that JEP 348 ("Java Compiler Intrinsics for JDK APIs") is now a Candidate JEP and it aims to use a similar approach for compiling methods String::format and Objects::hash.

Monday, January 28, 2019

Custom Compact Number Patterns with JDK 12 Compact Number Formatting

The post "Compact Number Formatting Comes to JDK 12" has been the subject of discussion on a java subreddit thread. Concerns expressed in that thread related to presentation of the compact number formatting deal with digits of precision displayed and the compact number patterns displayed. The digits of precision can be addressed via use of CompactNumberFormat.setMinimumFractionDigits(int) and this approach is discussed in more detail in the post "Using Minimum Fractional Digits with JDK 12 Compact Number Formatting." The second issue (dislike for the compact patterns used in the pre-built CompactNumberFormat instances for certain languages) is addressed in this post.

As far as I can determine (and I certainly could be missing something), there's no method on CompactNumberFormat that allows for the compact number patterns to be set on an existing instance of CompactNumberFormat. However, if the constructor for CompactNumberFormat is used to obtain an instance of this class (rather than using one of the overloaded static factory methods on NumberFormat), then the compact number patterns can be supplied via that constructor to the new instance of CompactNumberFormat. This is demonstrated in the next code listing (also available on GitHub).

/**
 * Provides an instance of {@code CompactNumberFormat} that has been
 * custom created via that class's constructor and represents an
 * alternate Germany German representation to that provided by an
 * instance of {@code CompactNumberFormat} obtained via the static
 * factory methods of {@code NumberFormat} for {@code Locale.GERMANY}.
 *
 * @return Instance of {@code CompactNumberFormat} with customized
 *    alternate German compact pattern representations.
 */
private static CompactNumberFormat generateCustomizedGermanCompactNumberFormat()
{
   final String[] germanyGermanCompactPatterns
      = {"", "", "", "0k", "00k", "000k", "0m", "00m", "000m", "0b", "00b", "000b", "0t", "00t", "000t"};
   final DecimalFormat germanyGermanDecimalFormat
      = acquireDecimalFormat(Locale.GERMANY);
   final CompactNumberFormat customGermanCompactNumberFormat
      = new CompactNumberFormat(
         germanyGermanDecimalFormat.toPattern(),
         germanyGermanDecimalFormat.getDecimalFormatSymbols(),
         germanyGermanCompactPatterns);
   return customGermanCompactNumberFormat;
}

There are three items worth special emphasis in the above code listing:

  1. The CompactNumberFormat(String, DecimalFormatSymbols, String[]) constructor allows for an array of Strings to be passed to the instance to specify the compact number patterns.
  2. The nature of the String[] defining compact number patterns is defined in the Javadoc for CompactNumberFormat class, which states, "The compact number patterns are represented in a series of patterns where each pattern is used to format a range of numbers."
    • That same Javadoc provides a US locale based example that provides these values for the range 100 to 1014 and it was that example that I adapted here.
    • More or fewer than 15 patterns can be supplied, but the first supplied pattern always corresponds to 100. The Javadoc states, "There can be any number of patterns and they are strictly index based starting from the range 100."
    • For demonstration purposes here, I adapted the patterns from an observation on the subreddit thread referenced earlier. I don't know much about German, but the argument for SI-based suffixes made a lot of sense and this is merely for illustrative purposes anyway.
  3. In this example, I retrieved the other two arguments to the CompactNumberFormat constructor from a JDK-provided instance of DecimalFormat for Germany German locale (Locale.GERMANY). This ensured that the decimal pattern and decimal format symbols of my custom Germany German instance of CompactNumberFormat would be the same as those associated with the JDK-provided instance.

The code listing above showed a call to a method called acquireDecimalFormat(Locale) to get a JDK-provided instance of DecimalFormat for Locale.GERMANY. For completeness, that method is shown next.

/**
 * Provides an instance of {@code DecimalFormat} associated with
 * the provided instance of {@code Locale}.
 *
 * @param locale Locale for which an instance of {@code DecimalFormat}
 *    is desired.
 * @return Instance of {@code DecimalFormat} corresponding to the
 *    provided {@code Locale}.
 * @throws ClassCastException Thrown if I'm unable to acquire a
 *    {@code DecimalFormat} instance from the static factory method
 *    on class {@code NumberFormat} (the approach recommended in the
 *    class-level Javadoc for {@code DecimalFormat}).
 */
private static DecimalFormat acquireDecimalFormat(final Locale locale)
{
   final NumberFormat generalGermanyGermanFormat
      = NumberFormat.getInstance(locale);
   if (generalGermanyGermanFormat instanceof DecimalFormat)
   {
      return (DecimalFormat) generalGermanyGermanFormat;
   }
   throw new ClassCastException(
        "Unable to acquire DecimalFormat in recommended manner;"
      + " presented with NumberFormat type of '"
      + generalGermanyGermanFormat.getClass().getSimpleName()
      + "' instead.");
}

The code snippets shown above demonstrate how the compact number patterns can be customized for a given instance of CompactNumberFormat when the compact number patterns associated in instances of that class for a given Locale are not desirable. It'd be nice if there was a method on the CompactNumberFormat class to allow for overriding of some or all of the compact number patterns associated with an existing instance obtained via NumberFormat static factory class, but JDK 12 has entered rampdown phase 2.

JDK 13: What AggressiveOpts?

The Java VM flag -XX:+AggressiveOpts was deprecated in JDK 11 [see JDK-8199777 and JDK-8199778] "because its behavior is ill-defined." The "Problem" section of JDK-8199778 further explains (I added the emphasis):

AggressiveOpts has been used as a catch-all method of enabling various experimental performance features, mostly targeted to improve score on very specific benchmarks. Most things it affected has been removed or integrated over time, leaving the behavior of the flag ill-defined and prone to cause more issues than it'll solve. The only effect that the flag currently has is setting AutoBoxCacheMax = 20000 and BiasedLockingStartupDelay = 500. Both can be done manually by setting the corresponding flags on the command line.

According to the document "Java HotSpot VM Options," the -XX:+AggressiveOpts flag was added with J2SE 5 Update 6 to "turn on point performance compiler optimizations that are expected to be default in upcoming releases."

The article "Java's -XX:+AggressiveOpts: Can it slow you down?" examines the -XX:+AggressiveOpts VM flag in detail and looks at some benchmark comparisons. The article concludes, "By retaining legacy flags you make it less likely to get the benefits of newer, faster features in released JVMs."

A much older Kirk Pepperdine article "Poorly chosen Java HotSpot Garbage Collection Flags and how to fix them!" specifically calls out -XX:+AggressiveOpts as an example of a VM flag whose behavior is unknown. Pepperdine writes that recommendations for use of this flag have not changed since Java SE 5.

When the -XX:+AggressiveOpts flag is passed to the JDK 11 Java launcher, a warning is presented: "VM warning: Option AggressiveOpts was deprecated in version 11.0 and will likely be removed in a future release."

In JDK 12, -XX:+AggressiveOpts has been removed as advertised (JDK-8150552) and a warning was presented to anyone trying to use it in conjunction with the Java launcher. The next screen snapshot displays this warning message that states, "VM warning: Ignoring option AggressiveOpts; support was removed in 12.0" (from JDK 12 Early Access Build #29 [2019/1/24]).

In JDK 13 Early Access builds, the VM won't start if -XX:+AggressiveOpts is specified. This is shown in the next screen snapshot (JDK 13 Early Access Build #5 [2019/1/24]).

As the previous image shows, the VM fails to start in JDK 13 when the -XX:+AggressiveOpts flag is specified and it reports the error message, "Unrecognized VM option 'AggressiveOpts'."

The -XX:+AggressiveOpts flag was deprecated in JDK 11, is removed but only shows a warning when specified in JDK 12, and is removed and prevents the VM from starting when specified in JDK 13.

Saturday, January 26, 2019

Preview Features Questions and Answers from Java SE Expert Group Meeting Minutes

Iris Clark recently published the Java SE Expert Group Meeting Minutes for 14 January 2019 on the java-se-spec-experts OpenJDK mailing list. Clark's minutes are highly approachable and written in a conversational style. These notes provide insight into what members of the Java SE Expert Group consider current, near-term, and long-term topics for Java SE's future. They cover topics that include the proposal for hyphenated keywords (including whether to change the switch expression's current break to break-with.), a JDK 12 retrospective with focus on the preview feature process, JSR 388 (Java SE 13) startup, and concerns about API documentation licenses.

Perhaps most interesting to me from Clark's 14 January 2019 EG Meeting Minutes is the summary of the discussion on the introduction and subsequent dropping of the raw string literals feature specifically and the preview feature process more generally. According to the minutes, Simon Ritter "noted that removing the language feature from the release was evidence that preview features were working as they should." The minutes also state that Brian Goetz "believed that raw string literals were solidified prematurely and work on them continued," but that he expects "an updated proposal for Java SE 13."

The meeting minutes provide some additional interesting commentary on the preview features process:

  • How much time should be provided for preview feature discussion?
    • "Brian summarized that while the preview feature concept was good, the new release cadence may require adjustments for these feature to receive a reasonable amount of feedback."
    • Duration of "six months" is NOT "defined in the JEP describing preview features."
  • "Volker ... believed that enterprise users would not move to SE 12."
  • "Volker observed that the huge code bases of enterprise developers made it unlikely that they would ever compile with a preview feature enabled."
  • "Preview features are not permanent. Explicit action must be taken for them to remain in the next release. The choices are to become a permanent feature or to preview for another release, just like [incubating modules]."
  • Does "lack of feedback [on a preview feature] indicate ... nobody was using the feature or it worked well?"
    • Should a preview feature be used in the JDK itself to improve likelihood of use of feature and accompanying feedback?
    • Any release (including a Long-Term Support [LTS] release) can include a preview feature.

I found the 14 January Meeting Notes of the Java SE Expert Group to be an interesting read. It sounds like the preview feature approach is generally working and will continue to be used to introduce features for preview before committing them to the specification.

Monday, January 21, 2019

Running JAXB xjc Compiler with OpenJDK 11

As described in the post "APIs To Be Removed from Java 11," a JAXB implementation is no longer included with JDK 11. In this post, I look at using the xjc compiler provided with the JAXB (Java Architecture for XML Binding) reference implementation in conjunction with OpenJDK 11 to compile XML schema files into Java classes.

Prior to Java SE 6, developers wanting to use JAXB with a Java SE application needed to acquire a JAXB implementation separately because one was not provided with the Java distribution. A JAXB implementation was included with Java starting with Java SE 6. This was convenient in many cases, but made things a bit more difficult when developers wished to use a newer version or different implementation of JAXB than the one provided with the JDK. When modularity was introduced with OpenJDK 9, the JAXB implementation was moved into the java.xml.bind module and was marked as deprecated for removal. The JAXB implementation was removed altogether with JDK 11. This post looks into using JAXB's xjc compiler with OpenJDK 11.

Because JDK 11 no longer includes an implementation of JAXB, one must be acquired separately. For this post, I will be using version 2.3.0 of the JAXB reference implementation. The JDK version used in this post is JDK 11.0.2 General-Availability Release.

Running the xjc scripts without arguments leads to help/usage being rendered to standard output.

Usage: xjc [-options ...] <schema file/URL/dir/jar> ... [-b <bindinfo>] ...
If dir is specified, all schema files in it will be compiled.
If jar is specified, /META-INF/sun-jaxb.episode binding file will be compiled.
Options:
  -nv                :  do not perform strict validation of the input schema(s)
  -extension         :  allow vendor extensions - do not strictly follow the
                        Compatibility Rules and App E.2 from the JAXB Spec
  -b <file/dir>      :  specify external bindings files (each <file> must have its own -b)
                        If a directory is given, **/*.xjb is searched
  -d <dir>           :  generated files will go into this directory
  -p <pkg>           :  specifies the target package
  -m <name>          :  generate module-info.java with given Java module name
  -httpproxy <proxy> :  set HTTP/HTTPS proxy. Format is [user[:password]@]proxyHost:proxyPort
  -httpproxyfile <f> :  Works like -httpproxy but takes the argument in a file to protect password 
  -classpath <arg>   :  specify where to find user class files
  -catalog <file>    :  specify catalog files to resolve external entity references
                        support TR9401, XCatalog, and OASIS XML Catalog format.
  -readOnly          :  generated files will be in read-only mode
  -npa               :  suppress generation of package level annotations (**/package-info.java)
  -no-header         :  suppress generation of a file header with timestamp
  -target (2.0|2.1)  :  behave like XJC 2.0 or 2.1 and generate code that doesnt use any 2.2 features.
  -encoding <encoding> :  specify character encoding for generated source files
  -enableIntrospection :  enable correct generation of Boolean getters/setters to enable Bean Introspection apis 
  -disableXmlSecurity  :  disables XML security features when parsing XML documents 
  -contentForWildcard  :  generates content property for types with multiple xs:any derived elements 
  -xmlschema         :  treat input as W3C XML Schema (default)
  -dtd               :  treat input as XML DTD (experimental,unsupported)
  -wsdl              :  treat input as WSDL and compile schemas inside it (experimental,unsupported)
  -verbose           :  be extra verbose
  -quiet             :  suppress compiler output
  -help              :  display this help message
  -version           :  display version information
  -fullversion       :  display full version information


Extensions:
  -Xinject-code      :  inject specified Java code fragments into the generated code
  -Xlocator          :  enable source location support for generated code
  -Xsync-methods     :  generate accessor methods with the 'synchronized' keyword
  -mark-generated    :  mark the generated code as @javax.annotation.Generated
  -episode <FILE>    :  generate the episode file for separate compilation
  -Xpropertyaccessors :  Use XmlAccessType PROPERTY instead of FIELD for generated classes

The xjc compiler scripts (bash file and DOS batch file) are conveniences for invoking the jaxb-xjc.jar. The scripts invoke it as an executable JAR (java -jar) as shown in the following excerpts:

  • Windows version (xjc.bat):
    %JAVA% %XJC_OPTS% -jar "%JAXB_HOME%\lib\jaxb-xjc.jar" %*
  • Linux version (xjc.sh):
    exec "$JAVA" $XJC_OPTS -jar "$JAXB_HOME/lib/jaxb-xjc.jar" "$@"

As the script excerpts above show, an environmental variable XJC_OPTS is included in the invocation of the Java launcher. Unfortunately, the JAXB reference implementation JAR cannot be simply added to the classpath via -classpath because running excutable JARs with java -jar only respects the classpath designated within the executable JAR via the MANIFEST.MF's Class-Path (which entry exists in the jaxb-ri-2.3.0.jar as "Class-Path: jaxb-core.jar jaxb-impl.jar").

One way to approach this is to modify the script to use the JAR as a regular JAR (without -jar) and explicitly execute the class XJCFacade, so that the classpath can be explicitly provided to the Java launcher. This is demonstrated for the Windows xjc.bat script:

%JAVA% -cp C:\lib\javax.activation-api-1.2.0.jar;C:\jaxb-ri-2.3.0\lib\jaxb-xjc.jar com.sun.tools.xjc.XJCFacade %*

In addition to the JAXB reference implementation JAR jaxb-xjc.jar, I also needed to include the javax.activation-api-1.2.0.jar JAR on the classpath because the JavaBeans Application Framework (JAF) is a dependency that is also no longer delivered with the JDK (removed via same JEP 320 that removed JAXB).

It's also possible, of course, to not use the XJC scripts at all and to run the Java launcher directly. The script ensures that environment variable JAXB_HOME is set. This environment variable should point to the directory into which the JAXB reference implementation was expanded.

With these changes, the JAXB xjc compiler can be executed against an XSD on the command line using JDK 11.

Thursday, January 17, 2019

Using Minimum Fractional Digits with JDK 12 Compact Number Formatting

The post "Compact Number Formatting Comes to JDK 12" demonstrated the support added to NumberFormat in JDK 12 to support compact number formatting. The examples shown in that post only used the instances of NumberFormat returned by invocations of NumberFormat's new overloaded getCompactNumberInstance(-) methods and so therefore did not specify characteristics such as minimum fractional digits and maximum fractional digits. The results, in some cases, are less than desirable. Fortunately, NumberFormat does allow for minimum and maximum fractional digits to be specified and this post demonstrates how that can improve the output of the compact number formatting available with JDK 12.

The Javadoc documentation for new class java.text.CompactNumberFormat (currently available at https://download.java.net/java/early_access/jdk12/docs/api/java.base/java/text/CompactNumberFormat.html, but this location will change when JDK 12 is officially released in March) includes a section "Formatting" that states: "The default formatting behavior returns a formatted string with no fractional digits, however users can use the setMinimumFractionDigits(int) method to include the fractional part."

The code listing introduced in the original "Compact Number Formatting Comes to JDK 12" post (and which is available on GitHub) has been updated to demonstrate use of NumberFormat.setMinimumFractionDigits(int). An excerpt of that code is shown next and is followed by the accompanying output.

/**
 * Generates standardized map of labels to Compact Number Format
 * instances described by the labels. The instances of {@code NumberFormat}
 * are created with Locale and Style only and with the provided number
 * of minimum fractional digits.
 *
 * @return Mapping of label to an instance of a Compact Number Format
 *    consisting of a Locale, Style, and specified minimum number of fractional
 *    digits that is described by the label.
 */
private static Map<String, NumberFormat> generateCompactNumberFormats(
   final int minimumNumberFractionDigits)
{
   var numberFormats = generateCompactNumberFormats();
   numberFormats.forEach((label, numberFormat) ->
      numberFormat.setMinimumFractionDigits(minimumNumberFractionDigits));
   return numberFormats;
}


/**
 * Demonstrates compact number formatting in a variety of locales
 * and number formats against the provided {@code long} value and
 * with a minimum fractional digits of 1 specified.
 * @param numberToFormat Value of type {@code long} that is to be
 *    formatted using compact number formatting and a variety of
 *    locales and number formats and with a single minimal fractional
 *    digit.
 */
private static void demonstrateCompactNumberFormattingOneFractionalDigitMinimum(
   final long numberToFormat)
{
   final Map<String, NumberFormat> numberFormats = generateCompactNumberFormats(1);
   out.println(
      "Demonstrating Compact Number Formatting on long '" + numberToFormat
         + "' with 1 minimum fraction digit:");
   numberFormats.forEach((label, numberFormat) ->
      out.println("\t" +  label + ": " + numberFormat.format(numberToFormat))
   );
}
Demonstrating Compact Number Formatting on long '15' with 1 minimum fraction digit:
 Default: 15
 US/Long: 15
 UK/Short: 15
 UK/Long: 15
 FR/Short: 15
 FR/Long: 15
 DE/Short: 15
 DE/Long: 15
 IT/Short: 15
 IT/Long: 15
Demonstrating Compact Number Formatting on long '150' with 1 minimum fraction digit:
 Default: 150
 US/Long: 150
 UK/Short: 150
 UK/Long: 150
 FR/Short: 150
 FR/Long: 150
 DE/Short: 150
 DE/Long: 150
 IT/Short: 150
 IT/Long: 150
Demonstrating Compact Number Formatting on long '1500' with 1 minimum fraction digit:
 Default: 1.5K
 US/Long: 1.5 thousand
 UK/Short: 1.5K
 UK/Long: 1.5 thousand
 FR/Short: 1,5 k
 FR/Long: 1,5 millier
 DE/Short: 1.500
 DE/Long: 1,5 Tausend
 IT/Short: 1.500
 IT/Long: 1,5 mille
Demonstrating Compact Number Formatting on long '15000' with 1 minimum fraction digit:
 Default: 15.0K
 US/Long: 15.0 thousand
 UK/Short: 15.0K
 UK/Long: 15.0 thousand
 FR/Short: 15,0 k
 FR/Long: 15,0 mille
 DE/Short: 15.000
 DE/Long: 15,0 Tausend
 IT/Short: 15.000
 IT/Long: 15,0 mila
Demonstrating Compact Number Formatting on long '150000' with 1 minimum fraction digit:
 Default: 150.0K
 US/Long: 150.0 thousand
 UK/Short: 150.0K
 UK/Long: 150.0 thousand
 FR/Short: 150,0 k
 FR/Long: 150,0 mille
 DE/Short: 150.000
 DE/Long: 150,0 Tausend
 IT/Short: 150.000
 IT/Long: 150,0 mila
Demonstrating Compact Number Formatting on long '1500000' with 1 minimum fraction digit:
 Default: 1.5M
 US/Long: 1.5 million
 UK/Short: 1.5M
 UK/Long: 1.5 million
 FR/Short: 1,5 M
 FR/Long: 1,5 million
 DE/Short: 1,5 Mio.
 DE/Long: 1,5 Million
 IT/Short: 1,5 Mln
 IT/Long: 1,5 milione
Demonstrating Compact Number Formatting on long '15000000' with 1 minimum fraction digit:
 Default: 15.0M
 US/Long: 15.0 million
 UK/Short: 15.0M
 UK/Long: 15.0 million
 FR/Short: 15,0 M
 FR/Long: 15,0 million
 DE/Short: 15,0 Mio.
 DE/Long: 15,0 Millionen
 IT/Short: 15,0 Mln
 IT/Long: 15,0 milioni

As the example and output shown above demonstrate, use of NumberFormat.setMinimumFractionDigits(int) leads to compact number formatted output that is likely to be more aesthetically pleasing in many cases. There is a recent discussion "Compact Number Formatting and Fraction Digits" on the OpenJDK core-libs-dev mailing list that also discusses this ability to customize the compact number formatting output.

Wednesday, January 9, 2019

Proposal for Wider Range of Available Java Keywords

In the recent posts titled "We need more keywords, captain!" on the OpenJDK amber-spec-experts mailing list, Brian Goetz "proposes a possible move that will buy us some breathing room in the perpetual problem where the keyword-management tail wags the programming-model dog." His proposal is to "allow _hyphenated_ keywords where one or more of the terms are already keywords or reserved identifiers."

Goetz points out in the original post that Section 3.9 ("Keywords") of the Java Language Specification spells out the current keywords in Java and that these chosen keywords have been stable since Java's inception with the only changes being the addition of the highlighted keywords shown below (assert in JDK 1.4, enum in JDK 1.5, and _ in JDK 1.9):

abstract   continue   for          new         switch
assert     default    if           package     synchronized
boolean    do         goto         private     this
break      double     implements   protected   throw
byte       else       import       public      throws
case       enum       instanceof   return      transient
catch      extends    int          short       try
char       final      interface    static      void
class      finally    long         strictfp    volatile
const      float      native       super       while
_ (underscore)

Goetz's post writes about "several tools at our disposal" that have been or could be used when the set of pre-established keywords are not "suitable for expressing all the things we might ever want our language to express." Goetz emphasizes this point, "The lack of reasonable options for extending the syntax of the language threatens to become a significant impediment to language evolution."

Goetz provides significant background explanation regarding the downsides of the "several tools" he had mentioned earlier in the post and states, "We need a new source of keyword candidates." Goetz then proposes to "allow _hyphenated_ keywords where one or more of the terms are already keywords or reserved identifiers."

The Goetz post provides several examples to illustrate hyphenated keywords and how they might be used. Goetz emphasizes that this list is not in any way proposing these specific keywords for now, but is only providing them as illustrative examples. See the original post for brief descriptions of how some of these illustrative examples might be used.

  • non-null
  • non-final
  • package-private
  • public-read
  • null-checked
  • type-static
  • default-value
  • eventually-final
  • semi-final
  • exhaustive-switch
  • enum-class
  • annotation-class
  • record-class
  • this-class
  • this-return

Two other posts on the mailing list have already built on this original message. In a reply to his own message, Goetz writes that the expression switch could have (or still might be able to because switch expressions are currently a Preview Feature) support a keyword such as break-with rather than requiring developers to "disambiguate whether [a given break is a] labeled break or a value break." Guy Steele replied that he prefers break-return to break-with and that this particular single illustrative example "has [maybe] won me over to the idea of hyphenated keywords."

The Goetz post "We need more keywords, captain!" has seen some attention on Twitter as well. Sander Mak tweeted, "Another example of real-world programming language design by @BrianGoetz: http://mail.openjdk.java.net/pipermail/amber-spec-experts/2019-January/000945.html ... Love the diligence and forward-looking nature of these #Java language design discussions." Bruno Borges tweeted a quote from that same post: "We are convinced that @Java has a long life ahead of it, and developers are excited about new features that enable to them to write more expressive and reliable code."

The hyphenated keyword approach seems promising. Regardless of what happens, the Goetz post makes for an interesting read regarding the difficulty of adding or reapplying keywords to a long-lived programming language concerned about backwards compatibility. The post is also offers a peek into the types of issues, constraints, and trade-offs language designers must make.

Tuesday, January 8, 2019

Explicitly Naming Automatic Java Modules

Nicolas Fränkel recently published the surprising post "A hard look at the state of Java modularization." In that post, Fränkel provides the results of his investigation into support available in the 29 libraries referenced in the blog post "20 popular Java libraries" for modules introduced with JDK 9. Fränkel's investigation aimed to identify which of these popular Java libraries was "modularized" (fully implemented module defined in module-info.java) or provided at least an "automatic module name" via MANIFEST.MF even if the library isn't modularized.

Of the 29 popular Java libraries investigated, Fränkel identified only two (SLF4J 1.8.0-beta2 and JAXB 2.3.1) that are fully modularized. Of the remaining 27 libraries that are not fully modularized, 12 do have an automatic module name defined. That means, of course, that 15 of the 29 libraries (just over 50%) do not have any explicit support for modularity introduced with JDK 9!

In the post "Automatic-Module-Name: Calling all Java Library Maintainers" a little over one year ago, Sander Mak (one of the authors of Java 9 Modularity) describes "what needs to be done to move the Java library ecosystem toward modules." Mak explains that "support for the Java module system can be incrementally added to libraries" and uses this post to "explain the first step ... to becoming a Java module." Mak writes:

This first step boils down to picking a module name, and adding it as Automatic-Module-Name: <module name> entry to the library's MANIFEST.MF. That's it. With this first step you make your library usable as Java module without moving the library itself to Java 9 or creating a module descriptor for the library, yet.

In the "Automatic-Module-Name: Calling all Java Library Maintainers" post, Mak also provides guidance for providing an automatic name for a module. He recommends picking an explicit name for the module rather than relying on the ModuleFinder-based name derivation algorithm (module named based on JAR filename). Mak references Stephen Colebourne's post "Java SE 9 - JPMS automatic modules," in which Colebourne concludes, "Community members must at all costs avoid publishing modular jar files that depend on filenames." Incidentally, Colebourne's post "Java SE 9 - JPMS module naming" provides additional guidance on module naming.

The naming of the automatic module is significant because later changes to that name will cause backwards incompatibilities for the library. It is also important to not have the module name collide with others' libraries' module names. The recommended way of doing this is to use the root package name contained within the module, assuming that package uses the typical Java package naming convention to ensure uniqueness.

Mak also outlines in his post some "potential issues you need to verify" before adding the Automatic-Module-Name entry to the MANIFEST.MF file to avoid "false expectations." See the "Sanity-Check Your Library" section of Mak's post for the full list and detailed description of these issues which include not using internal JDK types and not having any classes in the unnamed package.

Before concluding my post, I am going to briefly present the difference between an explicitly named automatic module and an implicitly named automatic module. For this, I'll be using a JAR file generated from the code example in my previous post "Parsing Value from StreamCorruptedException: invalid stream header Message." The Java source is not that important for my purposes here other than to point out that the main class used in my post has the package name dustin.utilities.io.

The generated JAR file, which I've called io-examples.jar is not modularized (does not have a module-info file). The jar tool's option --describe-module can be used to quickly determine the automatic module name of the JAR file.

The next two screen snapshots show the results of running jar with the --describe-module option against the JAR. The first screen snapshot indicates the results when the JAR has nothing in its MANIFEST.MF file to indicate an automatic module name. The result is a automatic module named after the JAR name. The second screen snapshot shows the results from running jar --describe-module against the almost identical JAR except for the addition of the attribute Automatic-Module-Name: dustin.utilities.io to the JAR's MANIFEST.MF file. In that case, the automatic module is named the explicitly provided name provided in the manifest file (dustin.utilities.io).

Fränkel opens his post with the assertion, "With the coming of Java 11, the latest Long-Term Support, I think it's a good time to take a snapshot of the state of modularization." I too think that recent and pending changes make it more important for all of us in the Java community to start understanding Java's built-in modularity and its implications and to take steps toward more complete modularity support.

Additional (Previously Referenced) Resources

Monday, January 7, 2019

The JDK 13 Train Has Left the Station

JDK 12 [Java SE 12 Platform (JSR 386)] is still in Rampdown Phase 1, but initial work on JDK 13 [Java SE 13 Platform (JSR 388)] has already begun. Draft 26 of the Java SE 12 specification was announced approximately 10 hours before Draft 2 of the Java SE 13 specification was announced.

There are already early access builds of JDK 13 available for Linux, macOS, Windows, and Alpine Linux. As of this writing, the current JDK 13 early access build is #2 (3 January 2019). The JDK 13 Early-Access Release Notes do not yet contain anything of significance.

There are ten change sets associated with Build 2 and another ten change sets associated with JDK 13 Early Access Build 1. There are 33 issues addressed with that build (688 total bugs associated with JDK 13 for "build fix" as this writing).

The temporary home of the Javadoc-based API documentation for JDK 13 is also available at https://download.java.net/java/early_access/jdk13/docs/api/.

The OpenJDK JDK 13 project page describes the project's status: "The development repositories are open for bug fixes, small enhancements, and JEPs as proposed and tracked via the JEP Process."

Thursday, January 3, 2019

Parsing Value from StreamCorruptedException: invalid stream header Message

It is a relatively common occurrence to see StreamCorruptedExceptions thrown with a "reason" that states, "invalid stream header" and then provides the first part of that invalid stream header. Frequently, a helpful clue for identifying the cause of that exception is to understand what the invalid stream header is because that explains what is unexpected and causing the issue.

The StreamCorruptedException only has two constructors, one that accepts no arguments and one that accepts a single String describing the exception's "reason". This tells us that the "invalid stream header: XXXXXXXX" messages (where XXXXXXXX represents various invalid header details) are provided by the code that instantiates (and presumably throws) these StreamCorruptedExceptions rather than by that exception class itself. This means that it won't always necessarily be the same formatted message encountered with one of these exceptions, but in most common cases, the format is the same with "invalid stream header: " followed by the first portion of that invalid stream header.

This exception is commonly thrown by an ObjectInputStream. The Javadoc for that class has some useful details that help explain why the "StreamCorruptedException: invalid stream header" is encountered. The class-level Javadoc states, "Only objects that support the java.io.Serializable or java.io.Externalizable interface can be read from streams." The Javadoc for the ObjectInputStream​(InputStream) constructor states (I added the emphasis), "Creates an ObjectInputStream that reads from the specified InputStream. A serialization stream header is read from the stream and verified."

As the quoted Javadoc explains, ObjectInputStream should be used with serialized data. Many of the cases of the "StreamCorruptedException: invalid stream header" message occur when a text file (such as HTML, XML, JSON, etc.) is passed to this constructor rather than a Java serialized file.

The following are examples of "ASCII" values derived from "invalid stream header" messages associated with StreamCorruptedExceptions and reported online.

Invalid Stream Header Value (HEX) Corresponding Integers Corresponding
"ASCII" Value
Online References / Examples
00000000 000 000 000 000 https://stackoverflow.com/questions/44479323/exception-in-thread-main-java-io-streamcorruptedexception-invalid-stream-head
0A0A0A0A 010 010 010 010

 

 

https://issues.jenkins-ci.org/browse/JENKINS-35197
0A0A3C68 010 010 060 104

<h

https://developer.ibm.com/answers/questions/201983/what-does-javaiostreamcorruptedexception-invalid-s/
20646520 032 100 101 032 de https://stackoverflow.com/questions/2622716/java-invalid-stream-header-problem
30313031 048 049 048 049 0101 https://stackoverflow.com/questions/48946230/java-io-streamcorruptedexception-invalid-stream-header-30313031
32303138 050 048 049 056 2018 https://stackoverflow.com/questions/49878481/jpa-invalid-stream-header-32303138
3C21444F 060 033 068 079 <!DO https://github.com/metasfresh/metasfresh/issues/1335
3c48544d 060 072 084 077 <HTM http://forum.spring.io/forum/spring-projects/integration/jms/70353-java-io-streamcorruptedexception-invalid-stream-header
3C6F626A 060 111 098 106 <obj  
3C787364 060 120 115 100 <xsd https://stackoverflow.com/questions/29769191/java-io-streamcorruptedexception-invalid-stream-header-3c787364
41434544 065 067 069 068 ACED https://stackoverflow.com/questions/36677022/java-io-streamcorruptedexception-invalid-stream-header-41434544
48656C6C 072 101 108 108 Hell https://stackoverflow.com/questions/28298366/java-io-streamcorruptedexception-invalid-stream-header-48656c6c
4920616D 073 032 097 109 I am https://stackoverflow.com/questions/34435188/java-io-streamcorruptedexception-invalid-stream-header-4920616d
54656D70 084 101 109 112 Temp https://stackoverflow.com/a/50669243
54657374 084 101 115 116 Test java.io.StreamCorruptedException: invalid stream header: 54657374
54686973 084 104 105 115 This https://stackoverflow.com/questions/28354180/stanford-corenlp-streamcorruptedexception-invalid-stream-header-54686973
64617364 100 097 115 100 dasd https://stackoverflow.com/questions/50451100/java-io-streamcorruptedexception-invalid-stream-header-when-writing-to-the-stdo?noredirect=1&lq=1
70707070 112 112 112 112 pppp https://stackoverflow.com/questions/32858472/java-io-streamcorruptedexception-invalid-stream-header-70707070
72657175 114 101 113 117 requ https://stackoverflow.com/questions/8534124/java-io-streamcorruptedexception-invalid-stream-header-72657175
7371007E 115 113 000 126 sq ~ https://stackoverflow.com/questions/2939073/java-io-streamcorruptedexception-invalid-stream-header-7371007e
77617161 119 097 113 097 waqa https://coderanch.com/t/278717/java/StreamCorruptedException-invalid-stream-header
7B227061 123 034 112 097 {"pa https://stackoverflow.com/questions/9986672/streamcorruptedexception-invalid-stream-header

The above examples show the "StreamCorruptedException: invalid stream header" message occurring for cases where input streams representing text were passed to the constructor that expects Java serialized format. The highlighted row is especially interesting. That entry ("ACED" in "ASCII" character representation) looks like what is expected in all files serialized by Java's default serialization, but it's not quite correct.

The "Terminal Symbols and Constants" section of the Java Object Serialization Specification tells us that java.io.ObjectStreamConstants defines a constant STREAM_MAGIC that is the "Magic number that is written to the stream header." The specification further explains that ObjectStreamConstants.STREAM_MAGIC is defined as (short)0xaced and this can be verified in Java code if desired. The reason that particular entry led to an error is that it should be the hexadecimal representation that is "ACED" rather than the translated "ASCII" character representation. In other words, for that particular case, it was actually literal text "ACED" that was in the first bytes rather than bytes represented by the hexadecimal "ACED" representation.

There are many ways to translate the hexadecimal representation provided in the "StreamCorruptedException: invalid stream header" message to see if it translates to text that means something. If it is text, one knows that he or she is already off to a bad start as a binary serialized file should be used instead of text. The characters in that text can provide a further clue as to what type of text file was being accidentally provided. Here is one way to translate the provided hexadecimal representation to "ASCII" text using Java (available on GitHub):

private static String toAscii(final String hexInput)
{
   final int length = hexInput.length();
   final StringBuilder ascii = new StringBuilder();
   final StringBuilder integers = new StringBuilder();
   for (int i = 0; i < length; i+=2)
   {
      final String twoDigitHex = hexInput.substring(i, i+2);
      final int integer = Integer.parseInt(twoDigitHex, 16);
      ascii.append((char)integer);
      integers.append(String.format("%03d", integer)).append(" ");
   }
   return hexInput + " ==> " + integers.deleteCharAt(integers.length()-1).toString() + " ==> " + ascii.toString();
}

Streams of text inadvertently passed to ObjectInputStream's constructor are not the only cause of "StreamCorruptedException: invalid stream header". In fact, any InputStream (text or binary) that doesn't begin with the expected "stream magic" bytes (0xaced) will lead to this exception.

Wednesday, January 2, 2019

Restarting Java's Raw String Literals Discussion

It was announced in December 2018 that raw string literals would be dropped from JDK 12. Now, in the new year, discussion related to the design of raw string literals in Java has begun again.

In the post "Raw string literals -- restarting the discussion" on the amber-spec-experts OpenJDK mailing list, Brian Goetz references the explanation for dropping raw string literals preview feature from JDK 12 and suggests "restart[ing] the design discussion." Goetz summarizes the previous design discussions and decisions and lessons learned from the first take on raw string literals, discusses some design questions and trade-offs to be made, and then calls for input on three specific types of observation data:

  • "Data that supports or refutes the claim that our primary use cases are embedded JSON, HTML, XML, and SQL."
  • "Use cases we've left out..."
  • "Data (either Java or non-Java) on the use of various flavors of strings (raw, multi-line, etc) in real codebases..."

Jim Laskey posted two messages with the title "Enhancing Java String Literals Round 2" to the same amber-spec-experts mailing list and references an HTML version and a PDF version of an "RTL2" document that aids in the discussion of "Take Two" of raw string literals. Laskey outlines a "series of critical decision points that should be given thought, if not answers, before we propose a new design."

A few of the major decisions to be made as raw string literals for Java are reconsidered include these discussed in the aforementioned posts are listed here, but many more are contained in the posts:

  • Which is really more important to developers: "raw text" or "multi-line strings"?
  • Which character makes for the best delimiter for most Java developers and Java use cases?
  • How should incidental spacing be handled?

There has already been some feedback on the amber-dev OpenJDK mailing list. Stephen Colebourne provides "Extended string literals feedback" and Bruno Borges recommends "special assignment rather [than] special delimiters."

I often see developers complaining about certain language and API decisions after the decisions have been implemented. For anyone with strong feelings about the subject of raw string literals and multi-line strings in Java, now is an opportunity to make one's voice heard and to possibly influence the final design that will come to Java at some point in the future. Discussion has also started on the Java subreddit in two threads: "Raw string literals -- restarting the discussion" and "New RSL Proposal".