Monday, January 21, 2019

Running JAXB xjc Compiler with OpenJDK 11

As described in the post "APIs To Be Removed from Java 11," a JAXB implementation is no longer included with JDK 11. In this post, I look at using the xjc compiler provided with the JAXB (Java Architecture for XML Binding) reference implementation in conjunction with OpenJDK 11 to compile XML schema files into Java classes.

Prior to Java SE 6, developers wanting to use JAXB with a Java SE application needed to acquire a JAXB implementation separately because one was not provided with the Java distribution. A JAXB implementation was included with Java starting with Java SE 6. This was convenient in many cases, but made things a bit more difficult when developers wished to use a newer version or different implementation of JAXB than the one provided with the JDK. When modularity was introduced with OpenJDK 9, the JAXB implementation was moved into the java.xml.bind module and was marked as deprecated for removal. The JAXB implementation was removed altogether with JDK 11. This post looks into using JAXB's xjc compiler with OpenJDK 11.

Because JDK 11 no longer includes an implementation of JAXB, one must be acquired separately. For this post, I will be using version 2.3.0 of the JAXB reference implementation. The JDK version used in this post is JDK 11.0.2 General-Availability Release.

Running the xjc scripts without arguments leads to help/usage being rendered to standard output.

Usage: xjc [-options ...] <schema file/URL/dir/jar> ... [-b <bindinfo>] ...
If dir is specified, all schema files in it will be compiled.
If jar is specified, /META-INF/sun-jaxb.episode binding file will be compiled.
Options:
  -nv                :  do not perform strict validation of the input schema(s)
  -extension         :  allow vendor extensions - do not strictly follow the
                        Compatibility Rules and App E.2 from the JAXB Spec
  -b <file/dir>      :  specify external bindings files (each <file> must have its own -b)
                        If a directory is given, **/*.xjb is searched
  -d <dir>           :  generated files will go into this directory
  -p <pkg>           :  specifies the target package
  -m <name>          :  generate module-info.java with given Java module name
  -httpproxy <proxy> :  set HTTP/HTTPS proxy. Format is [user[:password]@]proxyHost:proxyPort
  -httpproxyfile <f> :  Works like -httpproxy but takes the argument in a file to protect password 
  -classpath <arg>   :  specify where to find user class files
  -catalog <file>    :  specify catalog files to resolve external entity references
                        support TR9401, XCatalog, and OASIS XML Catalog format.
  -readOnly          :  generated files will be in read-only mode
  -npa               :  suppress generation of package level annotations (**/package-info.java)
  -no-header         :  suppress generation of a file header with timestamp
  -target (2.0|2.1)  :  behave like XJC 2.0 or 2.1 and generate code that doesnt use any 2.2 features.
  -encoding <encoding> :  specify character encoding for generated source files
  -enableIntrospection :  enable correct generation of Boolean getters/setters to enable Bean Introspection apis 
  -disableXmlSecurity  :  disables XML security features when parsing XML documents 
  -contentForWildcard  :  generates content property for types with multiple xs:any derived elements 
  -xmlschema         :  treat input as W3C XML Schema (default)
  -dtd               :  treat input as XML DTD (experimental,unsupported)
  -wsdl              :  treat input as WSDL and compile schemas inside it (experimental,unsupported)
  -verbose           :  be extra verbose
  -quiet             :  suppress compiler output
  -help              :  display this help message
  -version           :  display version information
  -fullversion       :  display full version information


Extensions:
  -Xinject-code      :  inject specified Java code fragments into the generated code
  -Xlocator          :  enable source location support for generated code
  -Xsync-methods     :  generate accessor methods with the 'synchronized' keyword
  -mark-generated    :  mark the generated code as @javax.annotation.Generated
  -episode <FILE>    :  generate the episode file for separate compilation
  -Xpropertyaccessors :  Use XmlAccessType PROPERTY instead of FIELD for generated classes

The xjc compiler scripts (bash file and DOS batch file) are conveniences for invoking the jaxb-xjc.jar. The scripts invoke it as an executable JAR (java -jar) as shown in the following excerpts:

  • Windows version (xjc.bat):
    %JAVA% %XJC_OPTS% -jar "%JAXB_HOME%\lib\jaxb-xjc.jar" %*
  • Linux version (xjc.sh):
    exec "$JAVA" $XJC_OPTS -jar "$JAXB_HOME/lib/jaxb-xjc.jar" "$@"

As the script excerpts above show, an environment variable XJC_OPTS is included in the invocation of the Java launcher. Unfortunately, the JAXB reference implementation JAR cannot simply be added to the classpath via -classpath because running executable JARs with java -jar only respects the classpath designated within the executable JAR via the MANIFEST.MF's Class-Path (an entry that exists in the jaxb-ri-2.3.0.jar as "Class-Path: jaxb-core.jar jaxb-impl.jar").

One way to approach this is to modify the script to use the JAR as a regular JAR (without -jar) and explicitly execute the class XJCFacade, so that the classpath can be explicitly provided to the Java launcher. This is demonstrated for the Windows xjc.bat script:

%JAVA% -cp C:\lib\javax.activation-api-1.2.0.jar;C:\jaxb-ri-2.3.0\lib\jaxb-xjc.jar com.sun.tools.xjc.XJCFacade %*

In addition to the JAXB reference implementation JAR jaxb-xjc.jar, I also needed to include the javax.activation-api-1.2.0.jar JAR on the classpath because the JavaBeans Activation Framework (JAF) is a dependency that is also no longer delivered with the JDK (removed via the same JEP 320 that removed JAXB).

It's also possible, of course, to not use the XJC scripts at all and to run the Java launcher directly. The script ensures that environment variable JAXB_HOME is set. This environment variable should point to the directory into which the JAXB reference implementation was expanded.
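Running the Java launcher directly amounts to assembling the same command line that the modified script uses. The following is only a sketch of that idea in Java: the class name XjcLauncherSketch, the -d target directory "generated", and the schema name "example.xsd" are hypothetical, and the JAR locations are the ones from the Windows example above, which will differ on other machines.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

final class XjcLauncherSketch
{
   /**
    * Builds the command line for running xjc through the Java launcher
    * with an explicit classpath instead of "java -jar".
    */
   static List<String> buildCommand(
      final String activationJar, final String xjcJar, final String... xjcArguments)
   {
      final List<String> command = new ArrayList<>();
      command.add("java");
      command.add("-cp");
      // File.pathSeparator is ";" on Windows and ":" on Linux/macOS.
      command.add(String.join(File.pathSeparator, activationJar, xjcJar));
      command.add("com.sun.tools.xjc.XJCFacade");
      command.addAll(List.of(xjcArguments));
      return command;
   }

   public static void main(final String[] arguments)
   {
      // Hypothetical paths and arguments; adjust to the local installation.
      final List<String> command = buildCommand(
         "C:\\lib\\javax.activation-api-1.2.0.jar",
         "C:\\jaxb-ri-2.3.0\\lib\\jaxb-xjc.jar",
         "-d", "generated", "example.xsd");
      System.out.println(String.join(" ", command));
      // To actually run it, something like the following could be used:
      // new ProcessBuilder(command).inheritIO().start().waitFor();
   }
}
```

The sketch only prints the command; the commented-out ProcessBuilder line shows one way it might be executed once the paths are correct.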

With these changes, the JAXB xjc compiler can be executed against an XSD on the command line using JDK 11.

Thursday, January 17, 2019

Using Minimum Fractional Digits with JDK 12 Compact Number Formatting

The post "Compact Number Formatting Comes to JDK 12" demonstrated the support added to NumberFormat in JDK 12 for compact number formatting. The examples shown in that post only used the instances of NumberFormat returned by invocations of NumberFormat's new overloaded getCompactNumberInstance(-) methods and therefore did not specify characteristics such as minimum fractional digits and maximum fractional digits. The results, in some cases, are less than desirable. Fortunately, NumberFormat does allow for minimum and maximum fractional digits to be specified and this post demonstrates how that can improve the output of the compact number formatting available with JDK 12.

The code listing introduced in the original "Compact Number Formatting Comes to JDK 12" post (and which is available on GitHub) has been updated to demonstrate use of NumberFormat.setMinimumFractionDigits(int). An excerpt of that code is shown next and is followed by the accompanying output.

/**
 * Generates standardized map of labels to Compact Number Format
 * instances described by the labels. The instances of {@code NumberFormat}
 * are created with Locale and Style only and with the provided number
 * of minimum fractional digits.
 *
 * @return Mapping of label to an instance of a Compact Number Format
 *    consisting of a Locale, Style, and specified minimum number of fractional
 *    digits that is described by the label.
 */
private static Map<String, NumberFormat> generateCompactNumberFormats(
   final int minimumNumberFractionDigits)
{
   var numberFormats = generateCompactNumberFormats();
   numberFormats.forEach((label, numberFormat) ->
      numberFormat.setMinimumFractionDigits(minimumNumberFractionDigits));
   return numberFormats;
}


/**
 * Demonstrates compact number formatting in a variety of locales
 * and number formats against the provided {@code long} value and
 * with a minimum fractional digits of 1 specified.
 * @param numberToFormat Value of type {@code long} that is to be
 *    formatted using compact number formatting and a variety of
 *    locales and number formats and with a single minimal fractional
 *    digit.
 */
private static void demonstrateCompactNumberFormattingOneFractionalDigitMinimum(
   final long numberToFormat)
{
   final Map<String, NumberFormat> numberFormats = generateCompactNumberFormats(1);
   out.println(
      "Demonstrating Compact Number Formatting on long '" + numberToFormat
         + "' with 1 minimum fraction digit:");
   numberFormats.forEach((label, numberFormat) ->
      out.println("\t" +  label + ": " + numberFormat.format(numberToFormat))
   );
}

Demonstrating Compact Number Formatting on long '15' with 1 minimum fraction digit:
 Default: 15
 US/Long: 15
 UK/Short: 15
 UK/Long: 15
 FR/Short: 15
 FR/Long: 15
 DE/Short: 15
 DE/Long: 15
 IT/Short: 15
 IT/Long: 15
Demonstrating Compact Number Formatting on long '150' with 1 minimum fraction digit:
 Default: 150
 US/Long: 150
 UK/Short: 150
 UK/Long: 150
 FR/Short: 150
 FR/Long: 150
 DE/Short: 150
 DE/Long: 150
 IT/Short: 150
 IT/Long: 150
Demonstrating Compact Number Formatting on long '1500' with 1 minimum fraction digit:
 Default: 1.5K
 US/Long: 1.5 thousand
 UK/Short: 1.5K
 UK/Long: 1.5 thousand
 FR/Short: 1,5 k
 FR/Long: 1,5 millier
 DE/Short: 1.500
 DE/Long: 1,5 Tausend
 IT/Short: 1.500
 IT/Long: 1,5 mille
Demonstrating Compact Number Formatting on long '15000' with 1 minimum fraction digit:
 Default: 15.0K
 US/Long: 15.0 thousand
 UK/Short: 15.0K
 UK/Long: 15.0 thousand
 FR/Short: 15,0 k
 FR/Long: 15,0 mille
 DE/Short: 15.000
 DE/Long: 15,0 Tausend
 IT/Short: 15.000
 IT/Long: 15,0 mila
Demonstrating Compact Number Formatting on long '150000' with 1 minimum fraction digit:
 Default: 150.0K
 US/Long: 150.0 thousand
 UK/Short: 150.0K
 UK/Long: 150.0 thousand
 FR/Short: 150,0 k
 FR/Long: 150,0 mille
 DE/Short: 150.000
 DE/Long: 150,0 Tausend
 IT/Short: 150.000
 IT/Long: 150,0 mila
Demonstrating Compact Number Formatting on long '1500000' with 1 minimum fraction digit:
 Default: 1.5M
 US/Long: 1.5 million
 UK/Short: 1.5M
 UK/Long: 1.5 million
 FR/Short: 1,5 M
 FR/Long: 1,5 million
 DE/Short: 1,5 Mio.
 DE/Long: 1,5 Million
 IT/Short: 1,5 Mln
 IT/Long: 1,5 milione
Demonstrating Compact Number Formatting on long '15000000' with 1 minimum fraction digit:
 Default: 15.0M
 US/Long: 15.0 million
 UK/Short: 15.0M
 UK/Long: 15.0 million
 FR/Short: 15,0 M
 FR/Long: 15,0 million
 DE/Short: 15,0 Mio.
 DE/Long: 15,0 Millionen
 IT/Short: 15,0 Mln
 IT/Long: 15,0 milioni

As the example and output shown above demonstrate, use of NumberFormat.setMinimumFractionDigits(int) leads to compact number formatted output that is likely to be more aesthetically pleasing in many cases. There is a recent discussion "Compact Number Formatting and Fraction Digits" on the OpenJDK core-libs-dev mailing list that also discusses this ability to customize the compact number formatting output.
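For readers who want to try this without the full listing, here is a minimal, self-contained sketch. The class name CompactFractionDigitsDemo and the helper formatCompact are my own for illustration, and I arbitrarily use Locale.US with the SHORT style; it requires JDK 12 or later.

```java
import java.text.NumberFormat;
import java.util.Locale;

final class CompactFractionDigitsDemo
{
   /**
    * Formats the provided value with a US/SHORT compact number format
    * that uses the provided minimum number of fraction digits.
    */
   static String formatCompact(final long value, final int minimumFractionDigits)
   {
      final NumberFormat format =
         NumberFormat.getCompactNumberInstance(Locale.US, NumberFormat.Style.SHORT);
      format.setMinimumFractionDigits(minimumFractionDigits);
      return format.format(value);
   }

   public static void main(final String[] arguments)
   {
      for (final long value : new long[] {1_500L, 15_000L, 1_500_000L})
      {
         // Print the same value with zero and with one minimum fraction digit.
         System.out.println(
            value + ": " + formatCompact(value, 0) + " vs. " + formatCompact(value, 1));
      }
   }
}
```

Comparing the two columns of output side by side makes it easy to see what the minimum fraction digit setting changes for each value.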

Wednesday, January 9, 2019

Proposal for Wider Range of Available Java Keywords

In the recent posts titled "We need more keywords, captain!" on the OpenJDK amber-spec-experts mailing list, Brian Goetz "proposes a possible move that will buy us some breathing room in the perpetual problem where the keyword-management tail wags the programming-model dog." His proposal is to "allow _hyphenated_ keywords where one or more of the terms are already keywords or reserved identifiers."

Goetz points out in the original post that Section 3.9 ("Keywords") of the Java Language Specification spells out the current keywords in Java and that these chosen keywords have been stable since Java's inception, with the only changes being the addition of three of the keywords shown below (assert in JDK 1.4, enum in JDK 1.5, and _ in JDK 9):

abstract   continue   for          new         switch
assert     default    if           package     synchronized
boolean    do         goto         private     this
break      double     implements   protected   throw
byte       else       import       public      throws
case       enum       instanceof   return      transient
catch      extends    int          short       try
char       final      interface    static      void
class      finally    long         strictfp    volatile
const      float      native       super       while
_ (underscore)

Goetz writes about "several tools at our disposal" that have been or could be used when the set of pre-established keywords is not "suitable for expressing all the things we might ever want our language to express." Goetz emphasizes this point: "The lack of reasonable options for extending the syntax of the language threatens to become a significant impediment to language evolution."

Goetz provides significant background explanation regarding the downsides of the "several tools" he had mentioned earlier in the post and states, "We need a new source of keyword candidates." Goetz then proposes to "allow _hyphenated_ keywords where one or more of the terms are already keywords or reserved identifiers."

The Goetz post provides several examples to illustrate hyphenated keywords and how they might be used. Goetz emphasizes that this list is not in any way proposing these specific keywords for now, but is only providing them as illustrative examples. See the original post for brief descriptions of how some of these illustrative examples might be used.

  • non-null
  • non-final
  • package-private
  • public-read
  • null-checked
  • type-static
  • default-value
  • eventually-final
  • semi-final
  • exhaustive-switch
  • enum-class
  • annotation-class
  • record-class
  • this-class
  • this-return

Two other posts on the mailing list have already built on this original message. In a reply to his own message, Goetz writes that switch expressions could have supported (or still might, because switch expressions are currently a Preview Feature) a keyword such as break-with rather than requiring developers to "disambiguate whether [a given break is a] labeled break or a value break." Guy Steele replied that he prefers break-return to break-with and that this particular single illustrative example "has [maybe] won me over to the idea of hyphenated keywords."

The Goetz post "We need more keywords, captain!" has seen some attention on Twitter as well. Sander Mak tweeted, "Another example of real-world programming language design by @BrianGoetz: http://mail.openjdk.java.net/pipermail/amber-spec-experts/2019-January/000945.html ... Love the diligence and forward-looking nature of these #Java language design discussions." Bruno Borges tweeted a quote from that same post: "We are convinced that @Java has a long life ahead of it, and developers are excited about new features that enable to them to write more expressive and reliable code."

The hyphenated keyword approach seems promising. Regardless of what happens, the Goetz post makes for an interesting read regarding the difficulty of adding or reapplying keywords to a long-lived programming language concerned about backwards compatibility. The post also offers a peek into the types of issues, constraints, and trade-offs language designers must weigh.

Tuesday, January 8, 2019

Explicitly Naming Automatic Java Modules

Nicolas Fränkel recently published the surprising post "A hard look at the state of Java modularization." In that post, Fränkel provides the results of his investigation into support available in the 29 libraries referenced in the blog post "20 popular Java libraries" for modules introduced with JDK 9. Fränkel's investigation aimed to identify which of these popular Java libraries were "modularized" (fully implemented module defined in module-info.java) or provided at least an "automatic module name" via MANIFEST.MF even if the library isn't modularized.

Of the 29 popular Java libraries investigated, Fränkel identified only two (SLF4J 1.8.0-beta2 and JAXB 2.3.1) that are fully modularized. Of the remaining 27 libraries that are not fully modularized, 12 do have an automatic module name defined. That means, of course, that 15 of the 29 libraries (just over 50%) do not have any explicit support for modularity introduced with JDK 9!

In the post "Automatic-Module-Name: Calling all Java Library Maintainers" a little over one year ago, Sander Mak (one of the authors of Java 9 Modularity) describes "what needs to be done to move the Java library ecosystem toward modules." Mak explains that "support for the Java module system can be incrementally added to libraries" and uses this post to "explain the first step ... to becoming a Java module." Mak writes:

This first step boils down to picking a module name, and adding it as Automatic-Module-Name: <module name> entry to the library's MANIFEST.MF. That's it. With this first step you make your library usable as Java module without moving the library itself to Java 9 or creating a module descriptor for the library, yet.

In the "Automatic-Module-Name: Calling all Java Library Maintainers" post, Mak also provides guidance for providing an automatic name for a module. He recommends picking an explicit name for the module rather than relying on the ModuleFinder-based name derivation algorithm (module named based on JAR filename). Mak references Stephen Colebourne's post "Java SE 9 - JPMS automatic modules," in which Colebourne concludes, "Community members must at all costs avoid publishing modular jar files that depend on filenames." Incidentally, Colebourne's post "Java SE 9 - JPMS module naming" provides additional guidance on module naming.

The naming of the automatic module is significant because later changes to that name will cause backwards incompatibilities for the library. It is also important that the module name not collide with other libraries' module names. The recommended way of doing this is to use the root package name contained within the module, assuming that package uses the typical Java package naming convention to ensure uniqueness.

Mak also outlines in his post some "potential issues you need to verify" before adding the Automatic-Module-Name entry to the MANIFEST.MF file to avoid "false expectations." See the "Sanity-Check Your Library" section of Mak's post for the full list and detailed description of these issues which include not using internal JDK types and not having any classes in the unnamed package.

Before concluding my post, I am going to briefly present the difference between an explicitly named automatic module and an implicitly named automatic module. For this, I'll be using a JAR file generated from the code example in my previous post "Parsing Value from StreamCorruptedException: invalid stream header Message." The Java source is not that important for my purposes here other than to point out that the main class used in my post has the package name dustin.utilities.io.

The generated JAR file, which I've called io-examples.jar is not modularized (does not have a module-info file). The jar tool's option --describe-module can be used to quickly determine the automatic module name of the JAR file.

The next two screen snapshots show the results of running jar with the --describe-module option against the JAR. The first screen snapshot indicates the results when the JAR has nothing in its MANIFEST.MF file to indicate an automatic module name. The result is an automatic module named after the JAR name. The second screen snapshot shows the results from running jar --describe-module against an almost identical JAR that differs only by the addition of the attribute Automatic-Module-Name: dustin.utilities.io to the JAR's MANIFEST.MF file. In that case, the automatic module takes the name explicitly provided in the manifest file (dustin.utilities.io).
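The same difference can be observed programmatically with java.lang.module.ModuleFinder, which applies the name derivation that Mak recommends not relying upon. The following is a rough sketch rather than the code behind the snapshots: the class AutomaticModuleNameDemo and its helper methods are hypothetical, and the JARs it writes to temporary directories contain only a manifest (no classes), which is enough for the module name to be derived.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.lang.module.ModuleDescriptor;
import java.lang.module.ModuleFinder;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.Attributes;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

final class AutomaticModuleNameDemo
{
   /** Writes a manifest-only JAR, optionally with an Automatic-Module-Name. */
   static Path writeJar(
      final Path directory, final String fileName, final String automaticModuleName)
      throws IOException
   {
      final Manifest manifest = new Manifest();
      final Attributes attributes = manifest.getMainAttributes();
      // Without Manifest-Version, no main attributes are written at all.
      attributes.put(Attributes.Name.MANIFEST_VERSION, "1.0");
      if (automaticModuleName != null)
      {
         attributes.putValue("Automatic-Module-Name", automaticModuleName);
      }
      final Path jar = directory.resolve(fileName);
      try (OutputStream out = Files.newOutputStream(jar);
           JarOutputStream jarOut = new JarOutputStream(out, manifest))
      {
         // Intentionally empty: the manifest is the JAR's only entry.
      }
      return jar;
   }

   /** Returns the descriptor of the single module found in the given JAR. */
   static ModuleDescriptor describe(final Path jar)
   {
      return ModuleFinder.of(jar).findAll().iterator().next().descriptor();
   }

   public static void main(final String[] arguments) throws IOException
   {
      final Path implicitJar =
         writeJar(Files.createTempDirectory("implicit"), "io-examples.jar", null);
      final Path explicitJar =
         writeJar(Files.createTempDirectory("explicit"), "io-examples.jar", "dustin.utilities.io");
      for (final Path jar : new Path[] {implicitJar, explicitJar})
      {
         final ModuleDescriptor descriptor = describe(jar);
         System.out.println(descriptor.name() + " (automatic: " + descriptor.isAutomatic() + ")");
      }
   }
}
```

For the JAR without the manifest attribute, the name is derived from the file name io-examples.jar (the hyphen becomes a dot); for the other, the manifest's name is used as-is.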

Fränkel opens his post with the assertion, "With the coming of Java 11, the latest Long-Term Support, I think it's a good time to take a snapshot of the state of modularization." I too think that recent and pending changes make it more important for all of us in the Java community to start understanding Java's built-in modularity and its implications and to take steps toward more complete modularity support.

Monday, January 7, 2019

The JDK 13 Train Has Left the Station

JDK 12 [Java SE 12 Platform (JSR 386)] is still in Rampdown Phase 1, but initial work on JDK 13 [Java SE 13 Platform (JSR 388)] has already begun. Draft 26 of the Java SE 12 specification was announced approximately 10 hours before Draft 2 of the Java SE 13 specification was announced.

There are already early access builds of JDK 13 available for Linux, macOS, Windows, and Alpine Linux. As of this writing, the current JDK 13 early access build is #2 (3 January 2019). The JDK 13 Early-Access Release Notes do not yet contain anything of significance.

There are ten change sets associated with Build 2 and another ten change sets associated with JDK 13 Early Access Build 1. There are 33 issues addressed with that build (688 total bugs associated with JDK 13 for "build fix" as of this writing).

The temporary home of the Javadoc-based API documentation for JDK 13 is also available at https://download.java.net/java/early_access/jdk13/docs/api/.

The OpenJDK JDK 13 project page describes the project's status: "The development repositories are open for bug fixes, small enhancements, and JEPs as proposed and tracked via the JEP Process."

Thursday, January 3, 2019

Parsing Value from StreamCorruptedException: invalid stream header Message

It is a relatively common occurrence to see StreamCorruptedExceptions thrown with a "reason" that states, "invalid stream header" and then provides the first part of that invalid stream header. Frequently, a helpful clue for identifying the cause of that exception is to understand what the invalid stream header is because that explains what is unexpected and causing the issue.

The StreamCorruptedException only has two constructors, one that accepts no arguments and one that accepts a single String describing the exception's "reason". This tells us that the "invalid stream header: XXXXXXXX" messages (where XXXXXXXX represents various invalid header details) are provided by the code that instantiates (and presumably throws) these StreamCorruptedExceptions rather than by that exception class itself. This means that it won't always necessarily be the same formatted message encountered with one of these exceptions, but in most common cases, the format is the same with "invalid stream header: " followed by the first portion of that invalid stream header.

This exception is commonly thrown by an ObjectInputStream. The Javadoc for that class has some useful details that help explain why the "StreamCorruptedException: invalid stream header" is encountered. The class-level Javadoc states, "Only objects that support the java.io.Serializable or java.io.Externalizable interface can be read from streams." The Javadoc for the ObjectInputStream(InputStream) constructor states (I added the emphasis), "Creates an ObjectInputStream that reads from the specified InputStream. A serialization stream header is read from the stream and verified."

As the quoted Javadoc explains, ObjectInputStream should be used with serialized data. Many of the cases of the "StreamCorruptedException: invalid stream header" message occur when a text file (such as HTML, XML, JSON, etc.) is passed to this constructor rather than a Java serialized file.
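This situation is easy to reproduce. The sketch below (the class InvalidStreamHeaderDemo and its method are hypothetical names for illustration) passes plain text to the ObjectInputStream constructor and returns the message of the resulting exception. It assumes the text is at least four bytes long; shorter input would instead fail with an EOFException while the header is being read.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.StreamCorruptedException;
import java.nio.charset.StandardCharsets;

final class InvalidStreamHeaderDemo
{
   /**
    * Passes the bytes of the provided text to ObjectInputStream's
    * constructor and returns the message of the resulting
    * StreamCorruptedException (or a note if none was thrown).
    */
   static String headerMessageFor(final String text) throws IOException
   {
      final byte[] bytes = text.getBytes(StandardCharsets.US_ASCII);
      try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes)))
      {
         return "no exception thrown";
      }
      catch (StreamCorruptedException exception)
      {
         return exception.getMessage();
      }
   }

   public static void main(final String[] arguments) throws IOException
   {
      // The first four bytes of "Hello, world" are 0x48 0x65 0x6C 0x6C ("Hell").
      System.out.println(headerMessageFor("Hello, world"));
   }
}
```

The hexadecimal digits at the end of the message are simply the first four bytes of whatever was passed in, which is why they are such a useful clue.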

The following are examples of "ASCII" values derived from "invalid stream header" messages associated with StreamCorruptedExceptions and reported online.

Invalid Stream Header Value (HEX)  Corresponding Integers  Corresponding "ASCII" Value  Online References / Examples
00000000                           000 000 000 000         (4 NUL characters)           https://stackoverflow.com/questions/44479323/exception-in-thread-main-java-io-streamcorruptedexception-invalid-stream-head
0A0A0A0A                           010 010 010 010         (4 line feeds)               https://issues.jenkins-ci.org/browse/JENKINS-35197
0A0A3C68                           010 010 060 104         (2 line feeds)<h             https://developer.ibm.com/answers/questions/201983/what-does-javaiostreamcorruptedexception-invalid-s/
20646520                           032 100 101 032         (space)de(space)             https://stackoverflow.com/questions/2622716/java-invalid-stream-header-problem
30313031                           048 049 048 049         0101                         https://stackoverflow.com/questions/48946230/java-io-streamcorruptedexception-invalid-stream-header-30313031
32303138                           050 048 049 056         2018                         https://stackoverflow.com/questions/49878481/jpa-invalid-stream-header-32303138
3C21444F                           060 033 068 079         <!DO                         https://github.com/metasfresh/metasfresh/issues/1335
3c48544d                           060 072 084 077         <HTM                         http://forum.spring.io/forum/spring-projects/integration/jms/70353-java-io-streamcorruptedexception-invalid-stream-header
3C6F626A                           060 111 098 106         <obj
3C787364                           060 120 115 100         <xsd                         https://stackoverflow.com/questions/29769191/java-io-streamcorruptedexception-invalid-stream-header-3c787364
41434544                           065 067 069 068         ACED                         https://stackoverflow.com/questions/36677022/java-io-streamcorruptedexception-invalid-stream-header-41434544
48656C6C                           072 101 108 108         Hell                         https://stackoverflow.com/questions/28298366/java-io-streamcorruptedexception-invalid-stream-header-48656c6c
4920616D                           073 032 097 109         I am                         https://stackoverflow.com/questions/34435188/java-io-streamcorruptedexception-invalid-stream-header-4920616d
54656D70                           084 101 109 112         Temp                         https://stackoverflow.com/a/50669243
54657374                           084 101 115 116         Test                         java.io.StreamCorruptedException: invalid stream header: 54657374
54686973                           084 104 105 115         This                         https://stackoverflow.com/questions/28354180/stanford-corenlp-streamcorruptedexception-invalid-stream-header-54686973
64617364                           100 097 115 100         dasd                         https://stackoverflow.com/questions/50451100/java-io-streamcorruptedexception-invalid-stream-header-when-writing-to-the-stdo?noredirect=1&lq=1
70707070                           112 112 112 112         pppp                         https://stackoverflow.com/questions/32858472/java-io-streamcorruptedexception-invalid-stream-header-70707070
72657175                           114 101 113 117         requ                         https://stackoverflow.com/questions/8534124/java-io-streamcorruptedexception-invalid-stream-header-72657175
7371007E                           115 113 000 126         sq(NUL)~                     https://stackoverflow.com/questions/2939073/java-io-streamcorruptedexception-invalid-stream-header-7371007e
77617161                           119 097 113 097         waqa                         https://coderanch.com/t/278717/java/StreamCorruptedException-invalid-stream-header
7B227061                           123 034 112 097         {"pa                         https://stackoverflow.com/questions/9986672/streamcorruptedexception-invalid-stream-header

The above examples show the "StreamCorruptedException: invalid stream header" message occurring for cases where input streams representing text were passed to the constructor that expects Java serialized format. The row for 41434544 is especially interesting. That entry ("ACED" in "ASCII" character representation) looks like what is expected in all files serialized by Java's default serialization, but it's not quite correct.

The "Terminal Symbols and Constants" section of the Java Object Serialization Specification tells us that java.io.ObjectStreamConstants defines a constant STREAM_MAGIC that is the "Magic number that is written to the stream header." The specification further explains that ObjectStreamConstants.STREAM_MAGIC is defined as (short)0xaced and this can be verified in Java code if desired. The reason that particular entry led to an error is that it should be the hexadecimal representation that is "ACED" rather than the translated "ASCII" character representation. In other words, for that particular case, it was actually literal text "ACED" that was in the first bytes rather than bytes represented by the hexadecimal "ACED" representation.
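Here is a small sketch of how that verification might look. The class StreamMagicCheck and its helper methods are hypothetical names of my own; the sketch compares the constant with (short)0xaced and then contrasts the first four bytes of a genuinely serialized object with the first four bytes of the literal text "ACED".

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamConstants;
import java.nio.charset.StandardCharsets;

final class StreamMagicCheck
{
   /** Serializes the provided object with Java's default serialization. */
   static byte[] serialize(final Object value) throws IOException
   {
      final ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes))
      {
         out.writeObject(value);
      }
      return bytes.toByteArray();
   }

   /** Renders the first four bytes of the provided data as hexadecimal. */
   static String headerHex(final byte[] data)
   {
      return String.format("%02X%02X%02X%02X", data[0], data[1], data[2], data[3]);
   }

   public static void main(final String[] arguments) throws IOException
   {
      System.out.println(
         "STREAM_MAGIC is (short)0xaced? "
            + (ObjectStreamConstants.STREAM_MAGIC == (short) 0xaced));
      // Header of real serialized data: stream magic followed by stream version.
      System.out.println("Serialized header: " + headerHex(serialize("ACED")));
      // Header of a text file that merely starts with the characters "ACED".
      System.out.println(
         "Text header:       " + headerHex("ACED".getBytes(StandardCharsets.US_ASCII)));
   }
}
```

The serialized data begins with the bytes AC ED (followed by the stream version), whereas the text begins with 41 43 45 44, which is exactly the value reported in that row of the table.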

There are many ways to translate the hexadecimal representation provided in the "StreamCorruptedException: invalid stream header" message to see if it translates to text that means something. If it is text, one knows that he or she is already off to a bad start as a binary serialized file should be used instead of text. The characters in that text can provide a further clue as to what type of text file was being accidentally provided. Here is one way to translate the provided hexadecimal representation to "ASCII" text using Java (available on GitHub):

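/**
 * Translates the hexadecimal text reported in an "invalid stream header"
 * message into its integer and "ASCII" character representations.
 *
 * @param hexInput Hexadecimal digits, two per byte (even length expected).
 * @return The hexadecimal input, the corresponding integers, and the
 *    corresponding "ASCII" characters, separated by " ==> ".
 */
private static String toAscii(final String hexInput)
{
   final int length = hexInput.length();
   final StringBuilder ascii = new StringBuilder();
   final StringBuilder integers = new StringBuilder();
   for (int i = 0; i < length; i+=2)
   {
      // Each pair of hexadecimal digits represents a single byte.
      final String twoDigitHex = hexInput.substring(i, i+2);
      final int integer = Integer.parseInt(twoDigitHex, 16);
      ascii.append((char)integer);
      integers.append(String.format("%03d", integer)).append(" ");
   }
   return hexInput + " ==> " + integers.deleteCharAt(integers.length()-1).toString() + " ==> " + ascii.toString();
}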

Streams of text inadvertently passed to ObjectInputStream's constructor are not the only cause of "StreamCorruptedException: invalid stream header". In fact, any InputStream (text or binary) that doesn't begin with the expected "stream magic" bytes (0xaced) will lead to this exception.

Wednesday, January 2, 2019

Restarting Java's Raw String Literals Discussion

It was announced in December 2018 that raw string literals would be dropped from JDK 12. Now, in the new year, discussion related to the design of raw string literals in Java has begun again.

In the post "Raw string literals -- restarting the discussion" on the amber-spec-experts OpenJDK mailing list, Brian Goetz references the explanation for dropping the raw string literals preview feature from JDK 12 and suggests "restart[ing] the design discussion." Goetz summarizes the previous design discussions and decisions and lessons learned from the first take on raw string literals, discusses some design questions and trade-offs to be made, and then calls for input on three specific types of observation data:

  • "Data that supports or refutes the claim that our primary use cases are embedded JSON, HTML, XML, and SQL."
  • "Use cases we've left out..."
  • "Data (either Java or non-Java) on the use of various flavors of strings (raw, multi-line, etc) in real codebases..."

Jim Laskey posted two messages with the title "Enhancing Java String Literals Round 2" to the same amber-spec-experts mailing list and references an HTML version and a PDF version of an "RTL2" document that aids in the discussion of "Take Two" of raw string literals. Laskey outlines a "series of critical decision points that should be given thought, if not answers, before we propose a new design."

A few of the major decisions to be made as raw string literals for Java are reconsidered are listed here; many more are discussed in the aforementioned posts:

  • Which is really more important to developers: "raw text" or "multi-line strings"?
  • Which character makes for the best delimiter for most Java developers and Java use cases?
  • How should incidental spacing be handled?

There has already been some feedback on the amber-dev OpenJDK mailing list. Stephen Colebourne provides "Extended string literals feedback" and Bruno Borges recommends "special assignment rather [than] special delimiters."

I often see developers complaining about certain language and API decisions after the decisions have been implemented. For anyone with strong feelings about the subject of raw string literals and multi-line strings in Java, now is an opportunity to make one's voice heard and to possibly influence the final design that will come to Java at some point in the future. Discussion has also started on the Java subreddit in two threads: "Raw string literals -- restarting the discussion" and "New RSL Proposal".