Wednesday, June 20, 2018

Java's Ternary is Tricky with Autoboxing/Unboxing

The comments section of the DZone-syndicated version of my post "JDK 8 Versus JDK 10: Ternary/Unboxing Difference" had an interesting discussion regarding the "why" of the "fix" for how Java handles autoboxing/unboxing in conjunction with use of the ternary operator (AKA "conditional operator"). This post expands on that discussion with a few more details.

One of the points made in the discussion is that the logic for how primitives and reference types are handled in a ternary operator when autoboxing or unboxing is required can be less than intuitive. For compelling evidence of this, one only needs to look at the number of bugs written for perceived problems with Java's conditional operator's behavior when autoboxing and unboxing are involved:

  • JDK-6211553 : Unboxing in conditional operator might cause null pointer exception
    • The "EVALUATION" section states, "This is not a bug." It then explains that the observed behavior that motivated the writing of the bug "is very deliberate since it makes the type system compositional." That section also provides an example of a scenario that justifies this.
  • JDK-6303028 : Conditional operator + autoboxing throws NullPointerException
    • The "EVALUATION" section states, "This is not a bug." This section also provides this explanation:
      The type of the conditional operator 
      
      (s == null) ? (Long) null : Long.parseLong(s)
      
      is the primitive type long, not java.lang.Long.
      This follows from the JLS, 3rd ed, page 511:
      
      "Otherwise, binary numeric promotion (5.6.2) is applied to the operand
      types, and the type of the conditional expression is the promoted type of the
      second and third operands. Note that binary numeric promotion performs
      unboxing conversion (5.1.8) and value set conversion (5.1.13)."
      
      In particular, this means that (Long)null is subjected to unboxing conversion.
      This is the source of the null pointer exception.
      
  • JDK-8150614 : conditional operators, null argument only for return purpose, and nullpointerexception
    • The "Comments" section explains "The code is running afoul of the complicated rules for typing of the ?: operator" and references the pertinent section of the Java Language Specification for the current version at time of that writing (https://docs.oracle.com/javase/specs/jls/se8/html/jls-15.html#jls-15.25).
    • I like the explanation on this one as well: "The code in the bug has one branch of the ?: typed as an Integer (with the 'replace' variable") and the other branch typed as an int from Integer.parseInt. In that case, first an unboxing Integer -> int conversion will occur before a boxing to the final result, leading to the NPE. To avoid this, case the result of parseInt to Integer."
    • The "Comments" section concludes, "Closing as not a bug."
  • JDK-6777143 : NullPointerException occured at conditional operator
    • The "EVALUATION" section of this bug report provides interesting explanation with a historical perspective:
      It is because of NPEs that JLS 15.25 says 'Note that binary numeric promotion performs unboxing conversion'. The potential for NullPointerExceptions and OutOfMemoryErrors in 1.5 where they could never have occurred in 1.4 was well known to the JSR 201 Expert Group. It could have made unboxing conversion from the null type infer the target type from the context (and have the unboxed value be the default value for that type), but inference was not common before 1.5 expanded the type system and it's certainly not going to happen now.
  • JDK-6360739 : Tertiary operator throws NPE due to reduntant casting

It's no wonder it's not intuitive to many of us! Section 15.25 ("Conditional Operator ? :") of the Java Language Specification is the defining authority regarding the behavior of the ternary operator with regards to many influences, including autoboxing and unboxing. This is the section referenced in several of the bug reports cited above and in some of the other resources that I referenced in my original post. It's worth noting that this section of the PDF version of the Java SE 10 Language Specification is approximately 9 pages!

In the DZone comments on my original post, Peter Schuetze and Greg Brown reference Table 15.25-D from the Java Language Specification for the most concise explanation of the misbehavior in JDK 8 that was rectified in JDK 10. I agree with them that this table is easier to understand than the accompanying text illustrated by the table. That table shows the type of the overall ternary operation based on the types of the second expression and third expression (where second expression is the expression between the ? and : and the third expression is the expression following the : as shown next):

    first expression ? second expression : third expression

The table's rows represent the type of the second expression and the table's columns represent the type of the third expression. One can find where the types meet in the table to know the overall type of the ternary operation. When one finds the cell at the row for primitive double and the column for reference Double, the cell indicates that the overall type is primitive double. This is why the example shown in my original post should throw a NullPointerException, and why JDK 8 was in violation of the specification when it did not do so.

I sometimes wonder if autoboxing and unboxing are a case of the "cure being worse than the disease." However, I have found autoboxing and unboxing to be less likely to lead to subtle errors if I'm careful about when and how I use those features. A J articulates it well in his comment on the DZone version of my post: "The practical takeaway I got from this article is: when presented with an incomprehensible error, if you see that you are relying on autoboxing in that area of code (i.e., automatic type conversion), do the type conversion yourself manually. Then you will be sure the conversion is being done right."
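The takeaway quoted above can be sketched with the expression cited in JDK-6303028. This is only an illustrative sketch: the class and method names are hypothetical and not from any of the bug reports. The first method mixes a Long operand with a primitive long operand, so the conditional's type is long and the null is unboxed. The second method does the conversion manually with Long.valueOf(String) so that no primitive operand is involved.

```java
// Hypothetical names used for illustration only.
class TernaryUnboxingSketch
{
   /**
    * Mirrors the expression quoted from JDK-6303028. Mixing a Long operand
    * with a primitive long operand gives the conditional the type long,
    * so (Long) null is unboxed when s is null.
    */
   static Long parseWithImplicitUnboxing(final String s)
   {
      return (s == null) ? (Long) null : Long.parseLong(s);
   }

   /**
    * Manual conversion: Long.valueOf(String) returns a Long, so neither
    * operand is primitive and no unboxing takes place.
    */
   static Long parseWithExplicitBoxing(final String s)
   {
      return (s == null) ? null : Long.valueOf(s);
   }

   public static void main(final String[] arguments)
   {
      System.out.println("Explicit boxing: " + parseWithExplicitBoxing(null));
      try
      {
         System.out.println("Implicit unboxing: " + parseWithImplicitUnboxing(null));
      }
      catch (NullPointerException exception)
      {
         System.out.println("Implicit unboxing: " + exception);
      }
   }
}
```

The only difference between the two methods is whether a primitive operand appears in the ternary, which is what decides whether unboxing is applied to the null.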

Saturday, June 16, 2018

JDK 11: Beginning of the End for Java Serialization?

In the blog post "Using Google's Protocol Buffers with Java," I quoted Josh Bloch's Third Edition of Effective Java, in which he wrote, "There is no reason to use Java serialization in any new system you write." Bloch recommends using "cross-platform structured-data representations" instead of Java serialization. The proposed JDK 11 API documentation will include a much stronger statement about use of Java deserialization and this is briefly covered in this post.

The second draft of the "Java SE 11 (18.9) (JSR 384)" specification includes an "A2 Annex" called "API Specification differences" that includes the changes coming to the Javadoc-based documentation for package java.io. The new java.io package documentation will include this high-level warning comment:

Warning: Deserialization of untrusted data is inherently dangerous and should be avoided. Untrusted data should be carefully validated according to the "Serialization and Deserialization" section of the Secure Coding Guidelines for Java SE.

At the time of the writing of this post, the referenced Secure Coding Guidelines for Java SE states that it is currently Version 6.0 and was "Updated for Java SE 9."

The intended package-level documentation for package java.io in JDK 11 will also provide links to the following additional references (but likely to be JDK 11-based references):

The former reference link to the "Java Object Serialization" (JDK 8) document will be removed from java.io's package documentation.

In addition to the java.io package documentation that is being updated in JDK 11 related to the dangers of Java deserialization, the java.io.Serializable interface's Javadoc comment is getting a similar high-level warning message.

These changes to the Javadoc-based documentation in JDK 11 are not surprising given various announcements over the past few years related to Java serialization and deserialization. "RFR 8197595: Serialization javadoc should link to security best practices" specifically spelled out the need to add this documentation. A recent InfoWorld article called "Oracle plans to dump risky Java serialization" and an ADT Magazine article called "Removing Serialization from Java Is a 'Long-Term Goal' at Oracle" quoted Mark Reinhold's statement at Devoxx UK 2018 that adding serialization to Java was a "horrible mistake in 1997."

There has been talk of removing Java serialization before. JEP 154: Remove Serialization was created with the intent to "deprecate, disable, and ultimately remove the Java SE Platform's serialization facility." However, that JEP's status is now "Closed / Withdrawn." Still, as talk of removing Java serialization picks up, it seems prudent to consider alternatives to Java serialization for all new systems, which is precisely what Bloch recommends in Effective Java's Third Edition. All this being stated, Apostolos Giannakidis has written in the blog post "Serialization is dead! Long live serialization!" that "deserialization vulnerabilities are not going away" because "Java's native serialization is not the only flawed serialization technology."

Thursday, June 14, 2018

JDK 8 BigInteger Exact Narrowing Conversion Methods

In the blog post "Exact Conversion of Long to Int in Java," I discussed using Math.toIntExact(Long) to exactly convert a Long to an int or else throw an ArithmeticException if this narrowing conversion is not possible. That method was introduced with JDK 8, which also introduced similar narrowing conversion methods to the BigInteger class. Those BigInteger methods are the topic of this post.

BigInteger had four new "exact" methods added to it in JDK 8:

  • byteValueExact()
  • shortValueExact()
  • intValueExact()
  • longValueExact()

As described above, each of these four "exact" methods added to BigInteger with JDK 8 allows the BigInteger's value to be narrowed to the data type in the method name, if that is possible. Because all of these types (byte, short, int, and long) have smaller ranges than BigInteger, it's possible in any of these cases to have a value in a BigInteger with a magnitude larger than that which can be represented by any of these four types. In such a case, all four of these "exact" methods throw an ArithmeticException rather than quietly "forcing" the bigger value into the smaller representation (which is typically a nonsensical number for most contexts).

Examples of using these methods can be found on GitHub. When those examples are executed, the output looks like this:

===== Byte =====
125 => 125
126 => 126
127 => 127
128 => java.lang.ArithmeticException: BigInteger out of byte range
129 => java.lang.ArithmeticException: BigInteger out of byte range
===== Short =====
32765 => 32765
32766 => 32766
32767 => 32767
32768 => java.lang.ArithmeticException: BigInteger out of short range
32769 => java.lang.ArithmeticException: BigInteger out of short range
===== Int =====
2147483645 => 2147483645
2147483646 => 2147483646
2147483647 => 2147483647
2147483648 => java.lang.ArithmeticException: BigInteger out of int range
2147483649 => java.lang.ArithmeticException: BigInteger out of int range
===== Long =====
9223372036854775805 => 9223372036854775805
9223372036854775806 => 9223372036854775806
9223372036854775807 => 9223372036854775807
9223372036854775808 => java.lang.ArithmeticException: BigInteger out of long range
9223372036854775809 => java.lang.ArithmeticException: BigInteger out of long range
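The GitHub listing is not reproduced here. As a minimal sketch only, with a hypothetical class and method name, the byte portion of output like that shown above could be produced along these lines, using BigInteger.byteValueExact() and reporting the ArithmeticException when the value does not fit:

```java
import java.math.BigInteger;

// Hypothetical names used for illustration only; not the GitHub example.
class BigIntegerExactSketch
{
   /**
    * Describes the result of narrowing the provided BigInteger to a byte
    * with byteValueExact(), or the exception thrown if it does not fit.
    */
   static String describeByteNarrowing(final BigInteger value)
   {
      try
      {
         return value + " => " + value.byteValueExact();
      }
      catch (ArithmeticException exception)
      {
         return value + " => " + exception;
      }
   }

   public static void main(final String[] arguments)
   {
      System.out.println("===== Byte =====");
      for (long candidate = 125; candidate <= 129; candidate++)
      {
         System.out.println(describeByteNarrowing(BigInteger.valueOf(candidate)));
      }
   }
}
```

The same pattern applies to shortValueExact(), intValueExact(), and longValueExact().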

The addition of these "exact" methods to BigInteger with JDK 8 is a welcome one because errors associated with numeric narrowing and overflow can be subtle. It's nice to have an easy way to get an "exact" narrowing or else have the inability to do that narrowing exactly made obvious via an exception.

Tuesday, June 12, 2018

JDK 8 Versus JDK 10: Ternary/Unboxing Difference

A recent Nicolai Parlog (@nipafx) tweet caught my attention because it referenced an interesting StackOverflow discussion on a changed behavior between JDK 8 and JDK 10 and asked "Why?" The issue cited on the StackOverflow thread by SerCe ultimately came down to the implementation being changed between JDK 8 and JDK 10 to correctly implement the Java Language Specification.

The following code listing is (very slightly) adapted from the original example provided by SerCe on the StackOverflow thread.

Adapted Example That Behaves Differently in JDK 10 Versus JDK 8

public static void demoSerCeExample()
{
   try
   {
      final Double doubleValue = false ? 1.0 : new HashMap<String, Double>().get("1");
      out.println("Double Value: " + doubleValue);
   }
   catch (Exception exception)
   {
      out.println("ERROR in 'demoSerCeExample': " + exception);
   }
}

When the above code is compiled and executed with JDK 8, it generates output like this: Double Value: null

When the above code is compiled and executed with JDK 10, it generates output like this: ERROR in 'demoSerCeExample': java.lang.NullPointerException

In JDK 8, the ternary operator returned null for assigning to the local variable doubleValue, but in JDK 10 a NullPointerException is instead thrown for the same ternary statement.

Two tweaks to this example lead to some interesting observations. First, if the literal constant 1.0 expressed in the ternary operator is specified instead as Double.valueOf(1.0), both JDK 8 and JDK 10 set the local variable to null rather than throwing a NullPointerException. Second, if the local variable is declared with primitive type double instead of reference type Double, the NullPointerException is always thrown regardless of Java version and regardless of whether Double.valueOf(double) is used. This second observation makes sense, of course, because no matter how the object or reference is handled by the ternary operator, it must be dereferenced at some point to be assigned to the primitive double type and that will always result in a NullPointerException in the example.

The following table summarizes these observations:

Setting of Local Variable doubleValue

Complete Ternary Statement | JDK 8 | JDK 10
Double doubleValue = false ? 1.0 : new HashMap<String, Double>().get("1"); | null | NullPointerException
double doubleValue = false ? 1.0 : new HashMap<String, Double>().get("1"); | NullPointerException | NullPointerException
Double doubleValue = false ? Double.valueOf(1.0) : new HashMap<String, Double>().get("1"); | null | null
double doubleValue = false ? Double.valueOf(1.0) : new HashMap<String, Double>().get("1"); | NullPointerException | NullPointerException

The only approach that avoids NullPointerException in both versions of Java for this general ternary example is the version that declares the local variable as a reference type Double (no unboxing is forced) and uses Double.valueOf(double) so that reference Double is used throughout the ternary rather than primitive double. If the primitive double is implied by specifying only 1.0, then the Double returned by the Java Map is implicitly unboxed (dereferenced) in JDK 10 and that leads to the exception. According to Brian Goetz, JDK 10 brings the implementation back into compliance with the specification.
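The one combination that avoids the exception can be written as a small self-contained sketch. The class, method, and parameter names here are hypothetical. Both operands of the ternary are of reference type Double, so the conditional's type is Double and the null returned by the Map is passed through without being unboxed.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical names used for illustration only.
class TernaryReferenceOperandsSketch
{
   /**
    * Returns either a boxed default or the Map's value for key "1".
    * Both operands are Double references, so no unboxing is applied.
    */
   static Double valueOrDefault(final boolean useDefault, final Map<String, Double> values)
   {
      return useDefault ? Double.valueOf(1.0) : values.get("1");
   }

   public static void main(final String[] arguments)
   {
      final Map<String, Double> emptyMap = new HashMap<>();
      System.out.println("Double Value: " + valueOrDefault(false, emptyMap));
   }
}
```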

Exact Conversion of Long to Int in Java

With all the shiny things that came with JDK 8 (lambda expressions, streams, Optional, the new Date/Time API, etc.) to distract my attention, I did not pay much attention to the addition of the method Math.toIntExact(). However, this small addition can be pretty useful in its own right.

The Javadoc documentation for Math.toIntExact(long) states, "Returns the value of the long argument; throwing an exception if the value overflows an int." This is particularly useful in situations where one is given or already has a Long and needs to call an API that expects an int. It's best, of course, if the APIs could be changed to use the same datatype, but sometimes this is out of one's control. When one needs to force a Long into an int there is potential for integer overflow because the numeric value of the Long may have a greater magnitude than the int can accurately represent.

If one is told that a given Long will never be larger than what an int can hold, the static method Math.toIntExact(Long) is particularly useful because it will throw an unchecked ArithmeticException if that "exceptional" situation arises, making it obvious that the "exceptional" situation occurred.

When Long.intValue() is used to get an integer from a Long, no exception is thrown if integer overflow occurs. Instead, an integer is provided, but this value will rarely be useful due to the integer overflow. In almost every conceivable case, it's better to encounter a runtime exception that alerts one to the integer overflow than to have the software continue using the overflow number incorrectly.

As a first step in illustrating the differences between Long.intValue() and Math.toIntExact(Long), the following code generates a range of Long values from 5 less than Integer.MAX_VALUE up to, but not including, 5 more than Integer.MAX_VALUE (LongStream.range excludes its upper bound).

Generating Range of Longs that Includes Integer.MAX_VALUE

/**
 * Generate {@code Long}s from range of integers that start
 * before {@code Integer.MAX_VALUE} and end after that
 * maximum integer value.
 *
 * @return {@code Long}s generated over range that includes
 *    {@code Integer.MAX_VALUE}.
 */
public static List<Long> generateLongInts()
{
   final Long maximumIntegerAsLong = Long.valueOf(Integer.MAX_VALUE);
   final Long startingLong = maximumIntegerAsLong - 5;
   final Long endingLong = maximumIntegerAsLong + 5;
   return LongStream.range(startingLong, endingLong).boxed().collect(Collectors.toList());
}

The next code listing shows two methods that demonstrate the two previously mentioned approaches for getting an int from a Long.

Using Long.intValue() and Math.toIntExact(Long)

/**
 * Writes to standard output the {@code int} representation of
 * the provided {@code Long} as returned by that {@code Long}
 * object's {@code intValue()} method.
 *
 * @param longRepresentation {@code Long} for which {@code int}
 *    value extracted with {@code intValue()} will be written;
 *    if {@code null}, the resulting {@code NullPointerException}
 *    is caught and reported in the output instead of propagating.
 */
public static void writeLongIntValue(final Long longRepresentation)
{
   out.print(longRepresentation + " =>       Long.intValue() = ");
   try
   {
      out.println(longRepresentation.intValue());
   }
   catch (Exception exception)
   {
      out.println("ERROR - " + exception);
   }
}

/**
 * Writes to standard output the {@code int} representation of
 * the provided {@code Long} as returned by invoking
 * {@code Math.toIntExact(long)} on that {@code Long}.
 *
 * @param longRepresentation {@code Long} for which {@code int}
 *    value extracted with {@code Math.toIntExact(long)} will be
 *    written; if it is {@code null} or cannot be represented as
 *    an {@code int} without overflow, the resulting
 *    {@code NullPointerException} or {@code ArithmeticException}
 *    is caught and reported in the output instead of propagating.
 */
public static void writeIntExact(final Long longRepresentation)
{
   out.print(longRepresentation + " => Math.toIntExact(Long) = ");
   try
   {
      out.println(Math.toIntExact(longRepresentation));
   }
   catch (Exception exception)
   {
      out.println("ERROR: " + exception);
   }
}

When the above code is executed with the range of Longs constructed in the earlier code listing (full code available on GitHub), the output looks like this:

2147483642 =>       Long.intValue() = 2147483642
2147483642 => Math.toIntExact(Long) = 2147483642
2147483643 =>       Long.intValue() = 2147483643
2147483643 => Math.toIntExact(Long) = 2147483643
2147483644 =>       Long.intValue() = 2147483644
2147483644 => Math.toIntExact(Long) = 2147483644
2147483645 =>       Long.intValue() = 2147483645
2147483645 => Math.toIntExact(Long) = 2147483645
2147483646 =>       Long.intValue() = 2147483646
2147483646 => Math.toIntExact(Long) = 2147483646
2147483647 =>       Long.intValue() = 2147483647
2147483647 => Math.toIntExact(Long) = 2147483647
2147483648 =>       Long.intValue() = -2147483648
2147483648 => Math.toIntExact(Long) = ERROR: java.lang.ArithmeticException: integer overflow
2147483649 =>       Long.intValue() = -2147483647
2147483649 => Math.toIntExact(Long) = ERROR: java.lang.ArithmeticException: integer overflow
2147483650 =>       Long.intValue() = -2147483646
2147483650 => Math.toIntExact(Long) = ERROR: java.lang.ArithmeticException: integer overflow
2147483651 =>       Long.intValue() = -2147483645
2147483651 => Math.toIntExact(Long) = ERROR: java.lang.ArithmeticException: integer overflow

The two rows for 2147483647 show the code processing a Long with value equal to Integer.MAX_VALUE. After that, the Long representing one more than Integer.MAX_VALUE is shown with the results of attempting to convert that Long to an int using Long.intValue() and Math.toIntExact(Long). The Long.intValue() approach encounters an integer overflow, but does not throw an exception and instead returns the negative number -2147483648. The Math.toIntExact(Long) method does not return a value upon integer overflow and instead throws an ArithmeticException with the informative message "integer overflow."

The Math.toIntExact(Long) method is not as significant as many of the features introduced with JDK 8, but it can be useful in avoiding the types of errors related to integer overflow that can sometimes be tricky to diagnose.

Monday, June 11, 2018

Peeking Inside Java Streams with Stream.peek

For a Java developer new to JDK 8-introduced pipelines and streams, the peek(Consumer) method provided by the Stream interface can be a useful tool to help visualize how stream operations behave. Even Java developers who are more familiar with Java streams and aggregation operations may occasionally find Stream.peek(Consumer) useful for understanding the implications and interactions of complex intermediate stream operations.

The Stream.peek(Consumer) method expects a Consumer, which is essentially a block of code that accepts a single argument and returns nothing. The peek(Consumer) method returns the same elements of the stream that were passed to it, so there will be no changes to the contents of the stream unless the block of code passed to the peek(Consumer) method mutates the objects in the stream. It's likely that the vast majority of the uses of Stream.peek(Consumer) are read-only printing of the contents of the objects in the stream at the time of invocation of that method.

The Javadoc-based API documentation for Stream.peek(Consumer) explains this method's behaviors in some detail and provides an example of its usage. That example is slightly adapted in the following code listing:

final List<String> strings
   = Stream.of("one", "two", "three", "four")
      .peek(e -> out.println("Original Element: " + e))
      .filter(e -> e.length() > 3)
      .peek(e -> out.println("Filtered value: " + e))
      .map(String::toUpperCase)
      .peek(e -> out.println("Mapped value: " + e))
      .collect(Collectors.toList());
out.println("Final Results: " + strings);

When the above code is executed, its associated output looks something like this:

Original Element: one
Original Element: two
Original Element: three
Filtered value: three
Mapped value: THREE
Original Element: four
Filtered value: four
Mapped value: FOUR
Final Results: [THREE, FOUR]

The output tells the story of the stream operations' work on the elements provided to them. The first invocation of the intermediate peek operation writes each element in the original stream out to system output with the prefix "Original Element:". The peek operations that occur later are not executed for every original String because each of them occurs after at least one filtering operation has taken place. The interleaved output also shows that, in this example, each element passes through the whole pipeline before the next element is processed.

The peek-enabled output also clearly shows the results of executing the intermediate operation map on each String element to its upper case equivalent. The collect operation is a terminating operation and so no peek is placed after that. Strategic placement of peek operations provides significant insight into the stream processing that takes place.

The Javadoc for Stream.peek(Consumer) states that "this method exists mainly to support debugging, where you want to see the elements as they flow past a certain point in a pipeline." This is exactly what the example and output shown above demonstrate and is likely the most common application of Stream.peek(Consumer).

Stream.peek(Consumer)'s Javadoc documentation starts with this descriptive sentence, "Returns a stream consisting of the elements of this stream, additionally performing the provided action on each element as elements are consumed from the resulting stream." In the previous example, the action performed on each element as it was consumed was to merely write its string representation to standard output. However, the action taken can be anything that can be specified as a Consumer (any code block accepting a single argument and returning nothing). The next example demonstrates how peek(Consumer) can even be used to change contents of objects on the stream.

In the first example in this post, peek(Consumer) could not change the stream elements because those elements were Java Strings, which are immutable. However, if the stream elements are mutable, the Consumer passed to peek(Consumer) can alter the contents of those elements. To illustrate this, I'll use the simple class MutablePerson shown next.

MutablePerson.java

package dustin.examples.jdk8.streams;

/**
 * Represents person whose name can be changed.
 */
public class MutablePerson
{
   private String name;

   public MutablePerson(final String newName)
   {
      name = newName;
   }

   public String getName()
   {
      return name;
   }

   public void setName(final String newName)
   {
      name = newName;
   }

   @Override
   public String toString()
   {
      return name;
   }
}

The next code listing shows how Stream.peek(Consumer) can change the results of the stream operation when the elements in that stream are mutable.

final List<MutablePerson> people
   = Stream.of(
      new MutablePerson("Fred"),
      new MutablePerson("Wilma"),
      new MutablePerson("Barney"),
      new MutablePerson("Betty"))
   .peek(person -> out.println(person))
   .peek(person -> person.setName(person.getName().toUpperCase()))
   .collect(Collectors.toList());
out.println("People: " + people);

When the above code is executed, it produces output that looks like this:

Fred
Wilma
Barney
Betty
People: [FRED, WILMA, BARNEY, BETTY]

This example shows that the Consumer passed to peek did change the case of the people's names to all uppercase. This was only possible because the objects being processed are mutable. Some have argued that using peek to mutate the elements in a stream might be an antipattern and I find myself uncomfortable with this approach (but I also generally don't like having methods' arguments be "output parameters"). The name of the peek method advertises one's just looking (and not touching), but the Consumer argument it accepts advertises that something could be changed (Consumer's Javadoc states, "Unlike most other functional interfaces, Consumer is expected to operate via side-effects"). The blog post "Idiomatic Peeking with Java Stream API" discusses potential issues associated with using Stream.peek(Consumer) with mutating operations.
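For comparison, here is one possible way to get uppercase names without mutating anything inside peek. It is offered only as a sketch with hypothetical class and method names, not as the approach recommended by the cited blog post. Here peek is limited to read-only printing and the transformation is done by map, which produces new String values rather than changing existing objects.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical names used for illustration only.
class PeekWithoutMutationSketch
{
   /**
    * Returns uppercase copies of the provided names; peek only observes
    * each element and map performs the actual transformation.
    */
   static List<String> upperCaseNames(final List<String> names)
   {
      return names.stream()
         .peek(name -> System.out.println("Before mapping: " + name))
         .map(String::toUpperCase)
         .collect(Collectors.toList());
   }

   public static void main(final String[] arguments)
   {
      final List<String> names = Arrays.asList("Fred", "Wilma", "Barney", "Betty");
      System.out.println("People: " + upperCaseNames(names));
   }
}
```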

Stream.peek(Consumer) is a useful tool for understanding how stream operations are impacting elements.

Saturday, June 9, 2018

[JDK 11] Class Loader Hierarchy Details Coming to jcmd

I've been a fan of the diagnostic command-line tool jcmd since hearing about jcmd at JavaOne 2012. I've used this tool extensively since then and have blogged multiple times about this tool:

After numerous years of developing with Java, it's my opinion that the classloader is the source of some of the most difficult defects encountered during development and debugging. Given this observation and given my interest in jcmd, I am very interested in JDK-8203682 ["Add jcmd 'VM.classloaders' command to print out class loader hierarchy, details"].

The "Description" for JDK-8203682 states, "It would be helpful, as a complement to VM.classloader_stats, to have a command to print out the class loader hierarchy and class loader details." In other words, this command to be added to jcmd would display classloaders in a hierarchical fashion similar to the way in which classes are displayed by jcmd's VM.class_hierarchy command.

JDK-8203682 shows its "Status" as "Resolved" and its "Fix Version" as "11". JDK-8203682 contains three text file attachments that depict the output of jcmd <pid> VM.classloaders: example-with-classes.txt, example-with-classes-verbose.txt, and example-with-reflection-and-noinflation.txt. Additional information is available in the announcement of the change set and in the change set itself.

When dealing with classloader-related issues in Java, any details can be helpful. The addition of the VM.classloaders command to jcmd will make this command-line tool even more valuable and insightful.

Thread Methods destroy() and stop(Throwable) Removed in JDK 11

The message "RFR(s): 8204243: remove Thread.destroy() and Thread.stop(Throwable)" by @DrDeprecator (Stuart Marks) on the core-libs-dev OpenJDK mailing list is a request for review (RFR) of a change set associated with JDK-8204243 ["remove Thread.destroy() and Thread.stop(Throwable)"]. Both the bug report and the mailing list message describe the history of these two referenced Thread methods and explain that neither method really does anything useful.

The JDK 10 Javadoc API documentation for java.lang.Thread shows six methods on the Thread class that are deprecated, three of which are explicitly marked for removal. The table below summarizes these deprecated Thread methods.

Methods Deprecated in java.lang.Thread as of JDK 10

Method | Deprecated Since | For Removal? | JDK 10 Status
countStackFrames() | 1.2 | Yes | Depends on deprecated suspend()
destroy() | 1.5 | Yes | Throws NoSuchMethodError since inception (never implemented)
resume() | 1.2 | No | "Exists solely for use with suspend()"
stop() | 1.2 | No | "This method is inherently unsafe."
stop(Throwable) | 1.2 | Yes | Throws UnsupportedOperationException since JDK 8
suspend() | 1.2 | No | "This method ... is inherently deadlock-prone."

It now appears that two of the three Thread methods that are deprecated and marked for removal will be removed with JDK 11. Both methods Thread.destroy() and Thread.stop(Throwable) should be completely removed as of JDK 11. The destroy() method has never done anything except throw the NoSuchMethodError and the stop(Throwable) method hasn't done anything except throw UnsupportedOperationException since JDK 8. Good riddance to these methods!

Thursday, June 7, 2018

JDK 9/10/11: Side Effects from += on Java String

The question "Why does `array[i++%n] += i+" "` give different results in Java 8 and Java 10?" was posted earlier this week on StackOverflow.com. It points to a bug in the Java compiler that is present in JDK 9 and later, but is not present in JDK 8.

As explained on the StackOverflow thread, Didier L provided a simple example of Java code that reproduces this issue. That is adapted in the code listing shown next.

package dustin.examples.strings;

import static java.lang.System.out;

/**
 * Example demonstrating JDK-8204322 and adapted from Didier L's
 * original example (https://stackoverflow.com/q/50683786).
 */
public class StringConcatenationBug
{
   static void didierLDemonstration()
   {
      final String[] array = {""};
      array[generateArrayIndex()] += "a";
   }

   static int generateArrayIndex()
   {
      out.println("Array Index Evaluated");
      return 0;
   }

   public static void main(final String[] arguments)
   {
      didierLDemonstration();
   }
}

Reading the code shown above, one would expect to see the string "Array Index Evaluated" displayed once when this class's main(String[]) method is executed. With JDK 8, that was the case, but since JDK 9, it has not been the case. The next screen snapshot demonstrates this. It shows that when the class is compiled with javac's -source and -target flags set to "8", the string is shown only once when the compiled class is executed. However, when javac's -source and -target flags are set to "9", the string is shown twice when the compiled class is executed.

This bug exists in JDK 9, JDK 10, and JDK 11. Olivier Grégoire has described this bug, "The issue seems to be limited to the string concatenation and assignment operator (+=) with an expression with side effect(s) as the left operand."

JDK-8204322 ["'+=' applied to String operands can provoke side effects"] has been written for this bug, has been resolved, and its resolution is targeted currently for JDK 11. The bug report describes the problem, "When using the += operator, it seems that javac duplicates the code before the +=." It also explains that code written like array[i++%n] += i + " "; is compiled effectively to code like array[i++%n] = array[i++%n] + i + " ";. Jan Lahoda's comment on the bug describes why it occurs. Aleksey Shipilev has requested that this fix be backported to JDK 10 and it appears that it will be via JDK-8204340.
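The difference between the two expansions can be made visible without depending on a particular compiler version by writing both out by hand. The following is only a sketch: the class and method names are hypothetical, and the "faulty" method simply mimics the effective expansion quoted in the bug report rather than reproducing javac's actual generated code.

```java
import static java.lang.System.out;

/**
 * Hand-written versions of the two expansions of
 * array[index()] += "a" discussed for JDK-8204322.
 * (Class and method names are hypothetical.)
 */
class CompoundAssignmentExpansion
{
   /** Counts how often the index expression is evaluated. */
   static int indexEvaluations = 0;

   static int index()
   {
      indexEvaluations++;
      return 0;
   }

   /** Expected semantics: the index expression is evaluated once. */
   static String expectedExpansion()
   {
      final String[] array = {""};
      final int i = index();
      array[i] = array[i] + "a";
      return array[0];
   }

   /** Mimics the faulty expansion: the index expression appears twice. */
   static String faultyExpansion()
   {
      final String[] array = {""};
      array[index()] = array[index()] + "a";
      return array[0];
   }

   public static void main(final String[] arguments)
   {
      indexEvaluations = 0;
      expectedExpansion();
      out.println("Expected expansion evaluations: " + indexEvaluations); // 1

      indexEvaluations = 0;
      faultyExpansion();
      out.println("Faulty expansion evaluations: " + indexEvaluations);   // 2
   }
}
```

Because both expansions are spelled out explicitly, this sketch behaves the same on any JDK; only the original += form is sensitive to the compiler bug.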

Additional background information regarding this bug can be found in the previously mentioned StackOverflow thread, in the related StackOverflow chat, and on the OpenJDK compiler-dev mailing list threads "Compiler bug about string concatenation" and "RFR: 8204322: '+=' applied to String operands can provoke side effects".

Wednesday, May 30, 2018

JEP 181, JEP 315, and JEP 333 Proposed to Target JDK 11

Three more JDK Enhancement Proposals were proposed for targeting JDK 11 today. The three JEPs are JEP 181 ["JEP 181: Nest-Based Access Control"], JEP 315 ["JEP 315: Improve Aarch64 Intrinsics"], and JEP 333 ["JEP 333: ZGC: A Scalable Low-Latency Garbage Collector (Experimental)"]. The targeting of these three JEPs for JDK 11 was announced today on the OpenJDK jdk-dev mailing list in the respective posts "JEP proposed to target JDK 11: 181: Nest-Based Access Control", "JEP proposed to target JDK 11: 315: Improve Aarch64 Intrinsics", and "JEP proposed to target JDK 11: 333: ZGC: A Scalable Low-Latency Garbage Collector (Experimental)". Each JEP proposal is open for roughly one week and, if there are no objections, will be targeted officially to JDK 11 next week.

The purpose of JEP 181 is to "introduce nests" and the JEP's "Summary" states that "nests" are significant because they "allow classes that are logically part of the same code entity, but which are compiled to distinct class files, to access each other's private members without the need for compilers to insert accessibility-broadening bridge methods." A nice overview of JEP 181 can be found in the article "Java Nestmates Makes Progress," in which author Ben Evans describes this JEP as "a technical enhancement to the platform that pays off a 20 year old architectural debt introduced in Java 1.1."
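The kind of code that nests are meant to serve can be sketched with an enclosing class and a nested class that read each other's private members. The class names below are hypothetical, and the two reflection calls (Class.getNestHost() and Class.isNestmateOf(Class)) assume a JDK in which JEP 181 has been delivered (JDK 11 or later) and a class compiled for that release.

```java
/**
 * An enclosing class and a nested class compile to separate class
 * files yet can reach each other's private members.
 * (Class and member names are hypothetical.)
 */
class NestExample
{
   private static String outerSecret = "outer";

   static class Inner
   {
      private static String innerSecret = "inner";

      static String readOuter()
      {
         return outerSecret;   // private member of the enclosing class
      }
   }

   static String readInner()
   {
      return Inner.innerSecret;   // private member of the nested class
   }

   public static void main(final String[] arguments)
   {
      System.out.println(Inner.readOuter() + " / " + readInner());
      // Reflection methods on java.lang.Class assumed available (JDK 11+).
      System.out.println(Inner.class.getNestHost() == NestExample.class);
      System.out.println(NestExample.class.isNestmateOf(Inner.class));
   }
}
```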

The purpose of JEP 315, according to its "Summary," is to "improve the existing string and array intrinsics, and implement new intrinsics for the java.lang.Math sin, cos and log functions, on AArch64 processors."

The experimental JEP 333 introduces the "Z Garbage Collector," which the JEP's "Summary" section states is "also known as ZGC" and "is a scalable low-latency garbage collector." I like the "At a glance" sentence in the "Description" section that explains, "ZGC is a concurrent, single-generation, region-based, NUMA-aware, compacting collector. Stop-the-world phases are limited to root scanning, so GC pause times do not increase with the size of the heap or the live set." The JEP also explicitly states, "It is not a goal to provide working implementations for platforms other than Linux/x64." Important limitations of the initial version of ZGC are spelled out in the aptly named "Limitations" section which states, "The initial experimental version of ZGC will not have support for class unloading." Another limitation listed in that same section is, "ZGC will initially not have support for JVMCI (i.e. Graal)."

With today's announcements, there are now four JEPs awaiting approval to be targeted for JDK 11. Besides the three covered in this post, JEP 330 had its review process for targeting JDK 11 extended to this week. Nine JEPs have already been targeted to JDK 11 as of this writing.

Tuesday, May 29, 2018

Deferred Execution with Java's Supplier

In the third chapter of his 2014 book "Java SE 8 for the Really Impatient: Programming with Lambdas," Cay Horstmann writes, "The point of all lambdas is deferred execution." He adds, "After all, if you wanted to execute some code right now, you'd do that, without wrapping it inside a lambda." Horstmann then lists several examples of the "many reasons for executing code later" and his last listed reason is "Running the code only when necessary." In this post, I look at some examples of this from the JDK that use Supplier to do exactly that: to execute a "code block" represented as a lambda expression "only when necessary."

When JDK 8 introduced lambda expressions, it also introduced several standard functional interfaces in its java.util.function package. That package's Javadoc documentation states, "Functional interfaces provide target types for lambda expressions and method references." The package contains standard functional interfaces such as Predicate (accepts single argument and represents boolean), Function (accepts single argument and provides single result), Consumer (accepts single argument and does not provide a result), and Supplier (accepts no arguments and represents/supplies single result). [As a side note, the blog post "Java 8 Friday: The Dark Side of Java 8" provides a highly useful "decision tree to [determine] ... the [standard functional interface] you're looking for" when trying to determine which standard functional interface to use (or to roll your own).]

In this post, I'll be focusing on JDK uses of Supplier, implying that the examples covered here are based on JDK-provided methods whose "target types" accept no arguments and supply a single result. These examples will only invoke the single get() method associated with the provided Supplier "when necessary." Note that all JDK uses of Supplier can be found in Supplier's Javadoc-generated HTML representation by clicking on the "Use" link.

Deferring Potential Expensive Operations Until Known to Be Necessary to Log Them

A nice benefit achieved with Supplier-provided deferred execution (sometimes also called "lazy evaluation") is the deferral of an expensive operation for generating a log message until it is known that the result of that expensive operation will actually be logged. In other words, the expensive operation will only be invoked when it is known that it is necessary to do so because the result will be logged. I have blogged about this before in the posts "Java 8 Deferred Invocation with Java Util Logging" and "Better Performing Non-Logging Logger Calls in Log4j2."

Before the modern Java logging libraries and frameworks featured these Supplier-accepting APIs, the common way to avoid incurring expensive operations whose results would not be logged anyway was to use API-based "guard" statements, but these can clutter the code and reduce readability. There are now numerous resources on how to use Supplier-based logging framework APIs to ensure that an expensive operation that might potentially be logged is not actually executed until it is known to be necessary. These resources include my two just-mentioned posts on java.util.logging and Log4j 2 as well as "Writing clean logging code using Java 8 lambdas," "Lazy logging in Java 8," and "Log4j 2 and Lambda Expressions."

The JDK supports this Supplier-powered deferred execution for logging in both its java.lang.System.Logger "log" methods and in numerous java.util.logging.Logger methods.
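A small sketch with java.util.logging shows the effect. The logger name, class name, and "expensive" method here are hypothetical stand-ins; the only JDK API relied upon is Logger.fine(Supplier<String>), which checks the logger's level before invoking the Supplier.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/**
 * Counts how often an "expensive" message is built when passed to
 * java.util.logging as a Supplier. (Names are hypothetical.)
 */
class DeferredLoggingExample
{
   static int expensiveCalls = 0;

   /** Stand-in for a costly message-building operation. */
   static String expensiveMessage()
   {
      expensiveCalls++;
      return "expensive details";
   }

   /** Returns how many times the Supplier ran for a FINE-level call. */
   static int callsWhenLoggerLevelIs(final Level loggerLevel)
   {
      final Logger logger = Logger.getLogger("dustin.examples.deferred");
      logger.setUseParentHandlers(false);   // keep the console quiet
      logger.setLevel(loggerLevel);
      expensiveCalls = 0;
      logger.fine(() -> expensiveMessage());
      return expensiveCalls;
   }

   public static void main(final String[] arguments)
   {
      System.out.println("Logger at INFO: " + callsWhenLoggerLevelIs(Level.INFO)); // 0
      System.out.println("Logger at FINE: " + callsWhenLoggerLevelIs(Level.FINE)); // 1
   }
}
```

No explicit guard statement is needed at the call site; the level check happens inside the logging call before the Supplier's get() is invoked.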

Deferring Alternative Calculation for Optional Until Known to Be Necessary

The JDK 8-introduced Optional class provides several methods that accept a Supplier parameter. In all cases, the intent is that the alternative result that the Supplier can provide will only be calculated when it's known to be necessary to calculate it. In other words, the Supplier.get() is only invoked if the Optional is "empty." If the Optional is not empty, these methods return the value held in the Optional (or, in the case of or(...), an Optional describing it) and won't invoke the Supplier's potentially costly operation. The following list shows the methods on Optional that accept a Supplier as of JDK 10:

  • Optional.or(Supplier<? extends Optional<? extends T>>) (added in JDK 9)
  • Optional.orElseGet(Supplier<? extends T>)
  • Optional.orElseThrow(Supplier<? extends X>)
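As a minimal sketch of this behavior, the following uses Optional.orElseGet with a counter to show when the Supplier actually runs. The class name and the "expensive" fallback method are hypothetical.

```java
import java.util.Optional;

/**
 * Shows that Optional.orElseGet only invokes its Supplier when the
 * Optional is empty. (Names are hypothetical.)
 */
class OptionalSupplierExample
{
   static int fallbackCalls = 0;

   /** Stand-in for a costly alternative calculation. */
   static String expensiveFallback()
   {
      fallbackCalls++;
      return "fallback";
   }

   static String resolve(final Optional<String> candidate)
   {
      return candidate.orElseGet(OptionalSupplierExample::expensiveFallback);
   }

   public static void main(final String[] arguments)
   {
      fallbackCalls = 0;
      System.out.println(resolve(Optional.of("value")) + " after " + fallbackCalls + " fallback calls");
      // prints "value after 0 fallback calls"
      System.out.println(resolve(Optional.empty()) + " after " + fallbackCalls + " fallback calls");
      // prints "fallback after 1 fallback calls"
   }
}
```

Had orElse(expensiveFallback()) been used instead, the fallback would be computed eagerly in both cases because its argument is evaluated before the call.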

Deferring Handling of Undesired null Values Until Known to Be Necessary

The Objects class provides two Supplier-powered methods that allow for an unwanted null (typically on a method parameter) to be detected and an appropriate response supplied only when it is determined that a null was indeed encountered. I wrote about one of these methods, Objects.requireNonNullElseGet (added with JDK 9), in my blog post "JDK 9: NotNullOrElse Methods Added to Objects Class." This method will only execute the provided Supplier by calling its get() method when it is determined that the first parameter passed to the method is indeed null. If the first parameter is not null, the Supplier is never executed.

The second Supplier-accepting method on Objects, added with JDK 8, is requireNonNull(T obj, Supplier<String>). This method allows a Supplier to supply a String to be used in a "customized NullPointerException" thrown when the first parameter is null. If the first parameter is not null, the Supplier's get() never needs to be invoked because an exception does not need to be thrown. A very similar method exists on the Objects class that accepts a String instead of a Supplier. This requireNonNull(T, String) method may be preferable in some cases where the cost of the String generation is deemed less than the cost of generating the supplier. If, for example, the "custom string" is a string literal, it might be preferable to use the form that accepts that String directly instead of the one expecting a Supplier, especially if the null case is expected to be common.
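Both Objects methods can be sketched together as follows. The class name, method names, and message text are hypothetical; the sketch assumes JDK 9 or later so that Objects.requireNonNullElseGet is available.

```java
import java.util.Objects;

/**
 * Sketch of the two Supplier-accepting methods on java.util.Objects.
 * (Class, method names, and messages are hypothetical.)
 */
class ObjectsSupplierExample
{
   static int supplierCalls = 0;

   /** Stand-in for a costly default-value calculation. */
   static String defaultName()
   {
      supplierCalls++;
      return "anonymous";
   }

   /** The default is only computed when name is null. */
   static String nameOrDefault(final String name)
   {
      return Objects.requireNonNullElseGet(name, ObjectsSupplierExample::defaultName);
   }

   /** The exception message is only built when name is null. */
   static String requireName(final String name)
   {
      return Objects.requireNonNull(
         name, () -> "name must not be null (checked at " + System.nanoTime() + ")");
   }

   public static void main(final String[] arguments)
   {
      System.out.println(nameOrDefault("Dustin") + " / supplier calls: " + supplierCalls);
      System.out.println(nameOrDefault(null) + " / supplier calls: " + supplierCalls);
      try
      {
         requireName(null);
      }
      catch (NullPointerException npe)
      {
         System.out.println("Caught: " + npe.getMessage());
      }
   }
}
```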

Other JDK Uses of Supplier for Deferred Execution

The JDK examples of using Supplier to defer execution shown in this post were selected because of their relatively frequent use and because they are straightforward examples of how Supplier enables deferred execution or lazy evaluation in cases where no argument needs to be provided and one result needs to be supplied. Other uses of Supplier for deferred execution can be found in the java.util.concurrent concurrency package and in the JDK 8 streams-supporting java.util.stream package. Perhaps the best way to quickly identify the JDK's use of Supplier and its primitive-oriented counterparts is to look at the Javadoc-generated HTML representation of each class's "uses": Uses of Supplier, Uses of DoubleSupplier, Uses of IntSupplier, and Uses of LongSupplier. Interestingly, but not too surprisingly, there are no uses of BooleanSupplier.

The JDK-provided examples of using Supplier discussed in this post mostly demonstrate Horstmann's last listed possible reason for deferring code execution to a later point: "Running the code only when necessary." Other JDK examples of using Supplier demonstrate some of Horstmann's other listed reasons that one might want to "[execute] code later."

Suppliers Are Not Limited to the JDK

Although this post has focused on JDK use of Supplier to defer execution, its use is not limited to the JDK. As discussed earlier, Log4j 2 and other logging frameworks besides java.util.logging also provide Supplier-based logging APIs to allow deferral of construction of expensive strings for logging until it's known that the string will actually be logged. Similarly, one can use Suppliers in one's own code whenever the need arises to specify code to be executed only when necessary, provided that the code does not require an argument passed to it and will return (supply) a single result.
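One way this might look in one's own code is a small holder that accepts a Supplier and only invokes it the first time the value is requested. This is a sketch under stated assumptions: LazyValue is a hypothetical class, not a JDK type, and it is deliberately not thread-safe.

```java
import java.util.function.Supplier;

/**
 * Hypothetical holder that defers a calculation until first use
 * and then reuses the result. Not thread-safe.
 */
class LazyValue<T>
{
   private final Supplier<T> supplier;
   private T value;
   private boolean computed = false;

   LazyValue(final Supplier<T> supplier)
   {
      this.supplier = supplier;
   }

   /** Invokes the Supplier only on the first call. */
   T get()
   {
      if (!computed)
      {
         value = supplier.get();
         computed = true;
      }
      return value;
   }

   public static void main(final String[] arguments)
   {
      final LazyValue<String> lazy = new LazyValue<>(() ->
      {
         System.out.println("Computing...");
         return "computed";
      });
      System.out.println("Nothing computed yet");
      System.out.println(lazy.get());   // triggers "Computing..."
      System.out.println(lazy.get());   // reuses the stored result
   }
}
```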

Monday, May 28, 2018

Shebang Coming to Java?

Although it was never a central goal of JEP 330 ["Launch Single-File Source-Code Programs"] to add support for the Unix-style shebang (#!), issues surrounding the potential ability of JEP 330 "single file source programs" to support a Unix-style shebang have generated significant discussion on the OpenJDK jdk-dev mailing list. This "vigorous discussion" has led to Mark Reinhold adding a week to the review period (now ends on May 31) for JEP 330 to allow for further discussion regarding targeting JEP 330 for JDK 11.

Although there are still some disagreements about whether shebang support should be added at all, it does seem that consensus is shifting to a proposal to explicitly differentiate between regular platform-independent Java source code files (those that end with extension .java) and the new JEP 330 "executable" platform-specific "single-file source-code programs". The explicit distinction is noteworthy because it would allow for shebang to be expressed in the latter (JEP 330 executable platform-specific single-file source-code programs) and not be used in the former (traditional Java platform-independent source code we're all accustomed to).

A Jonathan Gibbons message in this discussion spells out "various reasons to not want to change JLS or javac", points out that "shebang scripts are an executable format defined on some, but not all, platforms," points out that "creating a shebang script is typically more than just adding an initial first line to a file," and articulates the concept of differentiating explicitly between traditional Java source code and JEP 330 executable Java scripts:

While renaming the file to a command-friendly name is optional, it is also expected to be common practice. For example, a source file named `HelloWorld.java` might be installed as `helloworld`. And, while the JEP describes use cases for executing a small single-file program with `java HelloWorld.java` or executing it as a platform-specific shebang script with just `helloworld`, it does not seem like there is a common use case to execute `HelloWorld.java`. So, if the shebang script is typically renamed to a command-friendly name, it will not be possible to compile it directly, with "javac helloworld", because that is not a valid command line for javac. This reduces any potential convenience of having javac ignore shebang lines.

Since Java source files are different artifacts to platform-specific executable scripts, it makes sense to treat them differently, and since we do not want to change the Java language to support shebang lines, the suggestion is to amend the JEP and implementation so that shebang lines are never stripped from Java source files, i.e. files ending in `.java`. This avoids the problem of having the ecosystem of tools handling Java source files having to deal with arbitrary artifacts like shebang lines. The change would still permit the direct execution of Java source files, such as `java HelloWorld.java`, and the execution of shebang scripts, such as `helloworld`.

The following table summarizes characteristics and advantages associated with each style of "Java" file.

Item | Traditional Java Source Files | JEP 330 Executable Single-File Source-Code Programs
Descriptions/Names | "Java source files (which end with a .java extension)" | "executable scripts (which do not use [.java] extension.)"
 | "Java source files" | "shebang scripts"
 | "Java source file" | "script that contains Java code" or "platform-specific executable script"
 | "Java source files, as identified by a filename ending in '.java'" |
Shebang | Not Supported | Supported
Platform | Independent | Dependent
Explicit Compilation | Yes | No

Jonathan Gibbons summarizes the intent of JEP 330: "The general theme here is not to evolve Java into a scripting language, but to make tools like the Java launcher more friendly to supporting the use of Java source code in an executable text file, in order to reduce the ceremony of running simple programs."

The discussion has also covered alternative approaches such as binfmt_misc, Unix-style "here documents", "support for '-' STDIN source in java launcher", and Linux being changed to support "la-la-bang" (//!).

Another interesting side note from this discussion is Brian Goetz's "retrace" of how JEP 330 got to its current state. He talks about the "countless hours listening to people's concerns about Java" that led to this realization, "A general theme that people have expressed concern over is 'activation energy'; that doing simple things in Java requires too much fixed work." Goetz points out that JShell and JEP 330 are two of many possible ways of addressing this and that these two approaches were selected from among the many after making "subjective choices about which had the best impact" with consideration of "cost (in multiple dimensions) and benefit (or our subjective estimates of the benefits) when making these choices."

So, "regular Java" source code files will not be getting shebang support, but that is not a big deal as they don't really need them. It is looking likely, however, that JEP 330-based platform-dependent executable single-file scripts written in Java will support an optional shebang on the first line. We may know by Thursday of this week whether JEP 330 will be targeted for JDK 11.

Saturday, May 26, 2018

Java's String.format Can Be Statically Imported

JDK-8203630 ["Add instance method equivalents for String::format"] postulates that "the argument for implementing String::format as static appears to be that the format methods could be imported statically and thus conduct themselves comparably to C's sprintf." On a StackOverflow.com thread on the subject, Brandon Yarbrough writes, "by making the method static you can use format in a way that's very familiar and clean-looking to C programmers used to printf()." Yarbrough provides a code example and then concludes, "By using static imports, printfs look almost exactly like they do in C. Awesome!"

When I read in JDK-8203630 about this, I wondered to myself why I had not statically imported String.format when I've used it because it seems obvious to me now to do that. In this post, I look briefly at some personal theories I have considered to explain why I (and many others apparently) have not thought to statically import String.format consistently.

When static imports were introduced with J2SE 5, the new documentation on the feature presented the question, "So when should you use static import?" It answered its own question with an emphasized (I did NOT add the bold), "Very sparingly!" That paragraph then goes on to provide more details about appropriate and inappropriate uses of static imports and the negative consequences of overuse of static imports.

Although the original documentation warned emphatically about the overuse of static imports, their use did seem to increase gradually as developers became more used to them. In 2012, I asked, via blog post, "Are Static Imports Becoming Increasingly Accepted in Java?" I felt at that time that they were becoming increasingly accepted, especially when used in unit testing contexts and in more modern libraries and frameworks focusing on providing "fluent" APIs. Still, somehow, I did not think to consistently apply static imports to my uses of String.format.

I don't use String.format very often, so I thought that perhaps I just didn't get many opportunities to think about this. But, even in my relatively few uses of it, I don't recall ever importing it statically. As I've thought about this more, I've realized that the primary reason I probably don't think about statically importing String.format is the same reason that most developers have not thought about it: most of the popular and readily available online examples of how to use String.format do not use static imports!

When writing a blog or article covering a feature, especially if it's at an introductory level, it can be useful to NOT do things like import statically because the explicitly spelling out of the class name can improve the developer's ability to understand where the methods in the code come from. However, this also means that if a given developer reads numerous articles and posts and none of them show use of static imports, it is easy for that developer to use the API as shown in all those examples without thinking about the possibility of statically importing.

The following are some introductory posts regarding use of String.format. At the time of this writing, they do not demonstrate use of String.format via static import. I want to emphasize that this does not take away from the quality of these resources; in fact, some of them are excellent. This is instead intended as evidence explaining why String.format seems to be seldom statically imported in Java code.

Many of the examples in the above posts use String.format() to generate a String that is assigned to a local variable. In this context, the static import is arguably less valuable than when it is used to format a String within a greater line of code. For example, it is more "fluent" to statically import String.format() so that simply format() can be specified when that formatting takes place in a line of code doing other things beyond simply assigning the formatted string to a local variable.
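The difference in "fluency" can be seen in a short sketch that formats a String inside a larger expression. The class and method names are hypothetical; the only thing demonstrated is the static import of String.format.

```java
import static java.lang.String.format;
import static java.lang.System.out;

/**
 * Compares a statically imported format(...) call with the
 * class-qualified form. (Names are hypothetical.)
 */
class StaticImportFormatExample
{
   /** Uses the statically imported method. */
   static String withStaticImport(final String name, final int count)
   {
      return "[" + format("%s has %d messages", name, count) + "]";
   }

   /** Same result with the class name spelled out. */
   static String withQualifiedCall(final String name, final int count)
   {
      return "[" + String.format("%s has %d messages", name, count) + "]";
   }

   public static void main(final String[] arguments)
   {
      out.println(withStaticImport("Dustin", 3));    // [Dustin has 3 messages]
      out.println(withQualifiedCall("Dustin", 3));   // [Dustin has 3 messages]
   }
}
```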

The main purpose of this blog post is to point out/remind us that we can statically import String.format when doing so makes our code more readable. However, there were some other interesting points made in the short discussion on the OpenJDK core-libs-dev mailing list on this subject that I'll briefly point out here:

  • JDK-8203630 points out how an instance method might make for arguably more readable code in some cases with this example: "This result is %d".format(result);
  • Rémi Forax points out some arguments against adding an instance format method to String:
    • Issues associated with static and instance methods sharing the same name in a class.
      • John Rose adds, "Refactoring static as non-static methods, or vice versa, is a very reasonable design move, but the language makes it hard to do this and retain backward compatibility."
    • Relative slowness of Java's current string interpolation capabilities provided by String.format
    • Potential of StringConcatFactory for future faster Java string interpolation (see "String concatenation in Java 9 (part 1): Untangling invokeDynamic" for more details on StringConcatFactory).

Whether or not instance format methods come to Java's String, reading about JDK-8203444, JDK-8203630, and the associated mailing list discussion has provided me with some things to think about. If nothing else, I'll definitely be more apt to weigh String.format's performance when considering using it and will be more likely to statically import it when I do use it.

Tuesday, May 22, 2018

JEP 329 and JEP 330 Proposed for JDK 11

This past week, two Mark Reinhold messages (here and here) on the OpenJDK jdk-dev mailing list proposed two new JEPs for inclusion with JDK 11: JEP 329 ["ChaCha20 and Poly1305 Cryptographic Algorithms"] and JEP 330 ["Launch Single-File Source-Code Programs"]. I am excited about JEP 330, but that enthusiasm led me to blog on it when it was but a mere "draft" JEP (not even assigned to the 330 number at that point). The focus of the remainder of this post will therefore be on JEP 329.

The intent of JEP 329 is succinctly described in the JEP's "Summary" section: "Implement the ChaCha20 and ChaCha20-Poly1305 ciphers as specified in RFC 7539." That same "Summary" section also states, "ChaCha20 is a relatively new stream cipher that can replace the older, insecure RC4 stream cipher."

The RC4 (Rivest Cipher 4) stream cipher has already been disabled in major web browsers (early 2016) due to security risks.

The "Motivation" section of JEP 329 currently states:

The only other widely adopted stream cipher, RC4, has long been deemed insecure. The industry consensus is that ChaCha20-Poly1305 is secure at this point in time, and it has seen fairly wide adoption across TLS implementations as well as in other cryptographic protocols. The JDK needs to be on par with other cryptographic toolkits and TLS implementations.

It is worth noting this important caveat mentioned in JEP 329's "Non-Goals" section: "TLS cipher suite support will not be part of this JEP. TLS support for these ciphers will be part of a follow-on enhancement." For additional details, see JDK-8140466 : ChaCha20 and Poly1305 Cipher Suites.

The "Dependencies" section of JEP 329 states that its only dependency is on the "constant-time math APIs" embodied in JEP 324 (see my previous post for additional overview details).

JDK-8198925 : ChaCha20 and ChaCha20-Poly1305 Cipher Implementations provides additional and even lower-level details than JEP 329. For example, it provides the specification of the new class javax.crypto.spec.ChaCha20ParameterSpec and its methods.
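For a sense of how ChaCha20ParameterSpec might be used, here is a minimal round-trip sketch. It assumes a JDK in which JEP 329 has been delivered (JDK 11 or later) and that the "ChaCha20" algorithm name is accepted by both Cipher and KeyGenerator; the class name, the initial counter value of 1, and the sample text are arbitrary choices for illustration, and plain ChaCha20 (without Poly1305) provides no authentication.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.ChaCha20ParameterSpec;

/**
 * Encrypts and decrypts a String with the ChaCha20 stream cipher.
 * (Class name and parameter choices are hypothetical.)
 */
class ChaCha20RoundTrip
{
   /** Runs one cipher operation with a fresh Cipher instance. */
   static byte[] apply(
      final int mode, final SecretKey key, final byte[] nonce, final byte[] input)
      throws GeneralSecurityException
   {
      final Cipher cipher = Cipher.getInstance("ChaCha20");
      // 12-byte nonce and an initial counter value (1 is an arbitrary choice).
      cipher.init(mode, key, new ChaCha20ParameterSpec(nonce, 1));
      return cipher.doFinal(input);
   }

   /** Encrypts then decrypts the text; returns the decrypted result. */
   static String roundTrip(final String text)
   {
      try
      {
         final SecretKey key = KeyGenerator.getInstance("ChaCha20").generateKey();
         final byte[] nonce = new byte[12];
         new SecureRandom().nextBytes(nonce);   // never reuse a nonce with the same key
         final byte[] encrypted = apply(
            Cipher.ENCRYPT_MODE, key, nonce, text.getBytes(StandardCharsets.UTF_8));
         final byte[] decrypted = apply(Cipher.DECRYPT_MODE, key, nonce, encrypted);
         return new String(decrypted, StandardCharsets.UTF_8);
      }
      catch (GeneralSecurityException exception)
      {
         throw new IllegalStateException("ChaCha20 not available", exception);
      }
   }

   public static void main(final String[] arguments)
   {
      System.out.println(roundTrip("Inspired by Actual Events"));
   }
}
```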

As of this writing, 8 JEPs are targeted for JDK 11 and the 2 additional JEPs highlighted in this post's title are now proposed to target JDK 11, bringing the total number of JEPs targeted or likely to be targeted to JDK 11 to 10.