Friday, September 17, 2021

Java's Optional Does Not Supplant Traditional if-null-else or if-not-null-else Checks

Java's addition of java.util.Optional has been welcome and has led to more fluent code for methods that cannot always return non-null values. Unfortunately, Optional has been abused, and one type of abuse has been overuse. I have occasionally run across code that makes use of Optional when there is no clear advantage over using null directly.

A red flag that Optional is being used for no advantage over checking for null directly is calling code that applies Optional.ofNullable(T) to the value returned from the method it has just invoked. As with all "red flags," this doesn't mean it's necessarily a bad thing to pass the returned value from a method to Optional.ofNullable(T) (in fact, it is necessary when passing the value to APIs that expect Optional), but it is common for this approach to provide no real value over using the returned value directly and checking it for null.

Before Optional was available, code that checked for null returned from a method and acted one way for a null response and another way for a non-null response looked like the next listing (all code snippets in this post are available on GitHub).

/**
 * Demonstrates approach to conditional based on {@code null} or
 * not {@code null} that is traditional pre-{@link Optional} approach.
 */
public void demonstrateWithoutOptional()
{
    final Object returnObject = methodPotentiallyReturningNull();
    if (returnObject == null)
    {
        out.println("The returned Object is null.");
    }
    else
    {
        out.println("The returned object is NOT null: " + returnObject);
        // code processing non-null return object goes here ...
    }
}

For this basic conditional, it's rarely necessary to involve Optional. The next code snippet is representative of the type of code I've occasionally seen when the developer is trying to replace the explicit null detection with use of Optional:

/**
 * Demonstrates using {@link Optional} in exactly the manner {@code null}
 * is often used (conditional on whether the returned value is empty or
 * not versus on whether the returned value is {@code null} or not).
 */
public void demonstrateOptionalUsedLikeNullUsed()
{
    final Optional<Object> optionalReturn
       = Optional.ofNullable(methodPotentiallyReturningNull());
    if (optionalReturn.isEmpty())
    {
        out.println("The returned Object is empty.");
    }
    else
    {
        out.println("The returned Object is NOT empty: " + optionalReturn);
        // code processing non-null return object goes here ...
    }
}

The paradigm in this code is essentially the same as the traditional null-checking code, but uses Optional.isEmpty() to perform the same check. This approach does not add any readability or other advantage but does come at a small cost of an additional object instantiation and method call.

A variation of the above use of Optional is to use its ifPresent(Consumer) method in conjunction with its isEmpty() method to form the same basic logic of doing one thing if the returned value is present and another thing if the returned value is empty. This is demonstrated in the following code.

/**
 * Demonstrates using {@link Optional} methods {@link Optional#ifPresent(Consumer)}
 * and {@link Optional#isEmpty()} in similar manner to traditional condition based
 * on {@code null} or not {@code null}.
 */
public void demonstrateOptionalIfPresentAndIsEmpty()
{
    final Optional<Object> optionalReturn
       = Optional.ofNullable(methodPotentiallyReturningNull());
    optionalReturn.ifPresent(
       (it) -> out.println("The returned Object is NOT empty: " + it));
    if (optionalReturn.isEmpty())
    {
        out.println("The returned object is empty.");
    }
}

This code appears a bit shorter than the traditional approach of checking the returned value directly for null, but it still comes at the cost of an extra object instantiation and requires two method invocations. Further, it just feels a bit weird to check first whether the Optional is present and then immediately follow that with a check of whether it's empty. Also, if the logic that needed to be performed was more complicated than writing out a message to standard output, this approach would become unwieldy.
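For comparison, Optional has also offered ifPresentOrElse(Consumer, Runnable) since Java 9, which handles both branches in a single call and avoids the present-then-empty double check. The following is only a minimal sketch: the class name, the describe method, and the boolean parameter on methodPotentiallyReturningNull are hypothetical and are not part of this post's GitHub examples. The wrapper allocation is still there, so this arguably changes the shape of the code more than its cost.

```java
import java.util.Optional;

/**
 * Sketch only: class and method names are hypothetical illustrations,
 * not taken from the post's example code.
 */
class IfPresentOrElseSketch
{
    /** Hypothetical stand-in for a method that may return null. */
    static Object methodPotentiallyReturningNull(final boolean returnNull)
    {
        return returnNull ? null : "some value";
    }

    /**
     * Builds one message for the present case and another for the
     * empty case using a single ifPresentOrElse call (Java 9+).
     */
    static String describe(final Object possiblyNull)
    {
        final StringBuilder message = new StringBuilder();
        Optional.ofNullable(possiblyNull).ifPresentOrElse(
            it -> message.append("The returned Object is NOT empty: ").append(it),
            () -> message.append("The returned Object is empty."));
        return message.toString();
    }

    public static void main(final String[] arguments)
    {
        System.out.println(describe(methodPotentiallyReturningNull(false)));
        System.out.println(describe(methodPotentiallyReturningNull(true)));
    }
}
```

Even in this form, a plain if-null-else arguably remains easier to extend once either branch grows beyond a line or two.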

Conclusion

Code that handles a method's return value and needs to do one thing if the returned value is null and do another thing if the returned value is not null will rarely enjoy any advantage of wrapping that returned value in Optional simply to check whether it's present or empty. The wrapping of the method's returned value in an Optional is likely only worth the costs if that Optional is used within fluent chaining or APIs that work with Optional.
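To illustrate the kind of fluent chaining where the wrapping is more likely to pay for itself, here is a small sketch. The findNickname lookup, its hard-coded data, and the "ANONYMOUS" default are assumptions made up for this illustration; only the Optional methods themselves (ofNullable, map, filter, orElse) are standard API.

```java
import java.util.Optional;

/**
 * Sketch only: a hypothetical example of Optional used in a fluent chain
 * rather than as a substitute for a single null check.
 */
class OptionalChainingSketch
{
    /** Hypothetical lookup that returns null when the user is unknown. */
    static String findNickname(final String userId)
    {
        return "dustin".equals(userId) ? "  Dusty " : null;
    }

    /**
     * Trims, validates, and upper-cases the nickname if one exists;
     * otherwise falls back to a default. Each step is skipped
     * automatically when the value is absent.
     */
    static String displayName(final String userId)
    {
        return Optional.ofNullable(findNickname(userId))
            .map(String::trim)
            .filter(name -> !name.isEmpty())
            .map(String::toUpperCase)
            .orElse("ANONYMOUS");
    }

    public static void main(final String[] arguments)
    {
        System.out.println(displayName("dustin"));
        System.out.println(displayName("someone-else"));
    }
}
```

Here the Optional replaces several nested conditionals instead of one, which is where it tends to read better than explicit null checks.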

Wednesday, September 15, 2021

The Case of the Missing JEPs

The JDK Enhancement-Proposal (JEP) process is "for collecting, reviewing, sorting, and recording the results of proposals for enhancements to the JDK and for related efforts, such as process and infrastructure improvements." JEP 0 is the "JEP Index" of "all JDK Enhancement Proposals, known as JEPs." This post provides a brief overview of current JDK Enhancement Proposals and discusses the surprisingly mysterious disappearance of two JEPs (JEP 187 and JEP 145).

JDK Enhancement Proposal Overview

The JEPs in the JEP Index with single-digit numbers are "Process" type JEPs and are currently:

The JEPs in the JEP Index with two-digit numbers are "Informational" type JEPs and are currently:

The remainder of the listed JEPs (with three-digit numbers) in the JEP Index are "Feature" type JEPs and currently range in number from JEP 101 ("Generalized Target-Type Inference") through JEP 418 ("Internet-Address Resolution SPI"), a new candidate JEP as of this month (September 2021).

Finally, there are some JEPs that do not yet have JEP numbers and are shown under the heading "Draft and submitted JEPs." The JEPs in this state are instead listed with a number in the JDK Bug System (JBS).

Originally, a JEP could exist in one of several different "JEP 1 Process States":

  • Draft
  • Posted
  • Submitted
  • Candidate
  • Funded
  • Completed
  • Withdrawn
  • Rejected
  • Active

The evolved set of potential JEP states is described in "JEP draft: JEP 2.0, draft 2." This document has a "Workflow" section stating that the "revised JEP Process has the following states and transitions for Feature and Infrastructure JEPs" and showing a useful graphic of these workflows. This document also describes the states of a Feature JEP:

  • Draft
  • Submitted
  • Candidate
  • Proposed to Target
  • Targeted
  • Integrated
  • Complete
  • Closed/Delivered
  • Closed/Rejected
  • Proposed to Drop

Neither these documented states for Feature JEPs nor the additional text that describes the state transitions covers a JEP with a JEP number (rather than a JBS number) being completely removed, and this is what makes the disappearance of JEP 187 ("Serialization 2.0") and JEP 145 ("Cache Compiled Code") unexpected.

 

The Disappearance of JEP 187 ("Serialization 2.0")

JEP 187 is not listed in the JEP Index, but we have the following evidence that it did exist at one time:

It's surprisingly difficult to find any explanation for what happened to JEP 187. Unlike fellow serialization-related JEP 154 ("Remove Serialization") which has been moved to status "Closed / Withdrawn", JEP 187 appears to have been removed completely rather than being present with a "Closed / Withdrawn" status or "Closed / Rejected" status. Adding to the suspicious circumstances surrounding JEP 187, two requests on OpenJDK mailing lists regarding the state of this JEP (14 December 2014 on core-libs-dev and 6 September 2021 on jdk-dev) have so far gone unanswered.

The reasons for the complete disappearance of JEP 187 can be inferred from reading the "exploratory document" titled "Towards Better Serialization" (June 2019). I also previously touched on this in my post "JDK 11: Beginning of the End for Java Serialization?"

 

The Disappearance of JEP 145 ("Cache Compiled Code")

Like JEP 187, JEP 145 is not listed in the JEP Index, but there is evidence that it did exist at one time:

As with JEP 187, it is surprisingly difficult to find explanations for the removal of JEP 145. There is a StackOverflow question about its fate, but the responses are mostly speculative (though plausible).

The most prevalent speculation regarding the disappearance of JEP 145 is that it is not needed due to Ahead-of-Time (AOT) compilation.

 

Conclusion

It seems that JEP 187 ("Serialization 2.0") and JEP 145 ("Cache Compiled Code") have both been rendered obsolete by later developments, but it is surprising that they've vanished completely from the JEP Index rather than being left intact with a closed or withdrawn state.

Thursday, September 9, 2021

Surprisingly High Cost of Java Variables with Capitalized Names

I've read hundreds of thousands or perhaps even millions of lines of Java code during my career as I've worked with my projects' baselines; read code from open source libraries I use; and read code examples in blogs, articles, and books. I've seen numerous different conventions and styles represented in the wide variety of Java code that I've read. However, in the vast majority of cases, the Java developers have used capitalized identifiers for classes, enums, and other types and used camelcase identifiers beginning with a lowercase letter for local and other types of variables (fields used as constants and static fields have sometimes had differing naming conventions). Therefore, I was really surprised recently when I was reading some Java code (not in my current project's baseline, thankfully) in which the author had capitalized both the types and the identifiers of the local variables used in that code. What surprised me most is how difficult this small change in approach made reading and mentally parsing that otherwise simple code.

The following is a representative example of the style of Java code that I was so surprised to run across:

Code Listing for DuplicateIdentifiersDemo.java

package dustin.examples.sharednames;

import java.util.Date;
import java.util.List;
import java.util.concurrent.TimeUnit;

import static java.lang.System.out;

/**
 * Demonstrates ability to name variable exactly the same as type,
 * despite this being a really, really, really bad idea.
 */
public class DuplicateIdentifiersDemo
{
    /** "Time now" at instantiation, measured in milliseconds. */
    private final static long timeNowMs = new Date().getTime();

    /** Five consecutive daily instances of {@link Date}. */
    private final static List<Date> Dates = List.of(
            new Date(timeNowMs - TimeUnit.DAYS.toMillis(1)),
            new Date(timeNowMs),
            new Date(timeNowMs + TimeUnit.DAYS.toMillis(1)),
            new Date(timeNowMs + TimeUnit.DAYS.toMillis(2)),
            new Date(timeNowMs + TimeUnit.DAYS.toMillis(3)));

    public static void main(final String[] arguments)
    {
        String String;
        final Date DateNow = new Date(timeNowMs);
        for (final Date Date : Dates)
        {
            if (Date.before(DateNow))
            {
                String = "past";
            }
            else if (Date.after(DateNow))
            {
                String = "future";
            }
            else
            {
                String = "present";
            }
            out.println("Date " + Date + " is the " + String + ".");
        }
    }
}

The code I encountered was only slightly more complicated than that shown above, but it was more painful for me to mentally parse than it should have been because the local variables were named exactly the same as their respective types. I realized that my years of reading and mentally parsing Java code have led me to intuitively think of identifiers beginning with a lowercase letter as variable names and identifiers beginning with an uppercase letter as type identifiers. Although this type of instinctive assumption generally allows me to read code more quickly and figure out what it does, the assumption in this case was hindering me because I had to make a special effort to recognize some occurrences of "String" and "Date" as variable names and other occurrences as class names.

Although the code shown above is relatively simple code, the unusual naming convention for the variable names makes it more difficult than it should be, especially for experienced Java developers who have learned to quickly size up code by taking advantage of well-known and generally accepted coding conventions.

The Java Tutorials section on "Java Language Keywords" provides the "list of keywords in the Java programming language" and points out that "you cannot use any of [the listed keywords] as identifiers in your programs." It also mentions that literals (but not keywords) true, false, and null also cannot be used as identifiers. Note that this list of keywords includes the primitive types such as boolean and int, but does not include identifiers of reference types such as String, Boolean, and Integer.

Because very close to all Java code that I had read previously used lowercase first letters for non-constant, non-static variable names, I wondered if that convention is mentioned in the Java Tutorial section on naming variables. It is. That "Variables" section states: "Every programming language has its own set of rules and conventions for the kinds of names that you're allowed to use, and the Java programming language is no different. ... If the name you choose consists of only one word, spell that word in all lowercase letters. If it consists of more than one word, capitalize the first letter of each subsequent word. The names gearRatio and currentGear are prime examples of this convention."
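For contrast, the following is a sketch of how the earlier listing's logic might look with conventionally named variables. The class name ConventionalIdentifiersSketch and the extracted classify method are my own hypothetical restructuring for illustration and are not part of the code I originally encountered; the behavior is intended to match the earlier listing.

```java
import java.util.Date;
import java.util.List;
import java.util.concurrent.TimeUnit;

/**
 * Sketch only: the same past/present/future logic as the earlier listing,
 * but with lowercase-first variable names (hypothetical class and method names).
 */
class ConventionalIdentifiersSketch
{
    /** Classifies a date relative to "now" as past, future, or present. */
    static String classify(final Date date, final Date dateNow)
    {
        if (date.before(dateNow))
        {
            return "past";
        }
        else if (date.after(dateNow))
        {
            return "future";
        }
        else
        {
            return "present";
        }
    }

    public static void main(final String[] arguments)
    {
        final long timeNowMs = System.currentTimeMillis();
        final Date dateNow = new Date(timeNowMs);
        // Five consecutive daily instances of Date, as in the earlier listing.
        final List<Date> dates = List.of(
            new Date(timeNowMs - TimeUnit.DAYS.toMillis(1)),
            new Date(timeNowMs),
            new Date(timeNowMs + TimeUnit.DAYS.toMillis(1)),
            new Date(timeNowMs + TimeUnit.DAYS.toMillis(2)),
            new Date(timeNowMs + TimeUnit.DAYS.toMillis(3)));
        for (final Date date : dates)
        {
            System.out.println("Date " + date + " is the " + classify(date, dateNow) + ".");
        }
    }
}
```

With these names, every capitalized identifier in the loop is a type and every lowercase-first identifier is a variable, so the reader's usual assumptions hold.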

Conclusion

I've long been a believer in conventions that allow for more efficient reading and mental parsing of code. Running into this code with capitalized first letters for its camelcase variable name identifiers reminded me of this and has led me to believe that the greater the general acceptance of a convention for a particular language, the more damaging it is to readability to veer from that convention.