Tuesday, May 14, 2019

Java Text Blocks

In the 13 May 2019 post "RFR: Multi-line String Literal (Preview) JEP [EG Draft]" on the OpenJDK amber-spec-experts mailing list, Jim Laskey announced a draft feature JEP named "Text Blocks (Preview)" (JDK-8222530).

Laskey's post opens with (I've added the links), "After some significant tweaks, reopening the JEP for review." He is referring to the draft JEP that was started after JEP 326 ["Raw String Literals (Preview)"] (JDK-8196004) was closed and withdrawn. Laskey explains the most recent change to the draft JEP: "The most significant change is the renaming to Text Blocks (I'm sure it will devolve over time Text Literals or just Texts.) This is primarily to reflect the two-dimensionality of the new literal, whereas String literals are one-dimensional." This post-"raw string literals" draft JEP previously referred to "multi-line string literals" and now refers to "text blocks."

The draft JEP "Text Blocks (Preview)" provides a detailed overview of the proposed preview feature. Its "Summary" section states:

Add text blocks to the Java language. A text block is a multi-line string literal that avoids the need for most escape sequences, automatically formats the string in predictable ways, and gives the developer control over format when desired. This will be a preview language feature.

This is a follow-on effort to explorations begun in JEP 326, Raw String Literals (Preview).

The draft JEP currently lists three "Goals" of the JEP and I've reproduced the first two here:

  1. "Simplify the task of writing Java programs by making it easy to express strings that span several lines of source code, while avoiding escape sequences in common cases."
  2. "Enhance the readability of strings in Java programs that denote code written in non-Java languages."

The "Non-Goals" of this draft JEP are also interesting and the two current non-goals are reproduced here:

  1. "It is not a goal to define a new reference type (distinct from java.lang.String) for the strings expressed by any new construct."
  2. "It is not a goal to define new operators (distinct from +) that take String operands."

The current "Description" of the draft JEP states:

A text block is a new kind of literal in the Java language. It may be used to denote a string anywhere that a string literal may be used, but offers greater expressiveness and less accidental complexity.

A text block consists of zero or more content characters, enclosed by opening and closing delimiters.

The draft JEP describes use of "fat delimiters" ("three double quote characters": """) as the opening delimiter and closing delimiter that mark the beginning and ending of a "text block." As currently proposed, the content of the text block actually begins on the line following the opening delimiter, after that line's terminator (the opening delimiter may be followed by spaces before the line terminator). The content of the text block ends with the final character before the closing delimiter.
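The delimiter rules can be sketched in Java as follows. This is only an illustration under an assumption: it uses the syntax in which text blocks were later finalized (JDK 15 and newer), which may differ in details from the draft discussed here. The class and method names are hypothetical.

```java
// Hypothetical demo class; assumes a JDK where text blocks are a final feature (15+).
class TextBlockDelimiters {

    // Traditional string literal: explicit \n escapes and concatenation.
    static String literal() {
        return "<html>\n" +
               "    <body>Hello</body>\n" +
               "</html>\n";
    }

    // Text block: the content starts on the line after the opening delimiter.
    // The closing delimiter is on its own line, so the content ends with a newline.
    static String block() {
        return """
               <html>
                   <body>Hello</body>
               </html>
               """;
    }

    // Closing delimiter placed directly after the last character:
    // the content ends at that character, with no trailing newline.
    static String noTrailingNewline() {
        return """
               one
               two""";
    }

    public static void main(String[] args) {
        System.out.println(literal().equals(block()));
        System.out.println(noTrailingNewline().equals("one\ntwo"));
    }
}
```

Where the closing delimiter is placed decides whether the string ends with a line terminator, which is the practical consequence of the content ending "with the final character before the closing delimiter."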

The draft JEP describes "text block" treatment of some special characters. It states, 'The content may include " characters directly, unlike the characters in a string literal.' It also states that \" and \n are "permitted, but not necessary or recommended" in a text block. There is a section of this draft JEP that shows examples of "ill-formed text blocks."
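A small sketch of the quoting behavior, again assuming the finalized text block syntax of JDK 15 and newer rather than the exact draft wording; the class and method names are made up for illustration.

```java
// Hypothetical demo class; assumes a JDK where text blocks are a final feature (15+).
class TextBlockQuotes {

    // String literal: every embedded double quote must be escaped.
    static String literal() {
        return "{\n  \"name\": \"Duke\"\n}";
    }

    // Text block: double quotes appear directly, with no escaping.
    static String block() {
        return """
            {
              "name": "Duke"
            }""";
    }

    // \" and \n are still accepted inside a text block,
    // although they are not needed here.
    static String escaped() {
        return """
            say \"hi\"\nbye""";
    }

    public static void main(String[] args) {
        System.out.println(literal().equals(block()));
        System.out.println(escaped().equals("say \"hi\"\nbye"));
    }
}
```

In the finalized syntax, putting content on the same line as the opening delimiter is rejected by the compiler, which is one kind of ill-formed text block.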

There are numerous implementation details covered in the draft JEP. These include "compile-time processing" of line terminators (which are "normalized" to "LF (\u000A)"), incidental white space (differentiation of "incidental white space from essential white space" and use of String::indent for custom indentation management), and escape sequences ("any escape sequences in the content are interpreted" per the Java Language Specification, with String::translateEscapes available for custom escape processing).
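The three processing steps can be approximated at run time with String methods. This sketch assumes the methods as they exist in JDK 15 and newer (String::stripIndent and String::translateEscapes) plus String::indent from JDK 12; their behavior in the draft may have differed. The class and method names are hypothetical.

```java
// Hypothetical demo class; assumes JDK 15+ for text blocks,
// String::stripIndent, and String::translateEscapes.
class TextBlockProcessing {

    // The compiler strips the incidental indentation shared by all lines;
    // the two extra spaces before FROM are essential and are kept.
    static String block() {
        return """
            SELECT id
              FROM users
            """;
    }

    // Run-time counterpart of the indentation stripping for an ordinary string.
    static String stripped() {
        return "    SELECT id\n      FROM users".stripIndent();
    }

    // stripIndent also normalizes line terminators to LF.
    static String normalized() {
        return "a\r\nb".stripIndent();
    }

    // Interprets escape sequences that are present as plain characters.
    static String translated() {
        return "tab\\there".translateEscapes();
    }

    // Adds indentation back; indent terminates every line with LF.
    static String indented() {
        return "a\nb".indent(2);
    }

    public static void main(String[] args) {
        System.out.println(block().equals("SELECT id\n  FROM users\n"));
        System.out.println(stripped().equals("SELECT id\n  FROM users"));
        System.out.println(normalized().equals("a\nb"));
        System.out.println(translated().equals("tab\there"));
        System.out.println(indented().equals("  a\n  b\n"));
    }
}
```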

Newly named "Java Text Blocks" look well-suited for the stated goals and the current proposal is the result of significant engineering effort. The draft JEP is approachable and worth reading for many details I did not cover here. Because this is still a draft JEP, it has not been proposed as a candidate JEP yet and has not been targeted to any specific Java release.

6 comments:

micha berger said...

What about indentation? Will a text block be

1- literal, and thus mess up the clean code layout of the actual java, or

2- will there be some rule for ignoring whitespace that lines up with that of the line the fat-quote is on, thereby introducing indentation sensitivity to the language?
Both sound bad.

@DustinMarx said...

The Mark Reinhold post "New candidate JEP: 355: Text Blocks (Preview)" announces that the Text Blocks (Preview) JEP is now Candidate JEP 355 (https://openjdk.java.net/jeps/355).

@DustinMarx said...

Jim Laskey has posted core review requests for three new instance methods being proposed for addition to the java.lang.String class in support of Text Blocks (Preview): String::formatted (JDK-8203444), String::stripIndent (JDK-8223775), and String::translateEscapes (JDK-8223780).

@DustinMarx said...

JEP 355 ["Text Blocks (Preview)"] is now proposed to target JDK 13.

Unknown said...

Looks neat. It is a bit unclear to me if this is just code syntax and it compiles to a form that older JVMs can process or will only be forward JVM compatible. I assume the former and I missed it. Hope it can be confirmed.

@DustinMarx said...

JDK 13 Early Access Build #25 includes the addition of at least three instance methods related to "Java Text Blocks": JDK-8223775 [String::stripIndent (Preview)], JDK-8203444 [String::formatted (Preview)], and JDK-8223780 [String::translateEscapes (Preview)]. It also includes JDK-8223967 [Compiler Support for JEP 355: Text Blocks (Preview)].