Monday, November 20, 2017

Simple String Representation of Java Decimal Numbers without Scientific Notation

The primary types/objects used for decimal numbers in Java are float/Float, double/Double, and BigDecimal. Each of these has cases in which its "default" string representation is "computerized scientific notation." This post demonstrates some simple approaches to provide a string representation of the decimal number in these cases without scientific notation.

The examples in this post demonstrate the "default" scientific notation String representations of these Java numeric types using, for each type, a range of numbers that shows approximately where the "default" representation becomes scientific notation. The next three code listings show the code for constructing general ranges for floats, doubles, and BigDecimals. The full source code listing for these examples is available on GitHub.

Constructing the Example Range of Floats

/**
 * Writes floats in the provided format and in the
 * provided range to standard output.
 *
 * @param start Float to start writing.
 * @param threshold Float past which to not write anymore.
 * @param delta Delta for each increment of floats to be written.
 * @param label Label for header.
 * @param format Format for print out.
 */
private static void writeFloatsToOutput(
   final float start,
   final float threshold,
   final float delta,
   final String label,
   final Format format)
{
   out.println(generateHeader(label));
   float floatValue = start;
   do
   {
      out.println("= " + format.fromFloat(floatValue));
      floatValue += delta;
   }
   while (floatValue < threshold);
}

Constructing the Example Range of Doubles

/**
 * Writes doubles in the provided format and in the
 * provided range to standard output.
 *
 * @param start Double to start writing.
 * @param threshold Double past which to not write anymore.
 * @param delta Delta for each increment of doubles to be written.
 * @param label Label for header.
 * @param format Format for print out.
 */
private static void writeDoublesToOutput(
   final double start,
   final double threshold,
   final double delta,
   final String label,
   final Format format)
{
   out.println(generateHeader(label));
   double doubleValue = start;
   do
   {
      out.println("= " + format.fromDouble(doubleValue));
      doubleValue += delta;
   }
   while (doubleValue < threshold);
}

Constructing the Example Range of BigDecimals

/**
 * Writes BigDecimals in the provided format and in the
 * provided range to standard output.
 *
 * @param start BigDecimal to start writing.
 * @param threshold BigDecimal past which to not write anymore.
 * @param delta Delta for each increment of BigDecimals to be written.
 * @param label Label for header.
 * @param format Format for print out.
 */
private static void writeBigDecimalsToOutput(
   final BigDecimal start,
   final BigDecimal threshold,
   final BigDecimal delta,
   final String label,
   final Format format)
{
   out.println(generateHeader(label));
   BigDecimal decimal = start;
   do
   {
      out.println("= " + format.fromBigDecimal(decimal));
      decimal = decimal.add(delta);
   }
   while (decimal.compareTo(threshold) < 0);
}

The three methods shown above can be called with ranges specified to demonstrate when scientific notation is automatically employed for String representations of the Java decimal types. The output from running the above with "default" format for each numeric type is shown in the next three output listings.

The default representation of floats uses scientific notation for the smallest and for the largest numbers shown. These numbers demonstrate what is discussed in the Float.toString(float) documentation: numbers "less than 10^-3 or greater than or equal to 10^7" are "represented in so-called 'computerized scientific notation.'"

==========================
= Small Floats (DEFAULT) =
==========================
= 8.5E-4
= 9.5E-4
= 0.00105
= 0.0011499999
= 0.0012499999
= 0.0013499998
= 0.0014499997
= 0.0015499997
= 0.0016499996
= 0.0017499996
= 0.0018499995
= 0.0019499995
==========================
= Large Floats (DEFAULT) =
==========================
= 9999995.0
= 9999996.0
= 9999997.0
= 9999998.0
= 9999999.0
= 1.0E7
= 1.0000001E7
= 1.0000002E7
= 1.0000003E7
= 1.0000004E7

The default representation of doubles likewise uses scientific notation for the smallest and for the largest numbers shown. These numbers demonstrate what is discussed in the Javadoc documentation for Double.toString(double): numbers "less than 10^-3 or greater than or equal to 10^7" are "represented in so-called 'computerized scientific notation.'"

===========================
= Small Doubles (DEFAULT) =
===========================
= 8.5E-4
= 9.5E-4
= 0.00105
= 0.00115
= 0.00125
= 0.00135
= 0.0014500000000000001
= 0.0015500000000000002
= 0.0016500000000000002
= 0.0017500000000000003
= 0.0018500000000000003
= 0.0019500000000000003
===========================
= Large Doubles (DEFAULT) =
===========================
= 9999995.0
= 9999996.0
= 9999997.0
= 9999998.0
= 9999999.0
= 1.0E7
= 1.0000001E7
= 1.0000002E7
= 1.0000003E7
= 1.0000004E7
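The thresholds quoted from the Javadoc can also be checked directly, without the range-writing methods. The following is a minimal sketch; the class name DefaultNotationThresholds and its helper methods are hypothetical and are not part of the GitHub source for this post.

```java
public class DefaultNotationThresholds
{
   /** Returns true if Double.toString(double) uses an exponent for the value. */
   public static boolean doubleUsesExponent(final double value)
   {
      return Double.toString(value).contains("E");
   }

   /** Returns true if Float.toString(float) uses an exponent for the value. */
   public static boolean floatUsesExponent(final float value)
   {
      return Float.toString(value).contains("E");
   }

   public static void main(final String[] arguments)
   {
      // Just below 10^-3: rendered with an exponent (9.9E-4).
      System.out.println(Double.toString(0.00099));
      // Exactly 10^-3: rendered without an exponent (0.001).
      System.out.println(Double.toString(0.001));
      // Just below 10^7: rendered without an exponent (9999999.0).
      System.out.println(Double.toString(9_999_999.0));
      // Exactly 10^7: rendered with an exponent (1.0E7).
      System.out.println(Double.toString(10_000_000.0));
      // The same upper threshold applies to float.
      System.out.println(floatUsesExponent(9_999_999f) + " / " + floatUsesExponent(10_000_000f));
   }
}
```

The simple check for the letter "E" is enough for this illustration because the non-finite values render as "NaN", "Infinity", and "-Infinity", none of which contains an uppercase "E".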

While float and double had both their smallest and largest numbers expressed in scientific notation, the BigDecimal examples shown here use it by default only for the smaller numbers. This is described in the BigDecimal.toString() Javadoc documentation: "If the scale is greater than or equal to zero and the adjusted exponent is greater than or equal to -6, the number will be converted to a character form without using exponential notation. ... if ... the adjusted exponent is less than -6, the number will be converted to a character form using exponential notation."
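The quoted rule depends on the scale as well as on the adjusted exponent, so a large BigDecimal with a negative scale is also rendered with an exponent by toString(). The following is a minimal sketch of that rule; the class name BigDecimalNotationDemo and its describe method are hypothetical and not part of the post's source code.

```java
import java.math.BigDecimal;

public class BigDecimalNotationDemo
{
   /** Shows the default and plain renderings of a value along with its scale. */
   public static String describe(final BigDecimal value)
   {
      return "toString=" + value.toString()
         + ", toPlainString=" + value.toPlainString()
         + ", scale=" + value.scale();
   }

   public static void main(final String[] arguments)
   {
      // Scale of zero: toString() renders every digit.
      System.out.println(describe(new BigDecimal("100000000")));
      // Same numeric value with a scale of -8: toString() uses an exponent.
      System.out.println(describe(new BigDecimal("1E+8")));
      // Adjusted exponent of -7 (less than -6): toString() uses an exponent.
      System.out.println(describe(new BigDecimal("0.00000085")));
      // Adjusted exponent of -6: toString() renders plain digits.
      System.out.println(describe(new BigDecimal("0.00000105")));
   }
}
```

The large BigDecimals used in this post's examples are constructed from plain digit strings and so have a scale of zero, which is why they are rendered in full by default.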

===============================
= Small BigDecimals (DEFAULT) =
===============================
= 8.5E-7
= 9.5E-7
= 0.00000105
= 0.00000115
= 0.00000125
= 0.00000135
= 0.00000145
= 0.00000155
= 0.00000165
= 0.00000175
= 0.00000185
= 0.00000195
===============================
= Large BigDecimals (DEFAULT) =
===============================
= 99999950000000000000000000000000000000000000000000
= 99999960000000000000000000000000000000000000000000
= 99999970000000000000000000000000000000000000000000
= 99999980000000000000000000000000000000000000000000
= 99999990000000000000000000000000000000000000000000
= 100000000000000000000000000000000000000000000000000
= 100000010000000000000000000000000000000000000000000
= 100000020000000000000000000000000000000000000000000
= 100000030000000000000000000000000000000000000000000
= 100000040000000000000000000000000000000000000000000

The method that calls the three methods above with the example ranges is shown next.

private static void writeFormattedValues(final Format format)
{
   writeFloatsToOutput(
      0.00085f, 0.002f, 0.0001f, "Small Floats (" + format + ")", format);
   writeFloatsToOutput(
      9_999_995f, 10_000_005f, 1f, "Large Floats (" + format + ")", format);

   writeDoublesToOutput(
      0.00085d, 0.002d, 0.0001d, "Small Doubles (" + format + ")", format);
   writeDoublesToOutput(
      9_999_995d, 10_000_005d, 1d, "Large Doubles (" + format + ")", format);

   writeBigDecimalsToOutput(
      new BigDecimal("0.00000085"),
      new BigDecimal("0.000002"),
      new BigDecimal("0.0000001"),
      "Small BigDecimals (" + format + ")",
      format);
   writeBigDecimalsToOutput(
      new BigDecimal("99999950000000000000000000000000000000000000000000"),
      new BigDecimal("100000050000000000000000000000000000000000000000000"),
      new BigDecimal("10000000000000000000000000000000000000000000"),
      "Large BigDecimals (" + format + ")",
      format);
}

The very small and very large numbers in the code above can be presented in default format or in a format that precludes use of scientific notation. The code listing for the Format enum is shown next, and this enum demonstrates approaches that can be used with float, double, and BigDecimal to render them without scientific notation.

Format.java

import java.math.BigDecimal;
import java.text.NumberFormat;

/**
 * Supports rendering of Java numeric types float, double,
 * and BigDecimal in "default" format and in format that
 * avoids use of scientific notation.
 */
public enum Format
{
   DEFAULT
   {
      @Override
      public String fromFloat(final float floatValue)
      {
         return String.valueOf(floatValue);
      }
      @Override
      public String fromDouble(final double doubleValue)
      {
         return String.valueOf(doubleValue);
      }
      @Override
      public String fromBigDecimal(final BigDecimal bigDecimalValue)
      {
         return bigDecimalValue.toString();
      }
   },
   NO_EXPONENT
   {
      @Override
      public String fromFloat(final float floatValue)
      {
         return numberFormat.format(floatValue);
      }
      @Override
      public String fromDouble(final double doubleValue)
      {
         return numberFormat.format(doubleValue);
      }
      @Override
      public String fromBigDecimal(final BigDecimal bigDecimalValue)
      {
         return bigDecimalValue.toPlainString();
      }
   };

   private static final NumberFormat numberFormat = NumberFormat.getInstance();

   static
   {
      numberFormat.setMaximumFractionDigits(Integer.MAX_VALUE);
      numberFormat.setGroupingUsed(false);
   }

   public abstract String fromFloat(final float floatValue);
   public abstract String fromDouble(final double doubleValue);
   public abstract String fromBigDecimal(final BigDecimal bigDecimalValue);
}

The Format enum uses an instance of NumberFormat with grouping disabled and with the maximum fraction digits set to Integer.MAX_VALUE to ensure that floats and doubles are rendered without scientific notation. It's even easier to accomplish this with BigDecimal using its toPlainString() method.

The output from running the code with Format.NO_EXPONENT is shown next (and there are no exponents or scientific notation in sight).

==============================
= Small Floats (NO_EXPONENT) =
==============================
= 0.0008500000112690032
= 0.0009500000160187483
= 0.0010499999625608325
= 0.0011499999091029167
= 0.001249999855645001
= 0.0013499998021870852
= 0.0014499997487291694
= 0.0015499996952712536
= 0.0016499996418133378
= 0.001749999588355422
= 0.0018499995348975062
= 0.0019499994814395905
==============================
= Large Floats (NO_EXPONENT) =
==============================
= 9999995
= 9999996
= 9999997
= 9999998
= 9999999
= 10000000
= 10000001
= 10000002
= 10000003
= 10000004
===============================
= Small Doubles (NO_EXPONENT) =
===============================
= 0.00085
= 0.00095
= 0.00105
= 0.00115
= 0.00125
= 0.00135
= 0.0014500000000000001
= 0.0015500000000000002
= 0.0016500000000000002
= 0.0017500000000000003
= 0.0018500000000000003
= 0.0019500000000000003
===============================
= Large Doubles (NO_EXPONENT) =
===============================
= 9999995
= 9999996
= 9999997
= 9999998
= 9999999
= 10000000
= 10000001
= 10000002
= 10000003
= 10000004
===================================
= Small BigDecimals (NO_EXPONENT) =
===================================
= 0.00000085
= 0.00000095
= 0.00000105
= 0.00000115
= 0.00000125
= 0.00000135
= 0.00000145
= 0.00000155
= 0.00000165
= 0.00000175
= 0.00000185
= 0.00000195
===================================
= Large BigDecimals (NO_EXPONENT) =
===================================
= 99999950000000000000000000000000000000000000000000
= 99999960000000000000000000000000000000000000000000
= 99999970000000000000000000000000000000000000000000
= 99999980000000000000000000000000000000000000000000
= 99999990000000000000000000000000000000000000000000
= 100000000000000000000000000000000000000000000000000
= 100000010000000000000000000000000000000000000000000
= 100000020000000000000000000000000000000000000000000
= 100000030000000000000000000000000000000000000000000
= 100000040000000000000000000000000000000000000000000
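The two approaches used by the Format enum can also be tried in isolation. The sketch below is an illustration under stated assumptions rather than the post's actual code: the class name PlainNotationSketch and its method names are hypothetical, the NumberFormat is requested for Locale.ROOT so that the decimal separator does not vary with the default locale, and the BigDecimal-based float conversion is an alternative that is not used in the Format enum.

```java
import java.math.BigDecimal;
import java.text.NumberFormat;
import java.util.Locale;

public class PlainNotationSketch
{
   /** Formats a double without an exponent, in the manner of Format.NO_EXPONENT. */
   public static String viaNumberFormat(final double value)
   {
      // NumberFormat instances are not thread-safe, so this sketch creates one per call.
      final NumberFormat numberFormat = NumberFormat.getInstance(Locale.ROOT);
      numberFormat.setMaximumFractionDigits(Integer.MAX_VALUE);
      numberFormat.setGroupingUsed(false);
      return numberFormat.format(value);
   }

   /**
    * Alternative for floats (an assumption of this sketch): start from the
    * float's own decimal String so that no extra digits appear from widening
    * the float to a double. Not suitable for NaN or infinite values, which
    * the BigDecimal constructor rejects.
    */
   public static String viaBigDecimal(final float value)
   {
      return new BigDecimal(Float.toString(value)).toPlainString();
   }

   public static void main(final String[] arguments)
   {
      System.out.println(viaNumberFormat(0.00085d));    // 0.00085
      System.out.println(viaNumberFormat(10_000_000d)); // 10000000
      // The float is widened to a double first, exposing its binary approximation.
      System.out.println(viaNumberFormat(0.00085f));
      System.out.println(viaBigDecimal(0.00085f));      // 0.00085
   }
}
```

The long digit strings in the "Small Floats (NO_EXPONENT)" output above come from that widening of each float to a double before formatting; whether those digits are desirable depends on what the rendered string is used for.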

The standard Java floating types and BigDecimal class render some numbers in scientific notation, but it's easy to ensure that this default presentation of scientific notation is not used when it is not desired.

Saturday, November 4, 2017

Java Command-Line Interfaces (Part 30): Observations

This series on parsing command line arguments in Java has consisted of 29 posts published over four months and covering 28 distinct open source libraries available for parsing command line arguments in Java. This post collects some observations that can be made from the first 29 posts in this series and provides some general considerations to make when selecting one of the 28 libraries or deciding to roll one's own command-line argument parsing code. Although no one library will be the best fit for every situation, this post will also look at how some libraries may be a better fit than others for specific situations. The post will end with a subset of the original 28 libraries that may be the most generally appealing of the covered libraries based on some criteria covered in the post.

General Observations

There are several observations that can be made after looking at the 28 libraries covered in this series on parsing command line arguments in Java.

  • For most Java developers in most situations, there appears to be very little reason to write custom command line parsing code.
  • The plethora of Java-based libraries for parsing command line arguments is indicative of the vastness of the Java ecosystem.
  • The fact that all 28 covered libraries are open source is a reminder of how fundamental open source is in the Java culture.
  • There are some interesting differences between the libraries covered in this series and the various different approaches are a reminder that there's often more than one way to implement even relatively minor functionality in Java.
  • The large number of libraries for parsing command line arguments in Java, many of which are associated with author statements saying something about the existing libraries not satisfying their needs, is evidence that it's unlikely there will ever be a single language, framework, or library that will be "best" to everyone. If something as simple as a command line parsing library cannot be written to be everyone's favorite, it seems impossible to ever have a larger library, a framework, or a programming language be everyone's favorite. "One size doesn't fit all" when it comes to libraries, frameworks, and programming languages.
  • It's not just technical strength that must be considered when evaluating and selecting a library; its license, distribution mechanism, currency, provider support, and community support also all weigh in on the decision. Even the version of Java it will run on plays a role in the decision.

Evaluation Criteria

There are several criteria that may be important to a Java developer when selecting between so many libraries and when weighing whether to use a library or implement one's own command line argument functionality.

  • Is it open source?
    • My simple definition of open source in this context is "source code can be legally viewed by developers using the library." Wikipedia articulates a similar but slightly stricter definition, "[open source code] is source code [that is] made available with a license in which the copyright holder provides the rights to study, change, and distribute the software to anyone and for any purpose."
    • All 28 libraries covered in this series make source code available to developers using the library and so are "open source" by my simple definition and also generally meet the slightly stricter definition on Wikipedia.
  • What is its license?
    • The license under which each library is issued can be significant in determining whether to choose that library. Most users will be most comfortable with open source licenses that are clearly defined and that are most liberal in what they allow.
    • Many of the libraries covered in the series are released under liberal open source licenses, but some are released under less liberal licenses or do not have an explicitly specified license at all.
  • What is its size?
    • Use of a library typically means an additional JAR on the classpath and it may be important in some situations to keep the size of these additional libraries as small as possible for a particular deployment environment.
    • None of these command line parsing libraries are large when compared to libraries such as Spring and Hibernate, but the relative differences in size among these libraries can be large.
  • Are there third-party dependencies?
    • Third-party libraries add to the overall increase in library size and mean more dependencies to manage.
    • Most of the libraries covered in this series do not have additional dependencies, but some of them do.
  • What is the distribution mechanism?
    • Availability as a single JAR via Maven repository is probably the easiest mechanism for most Java developers to acquire a library.
    • There are JARs available in the Maven repository for many of the covered libraries, but some of the libraries require downloading the JAR from a project site or associated article site.
    • The 28 libraries covered in this series tend to be distributed via Maven repository, via project page download (GitHub, SourceForge, library author's site, etc.), and even via copy-and-paste in a couple of cases where the "library" is a single Java source code file.
  • Documentation
    • The libraries covered in this series are documented in a variety of ways including project documentation, Javadoc documentation, unit tests, and in-code comments.
    • Many of the libraries have the equivalent of a "Quick Start" tutorial, but some have relatively little documentation other than that. Some have no or very few Javadoc comments and others have significant Javadoc-based API documentation. Many of the libraries make their Javadoc-generated documentation available online, but some require downloading the library to see its Javadoc-based documentation.
  • Community
    • With open source projects, it's often advantageous to have a large community that uses the product because a large community means more implicit testing and potentially more blog posts, articles, and forum messages on how to use that project.
    • The sizes of the communities of the libraries covered in this series vary dramatically and it can be difficult to ascertain the size of any given community. However, the number of libraries dependent on a given library and the number of online resources talking about a given library give us an idea of community involvement.
  • Age of Library / Most Recent Update
    • Newer is not always better, but it generally is more compelling to use an open source product that receives current and recent updates than to use a product that has not been updated or changed in many years. It's a bit less of a concern with a small and simple library such as a command line parsing library, but currently supported libraries are still advantageous over potentially abandoned projects.
  • What features does it offer?
    • This is where the libraries covered in the series really differentiate themselves, but it's the criterion that is most difficult to compare between libraries as it really depends on which particular feature is desired.
    • Most of the covered libraries provided most of the features covered in the simple examples in this series. However, some of the libraries provided significant features that were beyond those used in each library's example.
    • For the simple examples used throughout this series, the ease of use of the API provided by the parsing library was probably as important a feature as any.

The CLI Comparison page on the picocli GitHub page compares and contrasts many of the libraries covered in this series and some libraries not covered in this series. The page compares the libraries in table format by listing each library's respective attributes such as license, minimum Java version supported, style of API, and parsing options supported.

This series has covered 28 different libraries for parsing command line arguments from Java. It's impossible to designate any one of these as the "best" library for this purpose for all people in all situations. Each library is an investment of time and effort by its developer (or developers), but I attempt here to narrow down the list of libraries to the subset that I believe is most likely to appeal to general situations and developers.

Voted Most Likely to Succeed

The following libraries are listed in alphabetical order rather than in my order of preference.

  • Apache Commons CLI
    • In my opinion, Apache Commons CLI offers the least aesthetically appealing API of this narrowed down subset of recommended libraries.
    • Apache Commons CLI benefits from name recognition, from being frequently used by other libraries and products, and from being around for a long time.
      • In environments where it is difficult to justify installation of new libraries, there is a better chance of having Apache Commons CLI already available than most of the other libraries.
    • Apache Commons CLI is built into Groovy and so is especially easy for someone to use moving between Groovy and Java.
    • Quality documentation.
    • The Apache License, Version 2, is a well-known, liberal, and corporation-friendly license.
  • args4j
    • args4j offers numerous features and is highly extensible.
    • Command-line arguments are typed.
    • Quality documentation.
    • args4j is currently supported by a familiar name in the open source Java community.
    • The MIT license is a well-known, liberal, and corporation-friendly license.
  • JCommander
    • API consists of easy-to-use combination of annotations and builders.
    • Command-line arguments are typed.
    • Quality documentation.
    • JCommander is currently supported by a familiar name in the open source Java community.
    • The Apache License, Version 2, is a well-known, liberal, and corporation-friendly license.
  • JewelCli
    • The annotated interface approach of JewelCli appeals to me.
    • Command-line arguments are typed.
    • Quality documentation.
    • The Apache License, Version 2, is a well-known, liberal, and corporation-friendly license.
  • picocli
    • Highly readable annotation-based API.
    • Quality documentation.
    • Command-line arguments are typed.
    • One of the more feature-rich libraries covered in this series.
    • Currently supported (has been enhanced with several new features since I started this series of posts).
    • The Apache License, Version 2, is a well-known, liberal, and corporation-friendly license.

Although I listed a subset of five libraries out of the 28 covered libraries, there are reasons that a developer might choose to use one of the 23 libraries not on this narrowed-down list. Several of the libraries not on this list offer unique features that, if important enough to the Java developer, would make those libraries preferable to the 5 listed above.

The next listing associates some of the covered libraries with some of their relatively unique strengths. A library might be selected, even if it's not in the list of five I just highlighted, if one of its unique strengths is among the most important considerations for the relevant application. Many of the listed "traits" are a matter of preference or taste, meaning a library having the listed trait may be seen as a positive by one developer and as a negative by another developer.

Trait | Description / Benefit | Libraries with Desired Trait
Color Syntax | Color syntax (select environments) | picocli
Command Completion | Autocompletion of commands (select environments) | picocli
Configuration (Annotations) | Uses annotations primarily to define command-line options. | Airline 2, args4j, cli-parser, CmdOption, Commandline, google-options, jbock, JCommander, JewelCli, MarkUtils-CLI, picocli, Rop
Configuration (API) | Uses programmatic APIs (traditional and/or builder) to define command-line options. | Apache Commons CLI, Argparse4j, argparser, CmdLn, getopt4j, Jargo, JArgp, JArgs, JCLAP, jClap, JOpt Simple, JSAP, jw-options, parse-cmd
Configuration (Reflection) | Uses reflection (but not annotations) to define command-line options. | CLAJR
Configuration (XML) | Uses or supports use of XML to define command-line options. | JCommando, JSAP
Single File Source | Enables easy inclusion of "library" in one's project as a source code file that is compiled rather than as a JAR that source is compiled against. | CLAJR, picocli
Small JAR | Libraries providing minimally required JAR of less than 25 KB in size (applies to version covered in this series). | CLAJR, cli-parser, getopt4j, JArgp, JArgs, jClap, jw-options, Rop

There are numerous other characteristics that one might desire in a Java-based command-line parsing library that might narrow down the number of appropriate candidates. These include flexibility of command styles (long and/or short names, styles [GNU, POSIX, Java, etc.]), applicable license, availability of current support, new releases and updates, size of user community, and minimum version of Java that is supported. The tables provided in the previously referenced Java Command Line Parsers Comparison make it easy to compare some of these characteristics for most of the libraries covered in this series.

This series on parsing command line arguments with Java has demonstrated 28 libraries, and there are several more publicly available libraries not yet covered in this series. With over 30 libraries available, most developers should be able to find an external library to meet their needs.

Thursday, October 26, 2017

Java Command-Line Interfaces (Part 29): Do-It-Yourself

This series on parsing command line arguments from Java has briefly introduced 28 open source libraries that can be used to process command-line arguments from Java code. Even with these 28 libraries covered, the series has not covered all available open source libraries for parsing command line options from Java. For example, this series has not covered docopt, dolphin getopt, DPML CLI, the "other" JArgP, java-getopt, ritopt, cli-args, clio, TE-CODE Command, and likely many other libraries I'm not aware of. This post looks at considerations one might make when attempting to decide whether to roll one's own command line argument parsing code in Java versus using one of the plethora of command line parsing libraries that is already available.

At first glance, it would be easy to say that someone developing their own command-line parsing code in Java might be suffering from Not Invented Here Syndrome. However, I still occasionally write my own simple command line processing code and will outline the situations in which I do this.

Many of the libraries covered in this series are small. However, for cases where the command line parsing is very simple, even these smaller libraries may be heavier than what is needed for the job at hand. The examples I show in this post are the type that might fit this category. The likelihood of a developer writing custom command line processing code increases as the complexity of the required command line parsing decreases and as the difficulty of introducing new libraries to one's deployment environment increases. Process can also influence the decision, as some developers may choose to implement their own command line processing code rather than wait for requisite approvals to use the identified library.

The easiest situations in which to choose not to use a command-line parsing library for Java are obviously those in which command line arguments are not necessary. In fact, it is likely that far more Java developers never or rarely use command-line options than use them regularly, given that so many run their code in web servers, application servers, or other containers (such as Spring) and so don't think about command-line parsing for their application. Even some simple command-line-based applications may be able to assume values or read values from an assumed location and don't need arguments passed to them.

If I only have a single argument to read from the command line, I'll write that simple code myself. The Java Tutorials feature a section on Command-Line Arguments that introduces basic handling of command line arguments in Java. The zero to many strings on the command line following the Java executable application's name are provided to the Java application via the String[] or String... arguments to the classic "public static void main" method. The simple code listing below indicates how a single expected command-line argument might be processed.

Parsing Single Required Argument

/**
 * Demonstrate processing a single provided argument.
 *
 * @param arguments Command-line arguments; expecting a
 *    String-based name.
 */
public static void main(final String[] arguments)
{
   if (arguments.length < 1)
   {
      out.println("\nNo name provided; please provide a name.\n");
      out.println("\tUSAGE: SingleArgMain <name>");
   }
   else
   {
      out.println("Hello " + arguments[0] + "!");
   }
}

The above code was easy to write because there was one command line option, it did not have an argument to go with the option, and it was required. With all of these assumptions in place, it is relatively easy to write command line parsing code.

If the application requires two arguments, it is still pretty straightforward to handle this directly in Java without a third-party library. This is demonstrated in the next code listing that simulates an application that accepts the name/path of an XML file to be validated and the name/path of the XSD against which that XML is to be validated.

Parsing Two Required Arguments

/**
 * Demonstrate processing two required provided arguments.
 *
 * @param arguments Command-line arguments; expecting a String-based
 *    path and file name of an XML file to be validated and a
 *    String-based path and file name of the XSD file against which
 *    the XML file will be validated.
 */
public static void main(final String...arguments)
{
   if (arguments.length < 2)
   {
      out.println("\nXML file path/name and XSD file path/name not provided.\n");
      out.println("\tUSAGE: TwoArgsMain <xmlFilePathAndName> <xsdFilePathAndName>");
   }
   else
   {
      out.println("The provided XML file is '" + arguments[0]
         + "' and the provided XSD file is '" + arguments[1] + "'.");
   }
}

In the posts in this series, I've used examples that expect a required option specifying file path/name and an optional option expressing enabled verbosity. In all of those examples, the file path/name option was a flag name (-f and/or --file) followed by an "argument" or "value" for that option. For those examples, the verbosity option did not have an argument or value associated with it and the existence of -v or --verbose implied enabled verbosity. This is particularly easy to accomplish directly in Java without a library if I'm willing to change the approach slightly and assume that the first command line option is the file path/name and that the verbosity flag, if provided, occurs after the file path/name. The other assumption that makes this easy is that because the file path/name is first, I don't need to actually use a flag such as --file or -f. With all of these assumptions in place, the code example is shown next.

Series Example: Parsing One Required Option and One Optional Option

/**
 * Demonstrate parsing of command-line options for required file
 * path/name and for optional verbosity.
 *
 * @param arguments Expected command-line arguments; first String
 *    should be file path/name and, if applicable, second String
 *    should be the verbosity flag (-v or --verbose).
 */
public static void main(final String[] arguments)
{
   if (arguments.length < 1)
   {
      out.println("\nNo file path/name provided; please provide a file path/name.\n");
      out.println("\tUSAGE: SeriesExample <filePathAndName> [-v|--verbose]");
   }
   else
   {
      final String file = arguments[0];
      final String verboseString = arguments.length > 1 ? arguments[1] : "";
      final boolean verbose = verboseString.equals("-v") || verboseString.equals("--verbose");
      out.println("File path/name is '" + file + "' and verbosity is " + verbose);
   }
}

I've had relatively easy command-line parsing options so far because of these characteristics of these examples:

  • Order of command line arguments was assumed and unchangeable.
  • Never had more than one optional command line argument and the optional argument was expected last.
  • Never needed a command line argument that consisted of flag and value associated with that flag.
  • No option had a dependency on any other option.

The just-mentioned characteristics made for easier parsing of command line options from Java because the number of permutations and combinations to be prepared for were significantly reduced by requiring the ordering of the options, by not allowing for flags with associated values that must be handled together (each string in the provided String[] is independent of all other strings in that array), and by only allowing one optional argument at most (and requiring it to be last).

As the command-line arguments situation gets more complicated, my desire to use a third-party library increases. If I want to have multiple optional arguments or want to have options that consist of flags with associated values, I'm more likely to make the jump to the third-party libraries for parsing command-line arguments in Java. Using most of the third-party libraries covered in this series removes the need for me to worry about option ordering and option name/flag associations.
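
To illustrate how the homemade approach grows once those assumptions are relaxed, the following is a minimal sketch (not code from the series' GitHub examples) that accepts -f/--file with an associated value and -v/--verbose in any order. The class name OrderIndependentArgs and its error messages are my own illustrative choices.

```java
import static java.lang.System.out;

/**
 * Illustrative sketch: hand-written, order-independent parsing of
 * -f/--file <value> and -v/--verbose without a third-party library.
 */
final class OrderIndependentArgs
{
   final String file;
   final boolean verbose;

   private OrderIndependentArgs(final String file, final boolean verbose)
   {
      this.file = file;
      this.verbose = verbose;
   }

   static OrderIndependentArgs parse(final String[] arguments)
   {
      String file = null;
      boolean verbose = false;
      for (int index = 0; index < arguments.length; index++)
      {
         final String argument = arguments[index];
         switch (argument)
         {
            case "-f":
            case "--file":
               // A flag with a value must consume the NEXT string too,
               // so the strings in the array are no longer independent.
               if (index + 1 >= arguments.length)
               {
                  throw new IllegalArgumentException(
                     argument + " requires a file path/name.");
               }
               file = arguments[++index];
               break;
            case "-v":
            case "--verbose":
               verbose = true;
               break;
            default:
               throw new IllegalArgumentException(
                  "Unrecognized option: " + argument);
         }
      }
      if (file == null)
      {
         throw new IllegalArgumentException(
            "The file option (-f or --file) is required.");
      }
      return new OrderIndependentArgs(file, verbose);
   }

   public static void main(final String[] arguments)
   {
      try
      {
         final OrderIndependentArgs parsed = parse(arguments);
         out.println("File path/name is '" + parsed.file
            + "' and verbosity is " + parsed.verbose);
      }
      catch (IllegalArgumentException badArguments)
      {
         out.println("ERROR: " + badArguments.getMessage());
         out.println("\tUSAGE: OrderIndependentArgs (-f|--file) <filePathAndName> [-v|--verbose]");
      }
   }
}
```

Even this small sketch still has gaps: it would treat "-v" as the file name if given "-f -v", it does not support the --file=value form, and it has no generated help text. Those are the kinds of details the libraries covered in this series handle for me.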

One situation in which it might be desirable to roll one's own command-line parsing code in Java is when the parsing needs are so specific to a particular situation that none of the existing libraries handles them adequately. However, with 30+ libraries available, I doubt this would occur very frequently for most people.

When developing one's own command-line parsing code in Java, other options besides writing it from scratch include forking and extending one of the open source libraries or building one's code on a framework such as that introduced in the article "Parsing Command Line Arguments with Java: Using an effective Java framework to write command line tools" (pages 20 and 22 of this Java Developer's Journal).

For small Java-based tools, the simple command-line parsing approaches shown in this post are often sufficient, especially if I'm the only one likely to use the tool. However, as the potential user base increases for the Java application, the requirements outlined in this post can become onerous and the use of third-party libraries covered in this series of posts can be helpful in creating a more user-friendly command-line argument experience. For the simplest of Java-based tools and applications, I may be able to get away with my own homemade command-line parsing code. However, for most Java applications of significance, a third-party library will make more sense because it offers significantly greater flexibility and ease of use for the end users.

Additional References

Wednesday, October 25, 2017

Java Command-Line Interfaces (Part 28): getopt4j

The page for getopt4j describes this as "a library to parse command line arguments according to the GNU style." The page then introduces getopt4j: "The 'getopt4j' library is designed to parse the command line options in the same manner as the C getopt() function in glibc (the GNU C runtime library). It attempts to do this in a simpler, more Java-centric manner than the original product." This post describes use of getopt4j to parse command line options in the same manner as was done for the libraries covered in the earlier 27 posts in this series.

The "definition" stage is accomplished in getopt4j via instances of CLOptionDescriptor as demonstrated in the next code listing (full source code is available on GitHub).

"Definition" Stage with getopt4j

final CLOptionDescriptor fileDescriptor
   = new CLOptionDescriptor("file",
      CLOptionDescriptor.ARGUMENT_REQUIRED,
      'f',
      "Path and name of file.");
final CLOptionDescriptor verboseDescriptor
   = new CLOptionDescriptor("verbose",
      CLOptionDescriptor.ARGUMENT_DISALLOWED,
      'v',
      "Is verbosity enabled?");
final CLOptionDescriptor[] optionsDefinitions
   = new CLOptionDescriptor[]{fileDescriptor, verboseDescriptor};

As shown in the above code, the instances of CLOptionDescriptor are placed in an array to be presented to the getopt4j parser.

The "parsing" stage is achieved in getopt4j via instantiation of the CLArgsParser class. The constructor of that class accepts the command line arguments in the String[] array and the array of CLOptionDescriptor instances representing the options' definitions. This is shown in the next code listing.

"Parsing" Stage with getopt4j

final CLArgsParser parser = new CLArgsParser(arguments, optionsDefinitions);

The "interrogation" stage in getopt4j is accomplished by retrieving a List<CLOption> via invocation of the method getArguments() on the CLArgsParser instance. Each instance of CLOption can be queried by its getId() method to acquire the parsed parameter by its "short" name ('f' or 'v' in this example). Once the appropriate instance of CLOption has been found via its getId() method, that same instance of CLOption will provide the value associated on the command line with that option via a call to the CLOption's getArgument() method. This "interrogation" process is demonstrated in the next code listing.

"Interrogation" Stage with getopt4j

String filePathAndName = null;
boolean verbose = false;
final List<CLOption> options = parser.getArguments();
for (final CLOption option : options)
{
   switch(option.getId())
   {
      case 'f' :
         filePathAndName = option.getArgument();
         break;
      case 'v' :
         verbose = true;
         break;
   }
}

out.println("File path/name is '" + filePathAndName + "' and verbosity is " + verbose);

The getopt4j library makes it easy to request usage/help information by passing the array of CLOptionDescriptor instances to the static method CLUtil.describeOptions(CLOptionDescriptor[]). This is demonstrated in the next code listing, a couple of lines of code called when it is detected that the file path/name has not been provided.

"Usage" Statement with getopt4j

if (filePathAndName == null)
{
   out.println("ERROR: The file path/name option is required but was not provided.\n\n"
      + CLUtil.describeOptions(optionsDefinitions));
}

The first of the next two screen snapshots depicts the automatically generated "usage" statement that the code is able to invoke when the required "file" option is not specified. The second image depicts various combinations of the "file" and "verbose" long and short option names being used.

There are characteristics of getopt4j to consider when selecting a framework or library to help with command-line parsing in Java.

The getopt4j library provides GNU C getopt()-like functionality and APIs with Java style.

Additional References

Tuesday, October 24, 2017

Java Command-Line Interfaces (Part 27): cli-parser

CLI Parser, originally hosted on and now archived on Google Code, is now available on GitHub. The archived Google Code project page describes CLI Parser as a "very simple to use, very small dependency" that uses annotations to "make very succinct main methods that don't need to know how to parse command line arguments with either fields, properties, or method based injection." The current GitHub project page describes CLI Parser as "a tiny ..., super easy to use library for parsing various kinds of command line arguments or property lists."

CLI Parser expects the "definition" stage to be implemented via the @Argument annotation. This is demonstrated in the next code listing, which provides a simple example defining "file" and "verbose" options as has been done in previous posts in this series. The complete code listing is available on GitHub.

"Definition" Stage with CLI Parser

@Argument(alias="f", description="Path/name of the file", required=true)
private String file;

@Argument(alias="v", description="Verbosity enabled?")
private boolean verbose;

The code shown above defines two options. Each option can be specified with a name matching the field name (file or verbose) or with the specified alias (f or v). With CLI Parser, either case (full field name or alias) is expressed on the command-line with a single hyphen. As shown in the code example, an option can be specified as "required" and description text can be provided to be used in help/usage statements.
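
For readers curious about how this style of annotation-driven field injection can work in general, here is a rough sketch using only the JDK's reflection API. It is not CLI Parser's implementation: the @Opt annotation, the Options class, and the inject method are hypothetical stand-ins that mimic the behavior described above (field name or alias, single hyphen, optional "required" setting), and only String and boolean fields are supported.

```java
import static java.lang.System.out;

import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;

final class AnnotationInjectionSketch
{
   /** Hypothetical stand-in annotation; NOT CLI Parser's @Argument. */
   @Retention(RetentionPolicy.RUNTIME)
   @Target(ElementType.FIELD)
   @interface Opt
   {
      String alias();
      boolean required() default false;
   }

   /** Hypothetical class holding the "file" and "verbose" options. */
   static final class Options
   {
      @Opt(alias = "f", required = true)
      String file;

      @Opt(alias = "v")
      boolean verbose;
   }

   /** Sets each @Opt field from "-fieldName" or "-alias" on the command line. */
   static void inject(final Object target, final String[] arguments)
   {
      for (final Field field : target.getClass().getDeclaredFields())
      {
         final Opt opt = field.getAnnotation(Opt.class);
         if (opt == null)
         {
            continue; // field is not an option
         }
         boolean found = false;
         for (int index = 0; index < arguments.length; index++)
         {
            final String argument = arguments[index];
            final boolean matches = argument.equals("-" + field.getName())
               || argument.equals("-" + opt.alias());
            if (!matches)
            {
               continue;
            }
            found = true;
            try
            {
               field.setAccessible(true);
               if (field.getType() == boolean.class)
               {
                  field.setBoolean(target, true); // presence implies true
               }
               else if (index + 1 < arguments.length)
               {
                  field.set(target, arguments[++index]); // next string is the value
               }
               else
               {
                  throw new IllegalArgumentException(argument + " requires a value.");
               }
            }
            catch (IllegalAccessException accessException)
            {
               throw new IllegalStateException(accessException);
            }
         }
         if (opt.required() && !found)
         {
            throw new IllegalArgumentException(
               "Required option -" + field.getName() + " was not provided.");
         }
      }
   }

   public static void main(final String[] arguments)
   {
      final Options options = new Options();
      inject(options, arguments);
      out.println("File path/name is '" + options.file
         + "' and verbosity is " + options.verbose);
   }
}
```

The appeal of the annotation approach is visible even in this sketch: the "definition" lives on the fields themselves and the main method never touches the String[] directly.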

The "parsing" stage is accomplished in CLI Parser via static functions on its Args class. In this case, I'm using the Args.parseOrExit(Class, String[]) function as shown in the next code listing.

"Parsing" Stage with CLI Parser

final List<String> unparsed = Args.parseOrExit(instance, arguments);

The "interrogation" stage is accomplished by accessing the fields annotated with @Argument as demonstrated in the next code listing.

"Interrogation" Stage with CLI Parser

out.println(
   "File path/name is '" + instance.file + "' and verbosity is " + instance.verbose);

The "definition" code defined the "file" option as "required." If this option is not specified on the command line, CLI Parser automatically prints out a usage statement using the "description" values provided in the respective @Argument annotations. This is shown in the next screen snapshot, which is followed by another screen snapshot indicating combinations of the -file/-f and -verbose/-v options.

There are characteristics of CLI Parser to consider when selecting a framework or library to help with command-line parsing in Java.

  • CLI Parser is open source and available under the Apache License, Version 2.
  • CLI Parser is a small, lightweight library with the cli-parser-1.1.2.jar being approximately 15 KB and having no third-party dependencies.

CLI Parser is, as advertised, a "tiny" and "super easy to use library for parsing various kinds of command line arguments." Its liberal open source Apache license makes it easy for most organizations to acquire and use it.

Additional References

Monday, October 23, 2017

Java Command-Line Interfaces (Part 26): CmdOption

I became aware of the twenty-sixth featured Java-based library in this series on parsing command line arguments because of a Tweet. CmdOption is described on its main GitHub page as "a simple annotation-driven command line parser toolkit for Java 5+ applications that is configured through annotations." The project's subtitle is, "Command line parsing has never been easier."

The annotation @CmdOption is used to annotate fields (or methods) that will contain the parsed command-line arguments. In other words, it is with the @CmdOption annotation that the "definition" stage is accomplished with CmdOption. This is shown in the next code listing.

"Definition" Stage with CmdOption

@CmdOption(names={"--file","-f"}, description="File Path/Name", minCount=1, args={"filePathAndName"})
private String file;

@CmdOption(names={"--verbose","-v"}, description="Is verbosity enabled?", maxCount=0)
private boolean verbose;

As with other posts in this series, the examples used in this post are of options specifying file path and name and a verbosity level. The full source code listing for the example code listings in this post is available on GitHub. As the above code listing shows, the "long" (with double hyphen) and "short" (with single hyphen) option names can be specified with the @CmdOption annotation's names element. The minCount element is used to specify that a particular option must have an argument passed to it and the args element lists the string reference to the argument of an option that will be rendered in the help/usage display. The maxCount element is set to 0 for the verbosity option because no arguments should be provided for that option (presence of -v or --verbose is enough).

The "parsing" stage is accomplished in CmdOption by passing an instance of the class with @CmdOption-annotated fields (or methods) to the constructor of the CmdOption's CmdlineParser class and then passing the String[] representing the command-line arguments to the parse(String[]) method of that instantiated CmdlineParser class.

"Parsing" Stage with CmdOption

final Main instance = new Main();
final CmdlineParser parser = new CmdlineParser(instance);
parser.parse(arguments);

The "interrogation" stage in CmdOption consists simply of accessing the @CmdOption-annotated fields (or methods) on the instance of their containing class that was passed to the CmdlineParser constructor.

"Interrogation" Stage in CmdOption

out.println("File path/name is '" + instance.file + "'.");
out.println("Verbosity level is " + instance.verbose);

CmdOption provides mechanisms to make generation of "help" or "usage" statements easier. If the @CmdOption annotation includes the element isHelp=true, CmdOption won't validate the command-line arguments when the option associated with isHelp=true is specified on the command line. This prevents error messages about missing required options or arguments from being displayed, and the method CmdlineParser.usage() can then be invoked to have CmdOption print out usage/help information. A portion of code demonstrating this is shown next.

"Help" with CmdOption

@CmdOption(names={"--help","-h"}, description = "Display this help message", isHelp=true)
private boolean help;

// ...

if (instance.help)
{
   parser.usage(out);
}

The following three screen snapshots show the above code in action and using CmdOption. The first image depicts two error messages, one when no options are specified (-f/--file is required) and one when the "file" option is specified without an argument. The second image depicts the combinations of short and long option names. The third image shows the usage that is printed when the -h or --help option is specified.

There are characteristics of CmdOption to consider when selecting a framework or library to help with command-line parsing in Java.

  • CmdOption is open source and released under the Apache License, Version 2.0.
  • The de.tototec.cmdoption-0.5.0.jar is approximately 82 KB in size and requires no third-party dependencies.
  • CmdOption 0.5.0 is compiled with "major version: 49", meaning that it's compatible with J2SE 5 applications. Although there are multiple libraries covered in this series that have similar annotations to CmdOption's, this ability to work with an older version of Java may be a differentiator in some cases.
  • CmdOption is still being supported; the version covered in this post (0.5.0) was updated earlier this month (9 October 2017).
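
The "major version: 49" detail mentioned above comes from the header of a compiled .class file, which begins with the four-byte magic number 0xCAFEBABE followed by two-byte minor and major version numbers. The javap tool reports this when run with -v, but the same value can be read directly. The following is a small sketch of that idea; the class name ClassFileVersion and its methods are my own and are not part of CmdOption.

```java
import static java.lang.System.out;

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Paths;

/**
 * Illustrative sketch: reads the major version from the first
 * eight bytes of a compiled Java class file.
 */
final class ClassFileVersion
{
   /** Returns the class file's major version (49 for J2SE 5, 52 for Java 8). */
   static int majorVersion(final byte[] classFileBytes)
   {
      // ByteBuffer is big-endian by default, matching the class file format.
      final ByteBuffer buffer = ByteBuffer.wrap(classFileBytes);
      if (classFileBytes.length < 8 || buffer.getInt() != 0xCAFEBABE)
      {
         throw new IllegalArgumentException("Not a Java class file.");
      }
      buffer.getShort();                 // skip minor version
      return buffer.getShort() & 0xFFFF; // major version is unsigned
   }

   /** Maps major version to Java release for J2SE 5 (49) and later. */
   static String javaRelease(final int majorVersion)
   {
      return majorVersion >= 49
         ? String.valueOf(majorVersion - 44)
         : "earlier than J2SE 5 (major version " + majorVersion + ")";
   }

   public static void main(final String[] arguments) throws IOException
   {
      if (arguments.length < 1)
      {
         out.println("\tUSAGE: ClassFileVersion <pathToClassFile>");
         return;
      }
      final int major = majorVersion(Files.readAllBytes(Paths.get(arguments[0])));
      out.println("major version: " + major + " (Java " + javaRelease(major) + ")");
   }
}
```

To check a library this way, I would extract one of its .class files from the JAR and pass that file's path to the sketch above.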

CmdOption is an easy-to-use library for parsing command-line options in Java. It comes with a liberal open source license and has received recent updates.

Additional References

Thursday, October 19, 2017

Java Command-Line Interfaces (Part 25): JCommando

JCommando is described on the JCommando site as "a Java argument parser for command-line parameters." JCommando reads XML configuration to generate a Java class that handles parsing from a Java application. The only Java-based library previously covered in this series of posts on Java command-line parsing libraries that provided XML configuration is JSAP, but it's a secondary form of configuration with that library and I did not cover XML configuration in my post on JSAP.

Because JCommando uses XML to specify command line options to be parsed, the "definition" stage with JCommando is accomplished via XML specification. As with the previous posts in this series, the examples in this post are based on command line options for file path and name and verbosity and their definition in JCommando-compliant XML is shown in the next code listing (options.xml).

JCommando via XML Portion of "Definition" Stage: options.xml

<jcommando>
   <option id="file" long="file" short="f" type="String">
      <description>Path and name of file</description>
   </option>
   <option id="verbose" long="verbose" short="v">
      <description>Verbosity enabled</description>
   </option>
   <commandless id="execute" allow-optionless="true">
      <or>
         <option-ref id="file" />
      </or>
   </commandless>
</jcommando>

JCommando uses the XML file as input and, based on that XML, generates a Java source code file that parses the options specified in the XML. There are two ways to instruct JCommando to parse this XML and use the details to generate Java source code. One way is to use the jcomgen executable provided with the JCommando distribution (in its bin directory). The second approach for generating a Java class from the XML is the approach shown here: using Apache Ant and a JCommando-provided Ant task. This is demonstrated in the next XML/Ant listing.

Ant Target for Generating Source from XML with JCommando

  <target name="generateSourceForJCommando"
          description="Generate command line parsing source code that uses JCommando">
    <taskdef name="jcommando" classname="org.jcommando.ant.JCommando">
      <classpath>
        <pathelement location="C:\lib\jcommando-1.2\lib\jcommando.jar"/>
      </classpath>
    </taskdef>

    <jcommando inputfile="jcommando/options.xml"
               classname="MainParser"
               destdir="src"
               packagename="examples.dustin.commandline.jcommando"/>
  </target>

The above Ant target shows how JCommando allows the input XML file (options.xml) to be specified as the "inputfile" and that the generated Java source code file will be placed in the src directory in a subdirectory structure matching the designated package "examples.dustin.commandline.jcommando". The execution of the Ant target and source code generation is shown in the next screen snapshot.

The result of this Ant target is the generated Java source class MainParser.java whose listing is shown next.

Generated Java Source Class MainParser.java

/*
 * THIS IS A GENERATED FILE.  DO NOT EDIT.
 *
 * JCommando (http://jcommando.sourceforge.net)
 */

package examples.dustin.commandline.jcommando;

import org.jcommando.Command;
import org.jcommando.JCommandParser;
import org.jcommando.Option;
import org.jcommando.Grouping;
import org.jcommando.And;
import org.jcommando.Or;
import org.jcommando.Xor;
import org.jcommando.Not;

/**
 * JCommando generated parser class.
 */
public abstract class MainParser extends JCommandParser
{
   /**
     * JCommando generated constructor.
     */
   public MainParser()
   {
      Option file = new Option();
      file.setId("file");
      file.setShortMnemonic("f");
      file.setLongMnemonic("file");
      file.setDescription("Path and name of file");
      addOption(file);

      Option verbose = new Option();
      verbose.setId("verbose");
      verbose.setShortMnemonic("v");
      verbose.setLongMnemonic("verbose");
      verbose.setDescription("Verbosity enabled");
      addOption(verbose);

      Command execute = new Command();
      execute.setName("commandless");
      execute.setId("execute");
      execute.addOption(file);
      execute.setGrouping( createExecuteGrouping() );
      addCommand(execute);

   }

   /**
     * Called by parser to set the 'file' property.
     *
     * @param file the value to set.
     */
   public abstract void setFile(String file);

   /**
     * Called by parser to set the 'verbose' property.
     *
     */
   public abstract void setVerbose();

   /**
     * Called by parser to perform the 'execute' command.
     *
     */
   public abstract void doExecute();

   /**
    * Generate the grouping for the 'execute' command.
    */
   private Grouping createExecuteGrouping()
   {
      Or or1 = new Or();
      or1.addOption(getOptionById("file"));
      return or1;
   }
}

With the Java source code generated, we now have our options definitions. A custom class is written to extend the generated MainParser and to access its parent for parsing. This is demonstrated in the next code listing of the custom written Main class that extends the generated MainParser class.

Custom Class Extending Generated Class

package examples.dustin.commandline.jcommando;

import static java.lang.System.out;

/**
 * Demonstrates JCommando-based parsing of command-line
 * arguments from Java code.
 */
public class Main extends MainParser
{
   private String file;
   private boolean verbose;

   @Override
   public void setFile(final String newFilePathAndName)
   {
      file = newFilePathAndName;
   }

   @Override
   public void setVerbose()
   {
      verbose = true;
   }

   public static void main(final String[] arguments)
   {
      final Main instance = new Main();
      instance.parse(arguments);
   }

   /**
    * Called by parser to execute the 'command'.
    */
   public void doExecute()
   {
      out.println("File path/name is " + file + " and verbosity is " + verbose);
   }
}

As shown in the custom Main.java source code shown above, the "parsing" stage is accomplished in JCommando via execution of the parse(String[]) method inherited from the class that JCommando generated based on the configuration XML (and that generated class gets its definition of that parse method from its parent JCommandParser class).

The custom class that extends the generated class needed to have the "set" methods for the options implemented. With these properly implemented, the "interrogation" stage in JCommando-based applications is as simple as accessing the fields set by those custom implemented "set" methods. This was demonstrated in the doExecute() method shown in the last code listing. That doExecute method was generated as an abstract method in the generated parent class because of the specification of the <commandless> element with id of "execute" in the configuration XML.

The JCommandParser class that the custom class ultimately extends has a method printUsage() that can be used to write "help"/"usage" output to standard output. This can be seen in the source code for Main.java available on GitHub.

The next two screen snapshots demonstrate execution of the sample code discussed in this post. The first screen snapshot shows the "usage" information that can be automatically printed, in this case when the required "file" option was not specified. The second screen snapshot demonstrates the combinations of long and short option names for the "file" and "verbose" options.

The steps involved with using JCommando that have been discussed in this blog post are summarized here.

  1. Define options in XML file.
  2. Generate Java parser source code from XML using one of two approaches.
    • Use jcomgen tool provided in JCommando's bin directory.
    • Use Ant target with JCommando-provided Ant task as demonstrated in this post.
  3. Write Java class that extends generated parser class.

There are characteristics of JCommando to consider when selecting a framework or library to help with command-line parsing in Java.

  • JCommando is open source and available under the zlib/libpng License (Zlib).
  • The jcommando.jar JAR is approximately 27 KB in size and there is no third-party dependency.
  • Defining options in JCommando via XML is a different approach than the other libraries covered in this series, but what I find more interesting about JCommando's options definition is the easy ability to express relationships between options such as "and", "or", "xor", and nested combinations of these.
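
To make the idea of "and", "or", "xor", and nested relationships between options more concrete, here is a small plain-Java sketch of how such rules can be composed and evaluated against the set of options found on the command line. This is not JCommando's implementation or API: the class OptionGroupingSketch, its methods, and the "quiet" option are hypothetical, and treating "xor" as "exactly one of the operands" is my own assumption.

```java
import static java.lang.System.out;

import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.function.Predicate;

/**
 * Illustrative sketch: composable and/or/xor rules over the set of
 * option IDs that were present on the command line.
 */
final class OptionGroupingSketch
{
   /** Rule satisfied when the option with the given ID is present. */
   static Predicate<Set<String>> option(final String id)
   {
      return present -> present.contains(id);
   }

   /** Rule satisfied when ALL nested rules are satisfied. */
   @SafeVarargs
   static Predicate<Set<String>> and(final Predicate<Set<String>>... rules)
   {
      return present -> Arrays.stream(rules).allMatch(rule -> rule.test(present));
   }

   /** Rule satisfied when AT LEAST ONE nested rule is satisfied. */
   @SafeVarargs
   static Predicate<Set<String>> or(final Predicate<Set<String>>... rules)
   {
      return present -> Arrays.stream(rules).anyMatch(rule -> rule.test(present));
   }

   /** Rule satisfied when EXACTLY ONE nested rule is satisfied (assumption). */
   @SafeVarargs
   static Predicate<Set<String>> xor(final Predicate<Set<String>>... rules)
   {
      return present
         -> Arrays.stream(rules).filter(rule -> rule.test(present)).count() == 1;
   }

   public static void main(final String[] arguments)
   {
      // "file" is required, along with exactly one of "verbose" or "quiet".
      final Predicate<Set<String>> rule
         = and(option("file"), xor(option("verbose"), option("quiet")));
      final Set<String> present = new HashSet<>(Arrays.asList(arguments));
      out.println("Options " + present + " satisfy the rule: " + rule.test(present));
   }
}
```

Running this sketch with the option IDs as arguments (for example, "file verbose") reports whether that combination satisfies the nested rule.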

JCommando implements some novel concepts in terms of Java-based command line options parsing. It requires XML configuration of the potential command line options, but makes it easy to establish relationships between those options. JCommando generates Java source from the XML options configuration and a custom parsing class extends that generated class. JCommando is also the first of the libraries covered in this series to use the Zlib license.

Additional References