Saturday, December 15, 2012

Groovy: Multiple Values for a Single Command-line Option

One of the many features that make Groovy an attractive scripting language is its built-in command-line argument support via CliBuilder. I have written about CliBuilder before in the posts Customizing Groovy's CliBuilder Usage Statements and Explicitly Specifying 'args' Property with Groovy CliBuilder. In this post, I look at CliBuilder's support for multiple values passed via a single command-line flag.

The Groovy API Documentation includes this sentence about CliBuilder:

Note the use of some special notation. By adding 's' onto an option that may appear multiple times and has an argument or as in this case uses a valueSeparator to separate multiple argument values causes the list of associated argument values to be returned.

As this documentation states, Groovy's built-in CliBuilder support allows a parsed command-line flag to be treated as having multiple values. The convention for referencing these values is to append an "s" to the "short" name of the command-line option. Doing so makes the values associated with that single flag available as a collection of Strings that can be easily iterated.
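The convention can be seen in isolation with a minimal sketch. It assumes Groovy 2.5 or later, where the Commons CLI flavor of CliBuilder lives in groovy.cli.commons (older releases auto-import it from groovy.util, so the first import can be dropped there). The usage text, the option description, and the directory names lib, build, and dist are made up for illustration.

```groovy
// Assumption: Groovy 2.5+ with the groovy-cli-commons module on the classpath.
import groovy.cli.commons.CliBuilder
import org.apache.commons.cli.Option

def cli = new CliBuilder(usage: 'demo.groovy -d <directories>')
// One flag that accepts any number of comma-separated values.
cli.d(longOpt: 'directories', args: Option.UNLIMITED_VALUES, valueSeparator: ',',
      'Comma-separated directories')

// Simulated command line; the directory names are hypothetical.
def opt = cli.parse('-d lib,build,dist'.split())

def firstOnly = opt.d    // short name alone: only the first value
def allValues = opt.ds   // short name plus 's': every value, as a list

println "opt.d  -> ${firstOnly}"
println "opt.ds -> ${allValues}"
```

The contrast between the two accessors is the point here: without the trailing "s", only the first of the supplied values is returned.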

In the post Customizing Groovy's CliBuilder Usage Statements, I briefly looked at the feature supporting multiple values passed to the script via a single command line argument. I described the feature in that post as follows:

The use of multiple values for a single argument can also be highly useful. The direct use of Apache Commons CLI's Option class (and specifically its UNLIMITED_VALUES constant field) allows the developer to communicate to CliBuilder that there is a variable number of values that need to be parsed for this option. The character that separates these multiple values (a comma in this example) must also be specified via "valueSeparator."
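To show that it is the valueSeparator setting, rather than anything special about commas, that drives the splitting, here is a small sketch using a colon instead. The option name 'p', its description, and the sample values are hypothetical, and the import of groovy.cli.commons.CliBuilder again assumes Groovy 2.5 or later.

```groovy
// Assumption: Groovy 2.5+ with the groovy-cli-commons module on the classpath.
import groovy.cli.commons.CliBuilder
import org.apache.commons.cli.Option

def cli = new CliBuilder(usage: 'demo.groovy -p <paths>')
// Hypothetical option 'p' whose values are split on ':' rather than ','.
cli.p(longOpt: 'paths', args: Option.UNLIMITED_VALUES, valueSeparator: ':',
      'Colon-separated paths')

// The last value contains a comma, which is NOT the configured separator.
def opt = cli.parse('-p lib:build:a,b'.split())

def paths = opt.ps   // 'p' plus 's' returns all values for the flag
paths.each { println it }
```

Because only the colon is registered as the separator, the comma in the final value is left untouched and "a,b" comes back as a single entry.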

The usefulness of this Apache CLI-powered Groovy feature can be demonstrated by adapting a script for finding class files contained in JAR files that I talked about in the post Searching JAR Files with Groovy. The script in that post searched one directory recursively for a single specified String contained as an entry in the searched JARs. A few minor tweaks to this script change it so that it can recursively search multiple specified directories for multiple expressions.

The revised script is shown next.

#!/usr/bin/env groovy

/**
 * findClassesInJars.groovy
 *
 * findClassesInJars.groovy -d <<root_directories>> -s <<strings_to_search_for>>
 *
 * Script that looks for provided Strings in JAR files (assumed to have .jar
 * extensions) in the provided directories and all of their subdirectories.
 */

def cli = new CliBuilder(
   usage: 'findClassesInJars.groovy -d <root_directories> -s <strings_to_search_for>',
   header: '\nAvailable options (use -h for help):\n',
   footer: '\nInformation provided via above options is used to generate printed string.\n')
import org.apache.commons.cli.Option
cli.with
{
   h(longOpt: 'help', 'Help', args: 0, required: false)
   d(longOpt: 'directories', 'Directories to search, separated by commas', args: Option.UNLIMITED_VALUES, valueSeparator: ',', required: true)
   s(longOpt: 'strings', 'Strings (class names) to search for in JARs', args: Option.UNLIMITED_VALUES, valueSeparator: ',', required: true)
}
def opt = cli.parse(args)
if (!opt) return
if (opt.h) cli.usage()

def directories = opt.ds
def stringsToSearchFor = opt.ss

import java.util.zip.ZipFile
import java.util.zip.ZipException

def matches = new TreeMap<String, Set<String>>()
directories.each
{ directory ->
   def dir = new File(directory)
   stringsToSearchFor.each
   { stringToFind ->
      dir.eachFileRecurse
      { file->
         if (file.isFile() && file.name.endsWith(".jar"))
         {
            try
            {
               def zip = new ZipFile(file)
               def entries = zip.entries()
               entries.each
               { entry->
                  if (entry.name.contains(stringToFind))
                  {
                     def pathPlusMatch = "${file.canonicalPath} [${entry.name}]"
                     if (matches.get(stringToFind))
                     {
                        matches.get(stringToFind).add(pathPlusMatch)
                     }
                     else
                     {
                        def containingJars = new TreeSet<String>()
                        containingJars.add(pathPlusMatch)
                        matches.put(stringToFind, containingJars)
                     }
                  }
               }
            }
            catch (ZipException zipEx)
            {
               println "Unable to open file ${file.name}"
            }
         }
      }
   }
}

matches.each
{ searchString, containingJarNames ->
   println "String '${searchString}' Found:"
   containingJarNames.each
   { containingJarName ->
      println "\t${containingJarName}"
   }
}

Lines 11 through 28 are where Groovy's built-in CliBuilder is applied. The "directories" (short name 'd') and "strings" (short name 's') command-line flags are set up in lines 20 and 21. Those lines use Option.UNLIMITED_VALUES to allow any number of values for each flag, and they use valueSeparator to specify the character separating those values (a comma in both cases).

Lines 27-28 obtain the multiple values for each argument. Although the options had short names of 'd' and 's', appending 's' to each of them (now 'ds' and 'ss') allows their multiple values to be accessed. The rest of the script takes advantage of these and iterates over the multiple strings associated with each flag.
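The nested iteration over both flags can be tried without any JAR files on hand. The following is only a sketch of that pattern: it parses a simulated command line and records each directory/string pairing instead of opening archives. The directory names, search strings, and the "searches" list are hypothetical, and the CliBuilder import assumes Groovy 2.5 or later.

```groovy
// Assumption: Groovy 2.5+ with the groovy-cli-commons module on the classpath.
import groovy.cli.commons.CliBuilder
import org.apache.commons.cli.Option

def cli = new CliBuilder(usage: 'demo.groovy -d <directories> -s <strings>')
cli.with {
   d(longOpt: 'directories', args: Option.UNLIMITED_VALUES, valueSeparator: ',', 'Directories to search')
   s(longOpt: 'strings', args: Option.UNLIMITED_VALUES, valueSeparator: ',', 'Strings to search for')
}

// Simulated command line with two values per flag (all values hypothetical).
def opt = cli.parse('-d lib,dist -s Logger,StringUtils'.split())

// Same nesting as the script: every search string is tried in every directory.
def searches = []
opt.ds.each { directory ->
   opt.ss.each { stringToFind ->
      searches << "${directory}:${stringToFind}".toString()
   }
}
searches.each { println it }
```

Two directories and two search strings yield four pairings, which is why the full script's work grows with the product of the two lists.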

The next screen snapshot demonstrates the above script being executed.

The above screen snapshot demonstrates the utility of being able to provide multiple values for a single command-line flag. Groovy's built-in support for Apache CLI makes it easy to employ customizable command-line parsing.
