Monday, December 23, 2013

Determining Presence of Characters or Integers in String with Guava CharMatcher and Apache Commons Lang StringUtils

A recent Reddit post asked, "Is there a predefined method for checking if a variable value contains a particular character or integer?" The same question was also asked a different way: "A method or quick way for checking if a variable contains any numbers say or ('x',2,'B') like a list?" I am not aware of any single method call within the standard SDK libraries that does this (other than using a carefully designed regular expression), but in this post I answer those questions using Guava's CharMatcher and Apache Commons Lang's StringUtils class.

Java's String class does have a contains method that can be used to determine whether a single character (passed as a one-character String) or an explicitly specified sequence of characters is contained in that String. However, I'm not aware of any way in a single executable statement (not counting regular expressions) to ask Java whether a given String contains any of a specified set of characters, without requiring it to contain all of them or to contain them in the specified order. Both Guava and Apache Commons Lang provide mechanisms for just this thing.
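For comparison, the regular expression route set aside above might look something like the following minimal sketch, which uses only the JDK. The class name RegexContainsAnyDemo and the helper containsAnyViaRegex are hypothetical names chosen for illustration; they are not part of the SDK, Guava, or Apache Commons Lang.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names here are illustrative only.
final class RegexContainsAnyDemo
{
   /**
    * Returns true if the candidate contains at least one of the given
    * characters. Each character is passed through Pattern.quote so that
    * regex metacharacters such as '.' are matched literally.
    */
   static boolean containsAnyViaRegex(final String candidate, final char... searchChars)
   {
      if (candidate == null || searchChars == null || searchChars.length == 0)
      {
         return false;
      }
      final StringBuilder alternation = new StringBuilder();
      for (final char searchChar : searchChars)
      {
         if (alternation.length() > 0)
         {
            alternation.append('|');
         }
         alternation.append(Pattern.quote(String.valueOf(searchChar)));
      }
      // find() succeeds as soon as any one alternative occurs anywhere.
      return Pattern.compile(alternation.toString()).matcher(candidate).find();
   }

   public static void main(final String[] arguments)
   {
      final String candidate = "Inspired by Actual Events";
      System.out.println(containsAnyViaRegex(candidate, 'd', 'Q'));  // true
      System.out.println(containsAnyViaRegex(candidate, 'Q'));       // false
   }
}
```

The amount of code needed just to build a safe pattern is a fair illustration of why a ready-made library method is attractive here.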

Apache Commons Lang (version 3.1 used in this post) provides overloaded StringUtils.containsAny methods that easily accomplish this request. Both overloaded versions expect the first parameter passed to them to be the String (or more precisely, the CharSequence) to be tested to see if it contains a given letter or integer. The first overloaded version, StringUtils.containsAny(CharSequence, char...) accepts zero or more char elements to be tested to see if any of them are in the String represented by the first argument. The second overloaded version, StringUtils.containsAny(CharSequence, CharSequence) expects the second argument to contain all the potential characters to be searched for in the first argument as a single sequence of characters.
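To make the "any of, in any order" semantics of the two overloads concrete, here is a plain-JDK sketch with the same two method shapes. This is an assumption-laden illustration and not the Commons Lang source: ContainsAnySketch is a hypothetical class, and unlike the real library it makes no attempt to handle supplementary (surrogate pair) characters.

```java
// Hypothetical plain-JDK sketch; NOT the Apache Commons Lang implementation.
final class ContainsAnySketch
{
   /** Varargs form: true if any one of searchChars occurs in candidate. */
   static boolean containsAny(final CharSequence candidate, final char... searchChars)
   {
      if (candidate == null || searchChars == null)
      {
         return false;
      }
      for (int index = 0; index < candidate.length(); index++)
      {
         final char current = candidate.charAt(index);
         for (final char searchChar : searchChars)
         {
            if (current == searchChar)
            {
               return true;  // one hit is enough; order is irrelevant
            }
         }
      }
      return false;
   }

   /** CharSequence form: the second argument is treated as a set of characters. */
   static boolean containsAny(final CharSequence candidate, final CharSequence searchChars)
   {
      return searchChars != null
          && containsAny(candidate, searchChars.toString().toCharArray());
   }

   public static void main(final String[] arguments)
   {
      final String candidate = "Inspired by Actual Events";
      System.out.println(containsAny(candidate, 'd', 'A'));     // true
      System.out.println(containsAny(candidate, "QdZ"));        // true ('d' is present)
      System.out.println(containsAny(candidate, "0123456789")); // false
   }
}
```

Passing zero search characters to the varargs form simply returns false in this sketch, since there is nothing that could match.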

The following code listing demonstrates using this Apache Commons Lang approach to determine if a given string contains certain characters. All three statements will pass their assertions because "Inspired by Actual Events" does include 'd' and 'A', but not 'Q'. Because it is only necessary for any one of the provided characters to be present to return true, the first two assertions of true pass. The third assertion passes because the string does NOT contain the only provided letter and so the negative is asserted.

Determining String Contains A Character with StringUtils
private static void demoStringContainingLetterInStringUtils()
{
   assert StringUtils.containsAny("Inspired by Actual Events", 'd', 'A');  // true: both contained
   assert StringUtils.containsAny("Inspired by Actual Events", 'd', 'Q');  // true: one contained
   assert !StringUtils.containsAny("Inspired by Actual Events", 'Q');      // true: none contained (!)
}

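Guava's CharMatcher can also be used in a similar manner, as demonstrated in the next code listing. Note that the listing builds the matcher from the string being tested and then asks whether any of the sought characters match it. Because the question is symmetric, building the matcher from the sought characters and calling matchesAnyOf on the tested string would give the same answers.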

Determining String Contains A Character with CharMatcher
private static void demoStringContainingLetterInGuava()
{
   assert CharMatcher.anyOf("Inspired by Actual Events").matchesAnyOf(new String(new char[]{'d', 'A'}));
   assert CharMatcher.anyOf("Inspired by Actual Events").matchesAnyOf(new String (new char[] {'d', 'Q'}));
   assert !CharMatcher.anyOf("Inspired by Actual Events").matchesAnyOf(new String(new char[]{'Q'}));
}

What if we specifically want to make sure at least one character in a given String/CharSequence is a numeric digit, but we cannot be guaranteed that the entire string is numeric? The same approach as used above with Apache Commons Lang's StringUtils can be applied here, with the only change being that the provided characters to be matched are the numeric digits 0 through 9. This is shown in the next code listing.

Determining String Contains a Numeral with StringUtils
private static void demoStringContainingNumericDigitInStringUtils()
{
   assert !StringUtils.containsAny("Inspired by Actual Events", "0123456789");
   assert StringUtils.containsAny("Inspired by Actual Events 2013", "0123456789");
}

Guava's CharMatcher has a really slick way of expressing this question of whether a provided sequence of characters includes at least one numeral. This is shown in the next code listing.

Determining String Contains a Numeral with CharMatcher
private static void demoStringContainingNumericDigitInGuava()
{
   assert !CharMatcher.DIGIT.matchesAnyOf("Inspired by Actual Events");
   assert CharMatcher.DIGIT.matchesAnyOf("Inspired by Actual Events 2013");
}

CharMatcher.DIGIT provides a concise and expressive approach to specifying that we want to match a digit. Fortunately, CharMatcher provides numerous other public fields similar to DIGIT for convenience in determining if strings contain other types of characters.
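One subtlety is worth a sketch: Guava documents CharMatcher.DIGIT as matching digits according to Unicode, whereas the "0123456789" set used with StringUtils covers only the ASCII digits. The plain-JDK sketch below shows how those two notions can differ. DigitCheckSketch and its methods are hypothetical names, and Character.isDigit is the JDK's own idea of a Unicode digit, which is not guaranteed to accept exactly the same characters as CharMatcher.DIGIT.

```java
// Hypothetical plain-JDK sketch contrasting Unicode digits with ASCII digits.
final class DigitCheckSketch
{
   /** True if any char in candidate is a digit per Character.isDigit (Unicode-aware). */
   static boolean containsUnicodeDigit(final CharSequence candidate)
   {
      for (int index = 0; index < candidate.length(); index++)
      {
         if (Character.isDigit(candidate.charAt(index)))
         {
            return true;
         }
      }
      return false;
   }

   /** True only if candidate contains one of the ASCII digits '0' through '9'. */
   static boolean containsAsciiDigit(final CharSequence candidate)
   {
      for (int index = 0; index < candidate.length(); index++)
      {
         final char current = candidate.charAt(index);
         if (current >= '0' && current <= '9')
         {
            return true;
         }
      }
      return false;
   }

   public static void main(final String[] arguments)
   {
      final String arabicIndicThree = "\u0663";  // ARABIC-INDIC DIGIT THREE
      System.out.println(containsUnicodeDigit("Inspired by Actual Events 2013")); // true
      System.out.println(containsUnicodeDigit(arabicIndicThree));                 // true
      System.out.println(containsAsciiDigit(arabicIndicThree));                   // false
   }
}
```

If only 0 through 9 should count, an explicit range or character set states that intent more precisely than a Unicode-wide digit test.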

For completeness, I have included the single class containing all of the above examples in the next code listing. This class's main() function can be run with the -enableassertions (or -ea) flag set on the Java launcher and will complete without any AssertionErrors.

StringContainsDemonstrator.java
package dustin.examples.strings;

import com.google.common.base.CharMatcher;
import static java.lang.System.out;

import org.apache.commons.lang3.StringUtils;

/**
 * Demonstrate Apache Commons Lang StringUtils and Guava's CharMatcher support
 * for determining if a particular character or set of characters or integers
 * is contained within a given String.
 * 
 * This class's tests depend on asserts being enabled, so specify the JVM option
 * -enableassertions (-ea) when running this example.
 * 
 * @author Dustin
 */
public class StringContainsDemonstrator
{
   private static final String CANDIDATE_STRING = "Inspired by Actual Events";
   private static final String CANDIDATE_STRING_WITH_NUMERAL = CANDIDATE_STRING + " 2013";
   private static final char FIRST_CHARACTER = 'd';
   private static final char SECOND_CHARACTER = 'A';
   private static final String CHARACTERS = new String(new char[]{FIRST_CHARACTER, SECOND_CHARACTER});
   private static final char NOT_CONTAINED_CHARACTER = 'Q';
   private static final String NOT_CONTAINED_CHARACTERS = new String(new char[]{NOT_CONTAINED_CHARACTER});
   private static final String MIXED_CONTAINED_CHARACTERS = new String (new char[] {FIRST_CHARACTER, NOT_CONTAINED_CHARACTER});
   private static final String NUMERIC_CHARACTER_SET = "0123456789";

   private static void demoStringContainingLetterInGuava()
   {
      assert CharMatcher.anyOf(CANDIDATE_STRING).matchesAnyOf(CHARACTERS);
      assert CharMatcher.anyOf(CANDIDATE_STRING).matchesAnyOf(MIXED_CONTAINED_CHARACTERS);
      assert !CharMatcher.anyOf(CANDIDATE_STRING).matchesAnyOf(NOT_CONTAINED_CHARACTERS);
   }

   private static void demoStringContainingNumericDigitInGuava()
   {
      assert !CharMatcher.DIGIT.matchesAnyOf(CANDIDATE_STRING);
      assert CharMatcher.DIGIT.matchesAnyOf(CANDIDATE_STRING_WITH_NUMERAL);
   }

   private static void demoStringContainingLetterInStringUtils()
   {
      assert StringUtils.containsAny(CANDIDATE_STRING, FIRST_CHARACTER, SECOND_CHARACTER);
      assert StringUtils.containsAny(CANDIDATE_STRING, FIRST_CHARACTER, NOT_CONTAINED_CHARACTER);
      assert !StringUtils.containsAny(CANDIDATE_STRING, NOT_CONTAINED_CHARACTER);
   }

   private static void demoStringContainingNumericDigitInStringUtils()
   {
      assert !StringUtils.containsAny(CANDIDATE_STRING, NUMERIC_CHARACTER_SET);
      assert StringUtils.containsAny(CANDIDATE_STRING_WITH_NUMERAL, NUMERIC_CHARACTER_SET);
   }

   /**
    * Indicate whether assertions are enabled.
    * 
    * @return {@code true} if assertions are enabled or {@code false} if
    *    assertions are not enabled (are disabled).
    */
   private static boolean areAssertionsEnabled()
   {
      boolean enabled = false; 
      assert enabled = true;
      return enabled;
   }

   /**
    * Main function for running methods to demonstrate Apache Commons Lang
    * StringUtils and Guava's CharMatcher support for determining if a particular
    * character or set of characters or integers is contained within a given
    * String.
    * 
    * @param args Command line arguments; none expected.
    */
   public static void main(String[] args)
   {
      if (!areAssertionsEnabled())
      {
         out.println("This class cannot demonstrate anything without assertions enabled.");
         out.println("\tPlease re-run with assertions enabled (-ea).");
         System.exit(-1);
      }

      out.println("Beginning demonstrations...");
      demoStringContainingLetterInGuava();
      demoStringContainingLetterInStringUtils();
      demoStringContainingNumericDigitInGuava();
      demoStringContainingNumericDigitInStringUtils();
      out.println("...Demonstrations Ended");
   }
}
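Because the class above demonstrates nothing unless the JVM is started with -ea, one alternative is to replace the assert statements with explicit checks that throw regardless of that flag. The following is a minimal sketch of that idea only. ExplicitCheckSketch and its verify method are hypothetical, and the checks use String.indexOf from the JDK rather than either library so that the sketch stays self-contained.

```java
// Hypothetical sketch: explicit checks that do not depend on the -ea flag.
final class ExplicitCheckSketch
{
   /** Throws IllegalStateException if condition is false, whether or not -ea is set. */
   static void verify(final boolean condition, final String description)
   {
      if (!condition)
      {
         throw new IllegalStateException("Check failed: " + description);
      }
   }

   public static void main(final String[] arguments)
   {
      final String candidate = "Inspired by Actual Events";
      verify(candidate.indexOf('d') >= 0, "candidate contains 'd'");
      verify(candidate.indexOf('Q') < 0, "candidate does not contain 'Q'");
      System.out.println("All checks passed.");
   }
}
```

The trade-off is that such checks always run, whereas assert statements cost nothing when assertions are disabled.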

Guava and Apache Commons Lang are very popular with Java developers because they provide commonly needed methods that the SDK does not. In this post, I looked at how Guava's CharMatcher and Apache Commons Lang's StringUtils can be used to concisely but expressively determine whether any of a set of specified characters exists within a provided string.
