Saturday, October 24, 2009

Java SourceVersion and Character

The SourceVersion class provides information on Java source versions and can reveal some interesting details, including a demonstration of terminology covered in the Java Language Specification. In this blog posting, I look briefly at some of the more interesting observations one can make using this class, which was introduced with Java SE 6, in conjunction with the long-standing Character class.

The SourceVersion class provides several handy static methods for ascertaining details about the source version of the current Java runtime. The methods SourceVersion.latest() and SourceVersion.latestSupported() provide details regarding the latest source versions that can be "modeled" and "fully supported" respectively.

The following code snippet demonstrates these two methods in action.

    // Assumes: import static java.lang.System.out;
    //          import javax.lang.model.SourceVersion;
    out.println("Latest source version that can be modeled: ");
    out.println("\tSourceVersion.latest(): " + SourceVersion.latest());
    out.println("Latest source version fully supported by current execution environment ");
    out.println("\tSourceVersion.latestSupported(): " + SourceVersion.latestSupported());


The output from running this code is shown next.



As the code example and corresponding output above indicate, the latest modeled version and the latest fully supported version are easily accessible. Although the SourceVersion class was introduced with Java SE 6, it has been built to support future versions of Java. Not only does the Javadoc documentation state that "additional source version constants will be added to model future releases of the language," but the SourceVersion.values() method also provides all supported version enums. A code example and associated output are shown next to demonstrate this method in action.

    out.println("SourceVersion enum Values:");
    final SourceVersion[] versions = SourceVersion.values();
    for (final SourceVersion version : versions)
    {
       out.println("\t" + version);
    }




The Javadoc documentation tells us the meanings of the various enum values shown in the above output. Each represents a different "source version of the Java programming language" and the platform version it is associated with. RELEASE_6 is associated with Java SE 6, RELEASE_5 with J2SE 5, RELEASE_4 with JDK 1.4, RELEASE_3 with JDK 1.3, RELEASE_2 with JDK 1.2, RELEASE_1 with JDK 1.1, and RELEASE_0 with "the original version." The Javadoc documentation for Java SE 7 indicates that SourceVersion.RELEASE_7 is supported in Java SE 7.
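Because the enum constants are declared from oldest to newest release, they can also be compared with the ordinary Enum.compareTo(). The following is only a minimal sketch of that idea: the class name and the supportsAtLeast helper are my own hypothetical names, and the approach assumes that future constants continue to be added in release order.

```java
import javax.lang.model.SourceVersion;

public class SourceVersionComparison
{
   /**
    * Hypothetical helper: indicates whether the current execution
    * environment fully supports at least the provided source version.
    * Assumes the enum constants are declared in release order.
    */
   public static boolean supportsAtLeast(final SourceVersion minimum)
   {
      return SourceVersion.latestSupported().compareTo(minimum) >= 0;
   }

   public static void main(final String[] arguments)
   {
      System.out.println(
         "Supports RELEASE_5 or later? " + supportsAtLeast(SourceVersion.RELEASE_5));
      System.out.println(
         "Supports RELEASE_6 or later? " + supportsAtLeast(SourceVersion.RELEASE_6));
   }
}
```

Any runtime that provides SourceVersion at all should report true for both checks, because the class itself arrived with Java SE 6.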

The SourceVersion class provides three static methods that each indicate whether a provided CharSequence is an identifier, a name, or a keyword: SourceVersion.isIdentifier(), SourceVersion.isName(), and SourceVersion.isKeyword().

Using these methods allows one to determine whether a particular string is reserved as a keyword, whether it is even a syntactically valid identifier, and whether a string that is a valid identifier is also not a keyword and is thus a valid name. The isName() method returns true for a "syntactically valid name" that is not also a keyword or literal. The isKeyword() method indicates whether the provided string is a keyword or literal of the Java language.
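One difference between isIdentifier() and isName() that is easy to miss is that isName() also accepts dotted (qualified) names, while isIdentifier() deals only with simple names. The short sketch below illustrates this; the class name and the describe helper are hypothetical names of my own choosing.

```java
import javax.lang.model.SourceVersion;

public class QualifiedNameCheck
{
   /** Hypothetical helper: summarizes how SourceVersion classifies the text. */
   public static String describe(final CharSequence text)
   {
      return "'" + text + "': identifier=" + SourceVersion.isIdentifier(text)
           + ", keyword=" + SourceVersion.isKeyword(text)
           + ", name=" + SourceVersion.isName(text);
   }

   public static void main(final String[] arguments)
   {
      // A simple name: both an identifier and a name.
      System.out.println(describe("String"));
      // A qualified name: not a single identifier, but still a valid name.
      System.out.println(describe("java.lang.String"));
      // A qualified name with a keyword segment: not a valid name.
      System.out.println(describe("java.lang.class"));
   }
}
```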

I ran many different strings, covering various combinations of these three categories, through the following code.

    public static void printIdentifierTest(final String stringToBeTested)
    {
       out.println(
            "Is '" + stringToBeTested + "' an identifier? "
          + SourceVersion.isIdentifier(stringToBeTested));
    }

    public static void printKeywordTest(final String stringToBeTested)
    {
       out.println(
            "Is '" + stringToBeTested + "' a keyword? "
          + SourceVersion.isKeyword(stringToBeTested));
    }

    public static void printNameTest(final String stringToBeTested)
    {
       out.println(
            "Can '" + stringToBeTested + "' be used as a name? "
          + SourceVersion.isName(stringToBeTested));
    }

    public static void printTests(final String stringToBeTested)
    {
       out.println("\n===============  " + stringToBeTested + "  ===============");
       printIdentifierTest(stringToBeTested);
       printKeywordTest(stringToBeTested);
       printNameTest(stringToBeTested);
    }

    public static void printTests(
       final String stringToBeTested,
       final String alternateHeaderString)
    {
       out.println("\n===============  " + alternateHeaderString + "  ===============");
       printIdentifierTest(stringToBeTested);
       printKeywordTest(stringToBeTested);
       printNameTest(stringToBeTested);
    }

    /**
     * Main function for demonstrating SourceVersion enum.
     *
     * @param arguments Command-line arguments: none expected.
     */
    public static void main(final String[] arguments)
    {
       final String dustinStr = "Dustin";
       printTests(dustinStr);
       final String dustinLowerStr = "dustin";
       printTests(dustinLowerStr);
       final String instanceOfStr = "instanceof";
       printTests(instanceOfStr);
       final String constStr = "const";
       printTests(constStr);
       final String gotoStr = "goto";
       printTests(gotoStr);
       final String trueStr = "true";
       printTests(trueStr);
       final String nullStr = "null";
       printTests(nullStr);
       final String weirdStr = "/#";
       printTests(weirdStr);
       final String tabStr = "\t";
       printTests(tabStr, "TAB (\\t)");
       final String classStr = "class";
       printTests(classStr);
       final String enumStr = "enum";
       printTests(enumStr);
       final String assertStr = "assert";
       printTests(assertStr);
       final String intStr = "int";
       printTests(intStr);
       final String numeralStartStr = "1abc";
       printTests(numeralStartStr);
       final String numeralEmbeddedStr = "abc1";
       printTests(numeralEmbeddedStr);
       final String dollarStartStr = "$dustin";
       printTests(dollarStartStr);
       final String underscoreStartStr = "_dustin";
       printTests(underscoreStartStr);
       final String spacesStartStr = " dustin";
       printTests(spacesStartStr, " dustin (space in front)");
       final String spacesInStr = "to be";
       printTests(spacesInStr);
    }


When the above code is executed, it generates the output shown next.

    ===============  Dustin  ===============
    Is 'Dustin' an identifier? true
    Is 'Dustin' a keyword? false
    Can 'Dustin' be used as a name? true

    ===============  dustin  ===============
    Is 'dustin' an identifier? true
    Is 'dustin' a keyword? false
    Can 'dustin' be used as a name? true

    ===============  instanceof  ===============
    Is 'instanceof' an identifier? true
    Is 'instanceof' a keyword? true
    Can 'instanceof' be used as a name? false

    ===============  const  ===============
    Is 'const' an identifier? true
    Is 'const' a keyword? true
    Can 'const' be used as a name? false

    ===============  goto  ===============
    Is 'goto' an identifier? true
    Is 'goto' a keyword? true
    Can 'goto' be used as a name? false

    ===============  true  ===============
    Is 'true' an identifier? true
    Is 'true' a keyword? true
    Can 'true' be used as a name? false

    ===============  null  ===============
    Is 'null' an identifier? true
    Is 'null' a keyword? true
    Can 'null' be used as a name? false

    ===============  /#  ===============
    Is '/#' an identifier? false
    Is '/#' a keyword? false
    Can '/#' be used as a name? false

    ===============  TAB (\t)  ===============
    Is ' ' an identifier? false
    Is ' ' a keyword? false
    Can ' ' be used as a name? false

    ===============  class  ===============
    Is 'class' an identifier? true
    Is 'class' a keyword? true
    Can 'class' be used as a name? false

    ===============  enum  ===============
    Is 'enum' an identifier? true
    Is 'enum' a keyword? true
    Can 'enum' be used as a name? false

    ===============  assert  ===============
    Is 'assert' an identifier? true
    Is 'assert' a keyword? true
    Can 'assert' be used as a name? false

    ===============  int  ===============
    Is 'int' an identifier? true
    Is 'int' a keyword? true
    Can 'int' be used as a name? false

    ===============  1abc  ===============
    Is '1abc' an identifier? false
    Is '1abc' a keyword? false
    Can '1abc' be used as a name? false

    ===============  abc1  ===============
    Is 'abc1' an identifier? true
    Is 'abc1' a keyword? false
    Can 'abc1' be used as a name? true

    ===============  $dustin  ===============
    Is '$dustin' an identifier? true
    Is '$dustin' a keyword? false
    Can '$dustin' be used as a name? true

    ===============  _dustin  ===============
    Is '_dustin' an identifier? true
    Is '_dustin' a keyword? false
    Can '_dustin' be used as a name? true

    ===============   dustin (space in front)  ===============
    Is ' dustin' an identifier? false
    Is ' dustin' a keyword? false
    Can ' dustin' be used as a name? false

    ===============  to be  ===============
    Is 'to be' an identifier? false
    Is 'to be' a keyword? false
    Can 'to be' be used as a name? false


The above output demonstrates that a valid name must be a valid identifier without being a keyword. Every keyword passes the isIdentifier() test, which checks only syntactic form, but not all identifiers are keywords. Some string values that are neither keywords nor reserved words are not even identifiers because they don't meet the rules for Java identifiers.
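The relationship observed in this output can be written down as a simple rule for undotted strings: isName() should agree with "isIdentifier() and not isKeyword()". The following sketch checks that rule against several of the strings used in this post. The class name and the invariantHolds helper are hypothetical, and the rule as stated is my reading of the output rather than a guarantee quoted from the Javadoc.

```java
import javax.lang.model.SourceVersion;

public class NameInvariantCheck
{
   /**
    * Hypothetical helper: for a simple (undotted) string, checks that
    * isName() agrees with "is an identifier and is not a keyword".
    */
   public static boolean invariantHolds(final String candidate)
   {
      final boolean expectedName =
            SourceVersion.isIdentifier(candidate)
         && !SourceVersion.isKeyword(candidate);
      return SourceVersion.isName(candidate) == expectedName;
   }

   public static void main(final String[] arguments)
   {
      final String[] samples =
      {
         "Dustin", "instanceof", "const", "goto", "true", "null", "/#",
         "\t", "class", "1abc", "abc1", "$dustin", "_dustin", " dustin", "to be"
      };
      for (final String sample : samples)
      {
         System.out.println(
            "Rule holds for '" + sample + "'? " + invariantHolds(sample));
      }
   }
}
```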

The examples above indicate that we cannot use a name for a variable or other construct that begins with a numeral, but we can use $ and _ for the first character in a name. Another way to determine this is through use of the static method Character.isJavaIdentifierStart(char). The following code snippet demonstrates this along with the similar method Character.isJavaIdentifierPart(char), which returns true if the provided character can be in the name anywhere other than the first character.

    public static void printTestForValidIdentifierCharacter(
       final char characterToBeTested)
    {
       out.println(
            "Character '" + characterToBeTested
          + (  Character.isJavaIdentifierStart(characterToBeTested)
             ? "': VALID "
             : "': NOT VALID ")
          + "FIRST character and "
          + (  Character.isJavaIdentifierPart(characterToBeTested)
             ? "VALID "
             : "NOT VALID ")
          + "OTHER character in a Java name.");
       out.println(  "\tType of '" + characterToBeTested + "': "
                   + Character.getType(characterToBeTested));
    }

    public static void demonstrateCharacterJavaIdentifierStart()
    {
       out.println("\nTEST FOR FIRST AND OTHER CHARACTERS IN A VALID JAVA NAME");
       printTestForValidIdentifierCharacter('A');
       printTestForValidIdentifierCharacter('a');
       printTestForValidIdentifierCharacter('1');
       printTestForValidIdentifierCharacter('\\');
       printTestForValidIdentifierCharacter('_');
       printTestForValidIdentifierCharacter('$');
       printTestForValidIdentifierCharacter('#');
       printTestForValidIdentifierCharacter('\n');
       printTestForValidIdentifierCharacter('\t');
    }


The output from the above appears below.

    TEST FOR FIRST AND OTHER CHARACTERS IN A VALID JAVA NAME
    Character 'A': VALID FIRST character and VALID OTHER character in a Java name.
        Type of 'A': 1
    Character 'a': VALID FIRST character and VALID OTHER character in a Java name.
        Type of 'a': 2
    Character '1': NOT VALID FIRST character and VALID OTHER character in a Java name.
        Type of '1': 9
    Character '\': NOT VALID FIRST character and NOT VALID OTHER character in a Java name.
        Type of '\': 24
    Character '_': VALID FIRST character and VALID OTHER character in a Java name.
        Type of '_': 23
    Character '$': VALID FIRST character and VALID OTHER character in a Java name.
        Type of '$': 26
    Character '#': NOT VALID FIRST character and NOT VALID OTHER character in a Java name.
        Type of '#': 24
    Character '
    ': NOT VALID FIRST character and NOT VALID OTHER character in a Java name.
        Type of '
    ': 15
    Character ' ': NOT VALID FIRST character and NOT VALID OTHER character in a Java name.
        Type of ' ': 15
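The two Character methods can also be combined into a hand-rolled check of a whole string, which makes it easier to see what SourceVersion.isIdentifier() is looking at. This is only a rough sketch under my own assumptions: the class and method names are hypothetical, and it walks the string char by char, so it does not handle supplementary (surrogate pair) characters the way a code point based check would.

```java
import javax.lang.model.SourceVersion;

public class HandRolledIdentifierCheck
{
   /**
    * Hypothetical helper: true if the first character may start a Java
    * identifier and every later character may be part of one. Like
    * SourceVersion.isIdentifier(), it does not reject keywords.
    */
   public static boolean looksLikeIdentifier(final String candidate)
   {
      if (candidate.isEmpty())
      {
         return false;
      }
      if (!Character.isJavaIdentifierStart(candidate.charAt(0)))
      {
         return false;
      }
      for (int index = 1; index < candidate.length(); index++)
      {
         if (!Character.isJavaIdentifierPart(candidate.charAt(index)))
         {
            return false;
         }
      }
      return true;
   }

   public static void main(final String[] arguments)
   {
      final String[] samples = {"abc1", "1abc", "$dustin", "_dustin", " dustin", "to be", "goto"};
      for (final String sample : samples)
      {
         System.out.println(
              "'" + sample + "': hand-rolled=" + looksLikeIdentifier(sample)
            + ", SourceVersion.isIdentifier=" + SourceVersion.isIdentifier(sample));
      }
   }
}
```

For the sample strings used here, the hand-rolled check and SourceVersion.isIdentifier() should report the same answers.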


Because the Character.getType(char) method has been with us for quite a while and predates the enum construct introduced with J2SE 5, this method returns a primitive int. One can refer to Java's Constant Field Values to determine what each of these constants stands for.

To make the above example's output a little more readable, I have added a simple "converter" method that converts the returned int to a more readable String. I have only added switch cases for the integers returned from my example, but one could add cases for all supported types represented by different integers.

    public static String extractReadableStringFromJavaCharacterTypeInt(
       final int characterTypeInt)
    {
       String characterType;
       switch (characterTypeInt)
       {
          case Character.CONNECTOR_PUNCTUATION :
             characterType = "Connector Punctuation";
             break;
          case Character.CONTROL :
             characterType = "Control";
             break;
          case Character.CURRENCY_SYMBOL :
             characterType = "Currency Symbol";
             break;
          case Character.DECIMAL_DIGIT_NUMBER :
             characterType = "Decimal Digit Number";
             break;
          case Character.LETTER_NUMBER :
             characterType = "Letter/Number";
             break;
          case Character.LOWERCASE_LETTER :
             characterType = "Lowercase Letter";
             break;
          case Character.OTHER_PUNCTUATION :
             characterType = "Other Punctuation";
             break;
          case Character.UPPERCASE_LETTER :
             characterType = "Uppercase Letter";
             break;
          default : characterType = "Unknown Character Type Integer: " + characterTypeInt;
       }
       return characterType;
    }


When the integers returned from Character.getType(char) in the earlier identifier-character example are run through this switch statement, the revised output appears as shown next.

    TEST FOR FIRST AND OTHER CHARACTERS IN A VALID JAVA NAME
    Character 'A': VALID FIRST character and VALID OTHER character in a Java name.
        Type of 'A': Uppercase Letter
    Character 'a': VALID FIRST character and VALID OTHER character in a Java name.
        Type of 'a': Lowercase Letter
    Character '1': NOT VALID FIRST character and VALID OTHER character in a Java name.
        Type of '1': Decimal Digit Number
    Character '\': NOT VALID FIRST character and NOT VALID OTHER character in a Java name.
        Type of '\': Other Punctuation
    Character '_': VALID FIRST character and VALID OTHER character in a Java name.
        Type of '_': Connector Punctuation
    Character '$': VALID FIRST character and VALID OTHER character in a Java name.
        Type of '$': Currency Symbol
    Character '#': NOT VALID FIRST character and NOT VALID OTHER character in a Java name.
        Type of '#': Other Punctuation
    Character '
    ': NOT VALID FIRST character and NOT VALID OTHER character in a Java name.
        Type of '
    ': Control
    Character ' ': NOT VALID FIRST character and NOT VALID OTHER character in a Java name.
        Type of ' ': Control
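The switch-based converter covers only the types that came up in my example. If one wanted to cover more types without a growing switch statement, a table-driven lookup is one alternative. The sketch below is an assumption-laden illustration rather than a complete mapping: the class name, the describeType method, and the particular subset of constants in the table are my own choices.

```java
import java.util.HashMap;
import java.util.Map;

public class CharacterTypeNames
{
   /** Readable names keyed by the int returned from Character.getType(char). */
   private static final Map<Integer, String> TYPE_NAMES = new HashMap<Integer, String>();

   static
   {
      // Only a subset of the Character type constants; add more as needed.
      TYPE_NAMES.put((int) Character.UPPERCASE_LETTER, "Uppercase Letter");
      TYPE_NAMES.put((int) Character.LOWERCASE_LETTER, "Lowercase Letter");
      TYPE_NAMES.put((int) Character.DECIMAL_DIGIT_NUMBER, "Decimal Digit Number");
      TYPE_NAMES.put((int) Character.CONNECTOR_PUNCTUATION, "Connector Punctuation");
      TYPE_NAMES.put((int) Character.OTHER_PUNCTUATION, "Other Punctuation");
      TYPE_NAMES.put((int) Character.DASH_PUNCTUATION, "Dash Punctuation");
      TYPE_NAMES.put((int) Character.CURRENCY_SYMBOL, "Currency Symbol");
      TYPE_NAMES.put((int) Character.MATH_SYMBOL, "Math Symbol");
      TYPE_NAMES.put((int) Character.SPACE_SEPARATOR, "Space Separator");
      TYPE_NAMES.put((int) Character.CONTROL, "Control");
   }

   /** Hypothetical helper: readable type name for the provided character. */
   public static String describeType(final char character)
   {
      final int type = Character.getType(character);
      final String name = TYPE_NAMES.get(type);
      return name != null ? name : "Unknown Character Type Integer: " + type;
   }

   public static void main(final String[] arguments)
   {
      final char[] samples = {'A', 'a', '1', '_', '$', '#', '+', '-', ' ', '\t', '('};
      for (final char sample : samples)
      {
         System.out.println(
            "Type " + Character.getType(sample) + ": " + describeType(sample));
      }
   }
}
```

Characters whose type is not in the table, such as the opening parenthesis in this sample, fall through to the same "Unknown" message used by the switch version.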


The SourceVersion class is useful for dynamically determining information about the Java source code version and the keywords and valid names applicable for that version. The Character class also provides useful information on what a particular character's type is and whether or not that character can be used as the first character of a name or as any other character in a valid name.
