Showing posts with label XML. Show all posts

Thursday, June 25, 2015

Generating JSON Schema from XSD with JAXB and Jackson

In this post, I demonstrate one approach for generating JSON Schema from an XML Schema (XSD). Along the way, the post also demonstrates use of a JAXB implementation (xjc version 2.2.12-b150331.1824 bundled with JDK 9 [build 1.9.0-ea-b68]) and of a JSON/Java binding implementation (Jackson 2.5.4).

The steps of this approach for generating JSON Schema from an XSD can be summarized as:

  1. Apply JAXB's xjc compiler to generate Java classes from XML Schema (XSD).
  2. Apply Jackson to generate JSON schema from JAXB-generated Java classes.

Generating Java Classes from XSD with JAXB's xjc

For purposes of this discussion, I'll be using the simple Food.xsd used in my previous blog post A JAXB Nuance: String Versus Enum from Enumerated Restricted XSD String. For convenience, I have reproduced that simple schema here without the XML comments specific to that earlier blog post:

Food.xsd
<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:dustin="http://marxsoftware.blogspot.com/foodxml"
           targetNamespace="http://marxsoftware.blogspot.com/foodxml"
           elementFormDefault="qualified"
           attributeFormDefault="unqualified">

   <xs:element name="Food">
      <xs:complexType>
         <xs:sequence>
            <xs:element name="Vegetable" type="dustin:Vegetable" />
            <xs:element ref="dustin:Fruit" />
            <xs:element name="Dessert" type="dustin:Dessert" />
         </xs:sequence>
      </xs:complexType>
   </xs:element>

   <xs:simpleType name="Vegetable">
      <xs:restriction base="xs:string">
         <xs:enumeration value="Carrot"/>
         <xs:enumeration value="Squash"/>
         <xs:enumeration value="Spinach"/>
         <xs:enumeration value="Celery"/>
      </xs:restriction>
   </xs:simpleType>

   <xs:element name="Fruit">
      <xs:simpleType>
         <xs:restriction base="xs:string">
            <xs:enumeration value="Watermelon"/>
            <xs:enumeration value="Apple"/>
            <xs:enumeration value="Orange"/>
            <xs:enumeration value="Grape"/>
         </xs:restriction>
      </xs:simpleType>
   </xs:element>

   <xs:simpleType name="Dessert">
      <xs:restriction base="xs:string">
         <xs:enumeration value="Pie"/>
         <xs:enumeration value="Cake"/>
         <xs:enumeration value="Ice Cream"/>
      </xs:restriction>
   </xs:simpleType>

</xs:schema>

It is easy to use the xjc command line tool provided by the JDK's JAXB implementation to generate Java classes corresponding to this XSD. The command is:

      xjc -d jaxb .\Food.xsd

This simple command generates Java classes corresponding to the provided Food.xsd and places those classes in the specified "jaxb" subdirectory.
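
Because the command above does not name a package, xjc derives one from the schema's targetNamespace, which for Food.xsd yields com.blogspot.marxsoftware.foodxml. The following is only a rough sketch of that mapping (reverse the host name, then append the path segments). The class and method names are hypothetical, and the real JAXB rules handle many more cases, such as reserved words and characters that are illegal in Java identifiers.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

/**
 * Simplified illustration (NOT the full JAXB algorithm) of how a
 * namespace URI can be turned into a Java package name.
 */
public class NamespaceToPackageSketch
{
   static String toPackageName(final String namespaceUri)
   {
      final URI uri = URI.create(namespaceUri);
      // "marxsoftware.blogspot.com" -> [com, blogspot, marxsoftware]
      final List<String> parts =
         new ArrayList<>(Arrays.asList(uri.getHost().split("\\.")));
      Collections.reverse(parts);
      // Append each non-empty path segment ("/foodxml" -> foodxml)
      for (final String segment : uri.getPath().split("/"))
      {
         if (!segment.isEmpty())
         {
            parts.add(segment);
         }
      }
      return String.join(".", parts).toLowerCase(Locale.ROOT);
   }

   public static void main(final String[] arguments)
   {
      // prints "com.blogspot.marxsoftware.foodxml"
      System.out.println(toPackageName("http://marxsoftware.blogspot.com/foodxml"));
   }
}
```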

Generating JSON from JAXB-Generated Classes with Jackson

With the JAXB-generated classes now available, Jackson can be applied to these classes to generate JSON from the Java classes. Jackson is described on its main portal page as "a multi-purpose Java library for processing" that is "inspired by the quality and variety of XML tooling available for the Java platform." The existence of Jackson and similar frameworks and libraries appears to be one of the reasons that Oracle has dropped JEP 198 ("Light-Weight JSON API") from Java SE 9. [It's worth noting that Java EE 7 already has built-in JSON support with its implementation of JSR 353 ("Java API for JSON Processing"), which is not associated with JEP 198.]

One of the first steps of applying Jackson to generating JSON from our JAXB-generated Java classes is to acquire and configure an instance of Jackson's ObjectMapper class. One approach for accomplishing this is shown in the next code listing.

Acquiring and Configuring Jackson ObjectMapper for JAXB Serialization/Deserialization
/**
 * Create instance of ObjectMapper with JAXB introspector
 * and default type factory.
 *
 * @return Instance of ObjectMapper with JAXB introspector
 *    and default type factory.
 */
private ObjectMapper createJaxbObjectMapper()
{
   final ObjectMapper mapper = new ObjectMapper();
   final TypeFactory typeFactory = TypeFactory.defaultInstance();
   final AnnotationIntrospector introspector = new JaxbAnnotationIntrospector(typeFactory);
   // make deserializer use JAXB annotations (only)
   mapper.getDeserializationConfig().with(introspector);
   // make serializer use JAXB annotations (only)
   mapper.getSerializationConfig().with(introspector);
   return mapper;
}

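The above code listing demonstrates acquiring the Jackson ObjectMapper instance and configuring it to use a default type factory and a JAXB-oriented annotation introspector. One caution: Jackson's configuration objects are immutable, so the with(AnnotationIntrospector) calls in this listing return new configuration instances rather than changing the mapper itself. Registering the introspector on the mapper would require ObjectMapper's setAnnotationIntrospector(AnnotationIntrospector) method.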

With the Jackson ObjectMapper instantiated and appropriately configured, it's easy to use that ObjectMapper instance to generate JSON from the generated JAXB classes. One way to accomplish this using the deprecated Jackson class JsonSchema is demonstrated in the next code listing.

Generating JSON from Java Classes with Deprecated com.fasterxml.jackson.databind.jsonschema.JsonSchema Class
/**
 * Write JSON Schema to standard output based upon Java source
 * code in class whose fully qualified package and class name
 * have been provided.
 *
 * @param mapper Instance of ObjectMapper from which to
 *     invoke JSON schema generation.
 * @param fullyQualifiedClassName Name of Java class upon
 *    which JSON Schema will be extracted.
 */
private void writeToStandardOutputWithDeprecatedJsonSchema(
   final ObjectMapper mapper, final String fullyQualifiedClassName)
{
   try
   {
      final JsonSchema jsonSchema = mapper.generateJsonSchema(Class.forName(fullyQualifiedClassName));
      out.println(jsonSchema);
   }
   catch (ClassNotFoundException cnfEx)
   {
      err.println("Unable to find class " + fullyQualifiedClassName);
   }
   catch (JsonMappingException jsonEx)
   {
      err.println("Unable to map JSON: " + jsonEx);
   }
}

The code in the above listing acquires the class definition of the provided Java class (the highest level Food class generated by the JAXB xjc compiler in my example) and passes that Class reference to ObjectMapper's generateJsonSchema(Class<?>) method. The deprecated JsonSchema class's toString() implementation is very useful and makes it easy to write out the JSON generated from the JAXB-generated classes.

For purposes of this demonstration, I provide the demonstration driver as a main(String[]) function. That function and the entire class to this point (including methods shown above) are provided in the next code listing.

JsonGenerationFromJaxbClasses.java, Version 1
package dustin.examples.jackson;

import com.fasterxml.jackson.databind.AnnotationIntrospector;
import com.fasterxml.jackson.databind.JsonMappingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.type.TypeFactory;
import com.fasterxml.jackson.module.jaxb.JaxbAnnotationIntrospector;

import com.fasterxml.jackson.databind.jsonschema.JsonSchema;

import static java.lang.System.out;
import static java.lang.System.err;

/**
 * Generates JavaScript Object Notation (JSON) from Java classes
 * with Java API for XML Binding (JAXB) annotations.
 */
public class JsonGenerationFromJaxbClasses
{
   /**
    * Create instance of ObjectMapper with JAXB introspector
    * and default type factory.
    *
    * @return Instance of ObjectMapper with JAXB introspector
    *    and default type factory.
    */
   private ObjectMapper createJaxbObjectMapper()
   {
      final ObjectMapper mapper = new ObjectMapper();
      final TypeFactory typeFactory = TypeFactory.defaultInstance();
      final AnnotationIntrospector introspector = new JaxbAnnotationIntrospector(typeFactory);
      // make deserializer use JAXB annotations (only)
      mapper.getDeserializationConfig().with(introspector);
      // make serializer use JAXB annotations (only)
      mapper.getSerializationConfig().with(introspector);
      return mapper;
   }

   /**
    * Write out JSON Schema based upon Java source code in
    * class whose fully qualified package and class name have
    * been provided.
    *
    * @param mapper Instance of ObjectMapper from which to
    *     invoke JSON schema generation.
    * @param fullyQualifiedClassName Name of Java class upon
    *    which JSON Schema will be extracted.
    */
   private void writeToStandardOutputWithDeprecatedJsonSchema(
      final ObjectMapper mapper, final String fullyQualifiedClassName)
   {
      try
      {
         final JsonSchema jsonSchema = mapper.generateJsonSchema(Class.forName(fullyQualifiedClassName));
         out.println(jsonSchema);
      }
      catch (ClassNotFoundException cnfEx)
      {
         err.println("Unable to find class " + fullyQualifiedClassName);
      }
      catch (JsonMappingException jsonEx)
      {
         err.println("Unable to map JSON: " + jsonEx);
      }
   }

   /**
    * Accepts the fully qualified (full package) name of a
    * Java class with JAXB annotations that will be used to
    * generate a JSON schema.
    *
    * @param arguments One argument expected: fully qualified
    *     package and class name of Java class with JAXB
    *     annotations.
    */
   public static void main(final String[] arguments)
   {
      if (arguments.length < 1)
      {
         err.println("Need to provide the fully qualified name of the highest-level Java class with JAXB annotations.");
         System.exit(-1);
      }
      final JsonGenerationFromJaxbClasses instance = new JsonGenerationFromJaxbClasses();
      final String fullyQualifiedClassName = arguments[0];
      final ObjectMapper objectMapper = instance.createJaxbObjectMapper();
      instance.writeToStandardOutputWithDeprecatedJsonSchema(objectMapper, fullyQualifiedClassName);
   }
}

To run this relatively generic code against the Java classes generated by JAXB's xjc based upon Food.xsd, I need to provide the fully qualified package name and class name of the highest-level generated class. In this case, that's com.blogspot.marxsoftware.foodxml.Food (package name is based on the XSD's namespace because I did not explicitly override that when running xjc). When I run the above code with that fully qualified class name and with the JAXB classes and Jackson libraries on the classpath, I see the following JSON written to standard output.

Generated JSON
{"type":"object","properties":{"vegetable":{"type":"string","enum":["CARROT","SQUASH","SPINACH","CELERY"]},"fruit":{"type":"string"},"dessert":{"type":"string","enum":["PIE","CAKE","ICE_CREAM"]}}}

Most readers (developers included) prefer more nicely formatted output than the single-line JSON just shown. We can tweak the demonstration class's writeToStandardOutputWithDeprecatedJsonSchema(ObjectMapper, String) method to write out indented JSON that better reflects its hierarchical nature. This modified method is shown next.

Modified writeToStandardOutputWithDeprecatedJsonSchema(ObjectMapper, String) to Write Indented JSON
/**
 * Write out indented JSON Schema based upon Java source
 * code in class whose fully qualified package and class
 * name have been provided.
 *
 * @param mapper Instance of ObjectMapper from which to
 *     invoke JSON schema generation.
 * @param fullyQualifiedClassName Name of Java class upon
 *    which JSON Schema will be extracted.
 */
private void writeToStandardOutputWithDeprecatedJsonSchema(
   final ObjectMapper mapper, final String fullyQualifiedClassName)
{
   try
   {
      final JsonSchema jsonSchema = mapper.generateJsonSchema(Class.forName(fullyQualifiedClassName));
      out.println(mapper.writerWithDefaultPrettyPrinter().writeValueAsString(jsonSchema));
   }
   catch (ClassNotFoundException cnfEx)
   {
      err.println("Unable to find class " + fullyQualifiedClassName);
   }
   catch (JsonMappingException jsonEx)
   {
      err.println("Unable to map JSON: " + jsonEx);
   }
   catch (JsonProcessingException jsonEx)
   {
      err.println("Unable to process JSON: " + jsonEx);
   }
}

When I run the demonstration class again with this modified method, the JSON output is more aesthetically pleasing:

Generated JSON with Indentation Communicating Hierarchy
{
  "type" : "object",
  "properties" : {
    "vegetable" : {
      "type" : "string",
      "enum" : [ "CARROT", "SQUASH", "SPINACH", "CELERY" ]
    },
    "fruit" : {
      "type" : "string"
    },
    "dessert" : {
      "type" : "string",
      "enum" : [ "PIE", "CAKE", "ICE_CREAM" ]
    }
  }
}
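
The upper-case values in the "enum" arrays above (for example, ICE_CREAM rather than the XSD's "Ice Cream") appear to be the Java enum constant names rather than the XSD enumeration values. A minimal sketch of that distinction follows, assuming the xjc-generated enum has roughly this shape (one constant per XSD value plus a value() accessor). The JAXB annotations are omitted so that the example depends only on the JDK, and the outer class name is hypothetical.

```java
/**
 * Contrasts an enum constant's name() with the XSD value it stands for.
 * The nested enum only approximates what xjc generates for "Dessert".
 */
public class EnumNameVersusValueSketch
{
   public enum Dessert
   {
      PIE("Pie"),
      CAKE("Cake"),
      ICE_CREAM("Ice Cream");

      private final String value;

      Dessert(final String value)
      {
         this.value = value;
      }

      /** @return The original XSD enumeration value. */
      public String value()
      {
         return value;
      }
   }

   public static void main(final String[] arguments)
   {
      for (final Dessert dessert : Dessert.values())
      {
         // Constant name on the left, XSD value on the right
         System.out.println(dessert.name() + " <- " + dessert.value());
      }
   }
}
```

If the XSD values themselves are wanted in the generated schema, the mapper would need to honor the JAXB annotations on the generated enum; that is not shown here.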

I have been using Jackson 2.5.4 in this post. The class com.fasterxml.jackson.databind.jsonschema.JsonSchema is deprecated in that version with the comment, "Since 2.2, we recommend use of external JSON Schema generator module." Given that, I now look at using the new preferred approach (Jackson JSON Schema Module approach).

The most significant change is to use the JsonSchema class in the com.fasterxml.jackson.module.jsonSchema package rather than using the JsonSchema class in the com.fasterxml.jackson.databind.jsonschema package. The approaches for obtaining instances of these different versions of JsonSchema classes are also different. The next code listing demonstrates using the newer, preferred approach for generating JSON from Java classes.

Using Jackson's Newer and Preferred com.fasterxml.jackson.module.jsonSchema.JsonSchema
/**
 * Write out JSON Schema based upon Java source code in
 * class whose fully qualified package and class name have
 * been provided. This method uses the newer module JsonSchema
 * class that replaces the deprecated databind JsonSchema.
 *
 * @param fullyQualifiedClassName Name of Java class upon
 *    which JSON Schema will be extracted.
 */
private void writeToStandardOutputWithModuleJsonSchema(
   final String fullyQualifiedClassName)
{
   final SchemaFactoryWrapper visitor = new SchemaFactoryWrapper();
   final ObjectMapper mapper = new ObjectMapper();
   try
   {
      mapper.acceptJsonFormatVisitor(mapper.constructType(Class.forName(fullyQualifiedClassName)), visitor);
      final com.fasterxml.jackson.module.jsonSchema.JsonSchema jsonSchema = visitor.finalSchema();
      out.println(mapper.writerWithDefaultPrettyPrinter().writeValueAsString(jsonSchema));
   }
   catch (ClassNotFoundException cnfEx)
   {
      err.println("Unable to find class " + fullyQualifiedClassName);
   }
   catch (JsonMappingException jsonEx)
   {
      err.println("Unable to map JSON: " + jsonEx);
   }
   catch (JsonProcessingException jsonEx)
   {
      err.println("Unable to process JSON: " + jsonEx);
   }
}

The following two listings compare usage of the two Jackson JsonSchema classes, with the deprecated approach shown earlier listed first (adapted a bit for this comparison) and the recommended newer approach listed second. Both generate the same output for the same given Java class from which JSON is to be written.

Deprecated Approach: com.fasterxml.jackson.databind.jsonschema.JsonSchema
/**
 * Write out JSON Schema based upon Java source code in
 * class whose fully qualified package and class name have
 * been provided. This method uses the deprecated JsonSchema
 * class in the "databind.jsonschema" package
 * {@link com.fasterxml.jackson.databind.jsonschema}.
 *
 * @param fullyQualifiedClassName Name of Java class upon
 *    which JSON Schema will be extracted.
 */
private void writeToStandardOutputWithDeprecatedDatabindJsonSchema(
   final String fullyQualifiedClassName)
{
   final ObjectMapper mapper = new ObjectMapper();
   try
   {
      final com.fasterxml.jackson.databind.jsonschema.JsonSchema jsonSchema =
         mapper.generateJsonSchema(Class.forName(fullyQualifiedClassName));
      out.println(mapper.writerWithDefaultPrettyPrinter().writeValueAsString(jsonSchema));
   }
   catch (ClassNotFoundException cnfEx)
   {
      err.println("Unable to find class " + fullyQualifiedClassName);
   }
   catch (JsonMappingException jsonEx)
   {
      err.println("Unable to map JSON: " + jsonEx);
   }
   catch (JsonProcessingException jsonEx)
   {
      err.println("Unable to process JSON: " + jsonEx);
   }
}

Recommended Approach: com.fasterxml.jackson.module.jsonSchema.JsonSchema
/**
 * Write out JSON Schema based upon Java source code in
 * class whose fully qualified package and class name have
 * been provided. This method uses the newer module JsonSchema
 * class that replaces the deprecated databind JsonSchema.
 *
 * @param fullyQualifiedClassName Name of Java class upon
 *    which JSON Schema will be extracted.
 */
private void writeToStandardOutputWithModuleJsonSchema(
   final String fullyQualifiedClassName)
{
   final SchemaFactoryWrapper visitor = new SchemaFactoryWrapper();
   final ObjectMapper mapper = new ObjectMapper();
   try
   {
      mapper.acceptJsonFormatVisitor(mapper.constructType(Class.forName(fullyQualifiedClassName)), visitor);
      final com.fasterxml.jackson.module.jsonSchema.JsonSchema jsonSchema = visitor.finalSchema();
      out.println(mapper.writerWithDefaultPrettyPrinter().writeValueAsString(jsonSchema));
   }
   catch (ClassNotFoundException cnfEx)
   {
      err.println("Unable to find class " + fullyQualifiedClassName);
   }
   catch (JsonMappingException jsonEx)
   {
      err.println("Unable to map JSON: " + jsonEx);
   }
   catch (JsonProcessingException jsonEx)
   {
      err.println("Unable to process JSON: " + jsonEx);
   }
}

This blog post has shown two approaches using different versions of classes with name JsonSchema provided by Jackson to write JSON based on Java classes generated from an XSD with JAXB's xjc. The overall process demonstrated in this post is one approach for generating JSON Schema from XML Schema.

Thursday, March 19, 2015

Validating XML Against XSD(s) in Java

There are numerous tools available for validating an XML document against an XSD. These include operating system scripts and tools such as xmllint, XML editors and IDEs, and even online validators. I have found it useful to have my own easy-to-use XML validation tool because of limitations or issues of the previously mentioned approaches. Java makes it easy to write such a tool and this post demonstrates how easy it is to develop a simple XML validation tool in Java.

The Java tool developed in this post requires JDK 8. However, the simple Java application can be modified fairly easily to work with JDK 7 or even with a version of Java as old as JDK 5. In most cases, I have tried to comment the code that requires JDK 7 or JDK 8 to identify these dependencies and provide alternative approaches in earlier versions of Java. I have done this so that the tool can be adapted to work even in environments with older versions of Java.

The complete code listing for the Java-based XML validation tool discussed in this post is included at the end of the post. The lines of code from that application that matter most for validating XML against one or more XSDs are shown next.

Essence of Validating XML Against XSD with Java
final Schema schema = schemaFactory.newSchema(xsdSources);
final Validator validator = schema.newValidator();
validator.validate(new StreamSource(new File(xmlFilePathAndName)));

The previous code listing shows the straightforward approach available in the standard JDK for validating XML against XSDs. An instance of javax.xml.validation.Schema is instantiated with a call to javax.xml.validation.SchemaFactory.newSchema(Source[]) (where the array of javax.xml.transform.Source objects represents one or more XSDs). An instance of javax.xml.validation.Validator is obtained from the Schema instance via Schema's newValidator() method. The XML to be validated can be passed to that Validator's validate(Source) method to perform the validation of the XML against the XSD or XSDs originally provided to the Schema object created with SchemaFactory.newSchema(Source[]).

The next code listing includes the code just highlighted but represents the entire method in which that code resides.

validateXmlAgainstXsds(String, String[])
/**
 * Validate provided XML against the provided XSD schema files.
 *
 * @param xmlFilePathAndName Path/name of XML file to be validated;
 *    should not be null or empty.
 * @param xsdFilesPathsAndNames XSDs against which to validate the XML;
 *    should not be null or empty.
 */
public static void validateXmlAgainstXsds(
   final String xmlFilePathAndName, final String[] xsdFilesPathsAndNames)
{
   if (xmlFilePathAndName == null || xmlFilePathAndName.isEmpty())
   {
      out.println("ERROR: Path/name of XML to be validated cannot be null or empty.");
      return;
   }
   if (xsdFilesPathsAndNames == null || xsdFilesPathsAndNames.length < 1)
   {
      out.println("ERROR: At least one XSD must be provided to validate XML against.");
      return;
   }
   final SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);

   final StreamSource[] xsdSources = generateStreamSourcesFromXsdPathsJdk8(xsdFilesPathsAndNames);

   try
   {
      final Schema schema = schemaFactory.newSchema(xsdSources);
      final Validator validator = schema.newValidator();
      out.println(  "Validating " + xmlFilePathAndName + " against XSDs "
                  + Arrays.toString(xsdFilesPathsAndNames) + "...");
      validator.validate(new StreamSource(new File(xmlFilePathAndName)));
   }
   catch (IOException | SAXException exception)  // JDK 7 multi-exception catch
   {
      out.println(
           "ERROR: Unable to validate " + xmlFilePathAndName
         + " against XSDs " + Arrays.toString(xsdFilesPathsAndNames)
         + " - " + exception);
   }
   out.println("Validation process completed.");
}

The code listing for the validateXmlAgainstXsds(String, String[]) method shows how a SchemaFactory instance can be obtained with the specified type of schema (XMLConstants.W3C_XML_SCHEMA_NS_URI). This method also handles the various types of exceptions that might be thrown during the validation process. As the comment in the code states, the JDK 7 language change supporting catching of multiple exceptions in a single catch clause is used in this method but could be replaced with separate catch clauses or catching of a single more general exception for code bases earlier than JDK 7.

The method just shown calls a method called generateStreamSourcesFromXsdPathsJdk8(String[]) and the next listing is of that invoked method.

generateStreamSourcesFromXsdPathsJdk8(String[])
/**
 * Generates array of StreamSource instances representing XSDs
 * associated with the file paths/names provided and use JDK 8
 * Stream API.
 *
 * This method can be commented out if using a version of
 * Java prior to JDK 8.
 *
 * @param xsdFilesPaths String representations of paths/names
 *    of XSD files.
 * @return StreamSource instances representing XSDs.
 */
private static StreamSource[] generateStreamSourcesFromXsdPathsJdk8(
   final String[] xsdFilesPaths)
{
   return Arrays.stream(xsdFilesPaths)
                .map(StreamSource::new)
                .collect(Collectors.toList())
                .toArray(new StreamSource[xsdFilesPaths.length]);
}

The method just shown uses JDK 8 stream support to convert the array of Strings representing paths/names of XSD files to instances of StreamSource based on the contents of the XSDs pointed to by the path/name Strings. In the class's complete code listing, there is also a deprecated method generateStreamSourcesFromXsdPathsJdk7(final String[]) that could be used instead of this method for code bases on a version of Java earlier than JDK 8.
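
As a small variation on the method above, the intermediate List can be skipped by letting the stream build the array directly with Stream.toArray(IntFunction). This is only a sketch with a hypothetical class name. Note that StreamSource's String constructor records the String as the source's system ID and does not open the file at construction time.

```java
import java.util.Arrays;
import javax.xml.transform.stream.StreamSource;

/**
 * Variation on the JDK 8 conversion that collects directly
 * into an array rather than going through a List.
 */
public class StreamSourceArraySketch
{
   static StreamSource[] toStreamSources(final String[] xsdFilesPaths)
   {
      return Arrays.stream(xsdFilesPaths)
                   .map(StreamSource::new)          // StreamSource(String systemId)
                   .toArray(StreamSource[]::new);   // array sized by the stream
   }

   public static void main(final String[] arguments)
   {
      // "first.xsd" and "second.xsd" are placeholder names; nothing is read here
      for (final StreamSource source : toStreamSources(new String[] {"first.xsd", "second.xsd"}))
      {
         System.out.println(source.getSystemId());
      }
   }
}
```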

This single-class Java application is most useful when it's executed from the command line. To enable this, a main function is defined as shown in the next code listing.

Executable main(String[]) Function
/**
 * Validates provided XML against provided XSD.
 *
 * @param arguments XML file to be validated (first argument) and
 *    XSD against which it should be validated (second and later
 *    arguments).
 */
public static void main(final String[] arguments)
{
   if (arguments.length < 2)
   {
      out.println("\nUSAGE: java XmlValidator <xmlFile> <xsdFile1> ... <xsdFileN>\n");
      out.println("\tOrder of XSDs can be significant (place XSDs that are");
      out.println("\tdependent on other XSDs after those they depend on)");
      System.exit(-1);
   }
   // Arrays.copyOfRange requires JDK 6; see
   // http://stackoverflow.com/questions/7970486/porting-arrays-copyofrange-from-java-6-to-java-5
   // for additional details for versions of Java prior to JDK 6.
   final String[] schemas = Arrays.copyOfRange(arguments, 1, arguments.length);
   validateXmlAgainstXsds(arguments[0], schemas);
}

The executable main(String[]) function prints a usage statement if fewer than two command line arguments are passed to it because it expects at least the name/path of the XML file to be validated and the name/path of an XSD to validate the XML against.

The main function takes the first command line argument and treats that as the XML file's path/name and then treats all remaining command line arguments as the paths/names of one or more XSDs.
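
For a code base that predates JDK 6 and so lacks Arrays.copyOfRange, the same split of the arguments can be done with System.arraycopy. A minimal sketch, with hypothetical class and method names:

```java
import java.util.Arrays;

/**
 * Separates the first command line argument (the XML file) from
 * the remaining arguments (the XSDs) without Arrays.copyOfRange.
 */
public class ArgumentSplitSketch
{
   static String[] allButFirst(final String[] arguments)
   {
      if (arguments.length < 2)
      {
         // Nothing follows the first argument
         return new String[0];
      }
      final String[] remaining = new String[arguments.length - 1];
      // Copy everything from index 1 onward into the new array
      System.arraycopy(arguments, 1, remaining, 0, remaining.length);
      return remaining;
   }

   public static void main(final String[] arguments)
   {
      // Placeholder file names used only for illustration
      final String[] sample = {"web.xml", "first.xsd", "second.xsd"};
      System.out.println("XML:  " + sample[0]);
      System.out.println("XSDs: " + Arrays.toString(allButFirst(sample)));
   }
}
```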

The simple Java tool for validating XML against one or more XSDs has now been shown (complete code listing is at bottom of post). With it in place, we can run it against an example XML file and associated XSDs. For this demonstration, I'm using a very simple manifestation of a Servlet 2.5 web.xml deployment descriptor.

Sample Valid Servlet 2.5 web.xml
<web-app xmlns="http://java.sun.com/xml/ns/javaee"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://java.sun.com/xml/ns/javaee http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd"
         version="2.5"> 

    <display-name>Sample Java Servlet 2.5 Web Application</display-name>
</web-app>

The simple web.xml file just shown is valid per the Servlet 2.5 XSDs, and the output of running this simple Java-based XSD validation tool proves that by not reporting any validation errors.

An XSD-valid XML file does not lead to very interesting results with this tool. The next code listing shows an intentionally invalid web.xml file that has a "title" element not specified in the associated Servlet 2.5 XSD. The output with the most significant portions of the error message highlighted is shown after the code listing.

Sample Invalid Servlet 2.5 web.xml (web-invalid.xml)
<web-app xmlns="http://java.sun.com/xml/ns/javaee"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://java.sun.com/xml/ns/javaee http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd"
         version="2.5">

    <display-name>Java Servlet 2.5 Web Application</display-name>
    <title>A handy example</title>
</web-app>

As the last output shows, things are more interesting in terms of output when the provided XML is not XSD valid.
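
To see the difference between valid and invalid input without the Servlet XSDs at hand, the same three JAXP calls can be exercised against an in-memory schema. The following is a minimal sketch under stated assumptions: the tiny single-element schema, the class name, and the isValid method are all hypothetical and are not part of the XmlValidator tool.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

/**
 * Validates small XML strings against a small in-memory XSD.
 */
public class InMemoryValidationSketch
{
   /** Hypothetical schema: a single Dessert element limited to two values. */
   static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='Dessert'>"
      + "<xs:simpleType><xs:restriction base='xs:string'>"
      + "<xs:enumeration value='Pie'/><xs:enumeration value='Cake'/>"
      + "</xs:restriction></xs:simpleType>"
      + "</xs:element>"
      + "</xs:schema>";

   static boolean isValid(final String xml)
   {
      try
      {
         final SchemaFactory schemaFactory =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
         final Source[] xsdSources = { new StreamSource(new StringReader(XSD)) };
         final Schema schema = schemaFactory.newSchema(xsdSources);
         final Validator validator = schema.newValidator();
         // With no ErrorHandler set, validation errors surface as SAXException
         validator.validate(new StreamSource(new StringReader(xml)));
         return true;
      }
      catch (SAXException | IOException exception)
      {
         return false;
      }
   }

   public static void main(final String[] arguments)
   {
      System.out.println(isValid("<Dessert>Pie</Dessert>"));
      System.out.println(isValid("<Dessert>Brownie</Dessert>"));
   }
}
```

Returning a boolean hides the exception message, which is usually the most useful part when diagnosing a failure; the tool in this post prints the exception instead for that reason.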

There is one important caveat I wish to emphasize here. The XSDs provided to this Java-based tool sometimes need to be specified in a particular order. In particular, XSDs with "include" dependencies on other XSDs should be listed on the command line AFTER the XSD they include. In other words, XSDs with no "include" dependencies will generally be provided on the command line before those XSDs that include them.

The next code listing is for the complete XmlValidator class.

XmlValidator.java (Complete Class Listing)
package dustin.examples.xmlvalidation;

import org.xml.sax.SAXException;

import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

import static java.lang.System.out;

/**
 * Validate provided XML against the provided XSDs.
 */
public class XmlValidator
{
   /**
    * Validate provided XML against the provided XSD schema files.
    *
    * @param xmlFilePathAndName Path/name of XML file to be validated;
    *    should not be null or empty.
    * @param xsdFilesPathsAndNames XSDs against which to validate the XML;
    *    should not be null or empty.
    */
   public static void validateXmlAgainstXsds(
      final String xmlFilePathAndName, final String[] xsdFilesPathsAndNames)
   {
      if (xmlFilePathAndName == null || xmlFilePathAndName.isEmpty())
      {
         out.println("ERROR: Path/name of XML to be validated cannot be null or empty.");
         return;
      }
      if (xsdFilesPathsAndNames == null || xsdFilesPathsAndNames.length < 1)
      {
         out.println("ERROR: At least one XSD must be provided to validate XML against.");
         return;
      }
      final SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);

      final StreamSource[] xsdSources = generateStreamSourcesFromXsdPathsJdk8(xsdFilesPathsAndNames);

      try
      {
         final Schema schema = schemaFactory.newSchema(xsdSources);
         final Validator validator = schema.newValidator();
         out.println("Validating " + xmlFilePathAndName + " against XSDs "
            + Arrays.toString(xsdFilesPathsAndNames) + "...");
         validator.validate(new StreamSource(new File(xmlFilePathAndName)));
      }
      catch (IOException | SAXException exception)  // JDK 7 multi-exception catch
      {
         out.println(
            "ERROR: Unable to validate " + xmlFilePathAndName
            + " against XSDs " + Arrays.toString(xsdFilesPathsAndNames)
            + " - " + exception);
      }
      out.println("Validation process completed.");
   }

   /**
    * Generates array of StreamSource instances representing XSDs
    * associated with the file paths/names provided and use JDK 8
    * Stream API.
    *
    * This method can be commented out if using a version of
    * Java prior to JDK 8.
    *
    * @param xsdFilesPaths String representations of paths/names
    *    of XSD files.
    * @return StreamSource instances representing XSDs.
    */
   private static StreamSource[] generateStreamSourcesFromXsdPathsJdk8(
      final String[] xsdFilesPaths)
   {
      return Arrays.stream(xsdFilesPaths)
                   .map(StreamSource::new)
                   .collect(Collectors.toList())
                   .toArray(new StreamSource[xsdFilesPaths.length]);
   }

   /**
    * Generates array of StreamSource instances representing XSDs
    * associated with the file paths/names provided and uses
    * pre-JDK 8 Java APIs.
    *
    * This method can be commented out (or better yet, removed
    * altogether) if using JDK 8 or later.
    *
    * @param xsdFilesPaths String representations of paths/names
    *    of XSD files.
    * @return StreamSource instances representing XSDs.
    * @deprecated Use generateStreamSourcesFromXsdPathsJdk8 instead
    *    when JDK 8 or later is available.
    */
   @Deprecated
   private static StreamSource[] generateStreamSourcesFromXsdPathsJdk7(
      final String[] xsdFilesPaths)
   {
      // Diamond operator used here requires JDK 7; add type of
      // StreamSource to generic specification of ArrayList for
      // JDK 5 or JDK 6
      final List<StreamSource> streamSources = new ArrayList<>();
      for (final String xsdPath : xsdFilesPaths)
      {
         streamSources.add(new StreamSource(xsdPath));
      }
      return streamSources.toArray(new StreamSource[xsdFilesPaths.length]);
   }

   /**
    * Validates provided XML against provided XSD.
    *
    * @param arguments XML file to be validated (first argument) and
    *    XSD against which it should be validated (second and later
    *    arguments).
    */
   public static void main(final String[] arguments)
   {
      if (arguments.length < 2)
      {
         out.println("\nUSAGE: java XmlValidator <xmlFile> <xsdFile1> ... <xsdFileN>\n");
         out.println("\tOrder of XSDs can be significant (place XSDs that are");
         out.println("\tdependent on other XSDs after those they depend on)");
         System.exit(-1);
      }
      // Arrays.copyOfRange requires JDK 6; see
      // http://stackoverflow.com/questions/7970486/porting-arrays-copyofrange-from-java-6-to-java-5
      // for additional details for versions of Java prior to JDK 6.
      final String[] schemas = Arrays.copyOfRange(arguments, 1, arguments.length);
      validateXmlAgainstXsds(arguments[0], schemas);
   }
}

Despite what the length of this post might initially suggest, using Java to validate XML against an XSD is fairly straightforward. The sample application shown and explained here attempts to demonstrate that and is a useful tool for simple command line validation of XML documents against specified XSDs. One could easily port this to Groovy to be even more script-friendly. As mentioned earlier, this simple tool requires JDK 8 as currently written but could be easily adapted to work on JDK 5, JDK 6, or JDK 7.
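
For readers who want only the core of the approach, the following is a minimal sketch rather than the XmlValidator class itself. It uses the standard javax.xml.validation API, but the class name InMemoryXsdValidationSketch, the isValid method, and the one-element "count" schema are hypothetical and chosen only for illustration. It validates in-memory strings instead of files so that it stays self-contained.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical illustration class; not part of the XmlValidator tool.
final class InMemoryXsdValidationSketch
{
   /**
    * Reports whether the provided XML text validates against the
    * provided XSD text.
    */
   static boolean isValid(final String xml, final String xsd)
   {
      try
      {
         final SchemaFactory factory =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
         final Schema schema =
            factory.newSchema(new StreamSource(new StringReader(xsd)));
         // With no ErrorHandler set, validate throws on the first error.
         schema.newValidator().validate(
            new StreamSource(new StringReader(xml)));
         return true;
      }
      catch (SAXException | IOException invalidOrUnreadable)
      {
         return false;
      }
   }

   public static void main(final String[] arguments)
   {
      // Made-up schema: a single element that must hold an integer.
      final String xsd =
           "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
         + "<xs:element name='count' type='xs:int'/>"
         + "</xs:schema>";
      System.out.println(isValid("<count>42</count>", xsd));
      System.out.println(isValid("<count>forty-two</count>", xsd));
   }
}
```

Collapsing every failure into false keeps the sketch short; a real tool would more likely report the exception's message, as a command line validator needs to say what went wrong.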

UPDATE (20 March 2015): I have pushed the Java class shown in this post (XmlValidator.java) onto the GitHub repository dustinmarx/xmlutilities.

Friday, February 13, 2015

Writing Groovy's groovy.util.slurpersupport.GPathResult (XmlSlurper) Content as XML

In a previous blog post, I described using XmlNodePrinter to present XML parsed with XmlParser in a nice format to standard output, as a Java String, and in a new file. Because XmlNodePrinter works with groovy.util.Node instances, it works well with XmlParser, but doesn't work so well with XmlSlurper because XmlSlurper deals with instances of groovy.util.slurpersupport.GPathResult rather than instances of groovy.util.Node. This post looks at how groovy.xml.XmlUtil can be used to present GPathResult objects that result from slurping XML to standard output, as a Java String, and to a new file.

This post's first code listing demonstrates slurping XML with XmlSlurper and writing the slurped GPathResult to standard output using XmlUtil's serialize(GPathResult, OutputStream) method and the System.out handle.

slurpAndPrintXml.groovy : Writing XML to Standard Output
#!/usr/bin/env groovy

// slurpAndPrintXml.groovy
//
// Use Groovy's XmlSlurper to "slurp" provided XML file and use XmlUtil to write
// XML content out to standard output.

if (args.length < 1)
{
   println "USAGE: groovy slurpAndPrintXml.groovy <xmlFile>"
   System.exit(-1)
}

String xmlFileName = args[0]

xml = new XmlSlurper().parse(xmlFileName)

import groovy.xml.XmlUtil
XmlUtil xmlUtil = new XmlUtil()
xmlUtil.serialize(xml, System.out)

The next code listing demonstrates use of XmlUtil's serialize(GPathResult) method to serialize the GPathResult to a Java String.

slurpXmlToString.groovy : Writing XML to Java String
#!/usr/bin/env groovy

// slurpXmlToString.groovy
//
// Use Groovy's XmlSlurper to "slurp" provided XML file and use XmlUtil to
// write the XML content to a String.

if (args.length < 1)
{
   println "USAGE: groovy slurpXmlToString.groovy <xmlFile>"
   System.exit(-1)
}

String xmlFileName = args[0]

xml = new XmlSlurper().parse(xmlFileName)

import groovy.xml.XmlUtil
XmlUtil xmlUtil = new XmlUtil()
String xmlString = xmlUtil.serialize(xml)
println "String:\n${xmlString}"

The third code listing demonstrates use of XmlUtil's serialize(GPathResult, Writer) method to write the GPathResult representing the slurped XML to a file via a FileWriter instance.

slurpAndSaveXml.groovy : Writing XML to File
#!/usr/bin/env groovy

// slurpAndSaveXml.groovy
//
// Uses Groovy's XmlSlurper to "slurp" XML and then uses Groovy's XmlUtil
// to write slurped XML back out to a file with the provided name. The
// first argument this script expects is the path/name of the XML file to be
// slurped and the second argument expected by this script is the path/name of
// the file to which the XML should be saved/written.

if (args.length < 2)
{
   println "USAGE: groovy slurpAndSaveXml.groovy <xmlFile> <outputFile>"
   System.exit(-1)
}

String xmlFileName = args[0]
String outputFileName = args[1]
xml = new XmlSlurper().parse(xmlFileName)

import groovy.xml.XmlUtil
XmlUtil xmlUtil = new XmlUtil()
xmlUtil.serialize(xml, new FileWriter(new File(outputFileName)))

The examples in this post have demonstrated writing/serializing XML slurped into GPathResult objects through the use of XmlUtil methods. The XmlUtil class also provides methods for serializing the groovy.util.Node instances that XmlParser provides. The three methods accepting an instance of Node are similar to the three methods above for instances of GPathResult. Similarly, other methods of XmlUtil provide similar support for XML represented as instances of org.w3c.dom.Element, Java String, and groovy.lang.Writable.

The XmlNodePrinter class covered in my previous blog post can be used to serialize XmlParser's parsed XML represented as a Node. The XmlUtil class also can be used to serialize XmlParser's parsed XML represented as a Node but offers the additional advantage of being able to serialize XmlSlurper's slurped XML represented as a GPathResult.
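
For comparison, here is a rough plain-Java analog of the same round trip for an org.w3c.dom.Element, using only the JDK's javax.xml.transform API. This is a sketch under stated assumptions and not a description of how XmlUtil is implemented; the class name DomSerializationSketch and its parse and serialize methods are hypothetical names used only for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical illustration class; it does not use Groovy's XmlUtil.
final class DomSerializationSketch
{
   /** Parses XML text and returns its root element. */
   static Element parse(final String xml)
   {
      try
      {
         return DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)))
            .getDocumentElement();
      }
      catch (ParserConfigurationException | SAXException | IOException badXml)
      {
         throw new IllegalArgumentException(badXml);
      }
   }

   /** Writes the element (and its children) into a Java String. */
   static String serialize(final Element element)
   {
      try
      {
         final Transformer transformer =
            TransformerFactory.newInstance().newTransformer();
         transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
         final StringWriter writer = new StringWriter();
         transformer.transform(new DOMSource(element), new StreamResult(writer));
         return writer.toString();
      }
      catch (TransformerException unableToSerialize)
      {
         throw new IllegalStateException(unableToSerialize);
      }
   }

   public static void main(final String[] arguments)
   {
      System.out.println(serialize(parse("<food><fruit>Apple</fruit></food>")));
   }
}
```

Passing new StreamResult(System.out) or a StreamResult wrapping a FileWriter instead of the StringWriter would cover the standard output and file cases in the same way.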

Monday, February 9, 2015

Writing Groovy's groovy.util.Node (XmlParser) Content as XML

Groovy's XmlParser makes it easy to parse an XML file, XML input stream, or XML string using one of its overloaded parse methods (or parseText in the case of the String). The XML content parsed with any of these methods is made available as a groovy.util.Node instance. This blog post describes how to make the return trip and write the content of the Node back to XML in alternative formats such as a File or a String.

Groovy's MarkupBuilder provides a convenient approach for generating XML from Groovy code. For example, I demonstrated writing XML based on SQL query results in the post GroovySql and MarkupBuilder: SQL-to-XML. However, when one wishes to write/serialize XML from a Groovy Node, an easy and appropriate approach is to use XmlNodePrinter as demonstrated in Updating XML with XmlParser.

The next code listing, parseAndPrintXml.groovy, demonstrates use of XmlParser to parse XML from a provided file and use of XmlNodePrinter to write that Node parsed from the file to standard output as XML.

parseAndPrintXml.groovy : Writing XML to Standard Output
#!/usr/bin/env groovy

// parseAndPrintXml.groovy
//
// Uses Groovy's XmlParser to parse provided XML file and uses Groovy's
// XmlNodePrinter to print the contents of the Node parsed from the XML with
// XmlParser to standard output.

if (args.length < 1)
{
   println "USAGE: groovy parseAndPrintXml.groovy <XMLFile>"
   System.exit(-1)
}

XmlParser xmlParser = new XmlParser()
Node xml = xmlParser.parse(new File(args[0]))
XmlNodePrinter nodePrinter = new XmlNodePrinter(preserveWhitespace:true)
nodePrinter.print(xml)

Putting aside the comments and the code for checking command line arguments, there are really only four lines (lines 15-18) in the above code listing of significance to this discussion. These four lines demonstrate instantiating an XmlParser (line 15), using the instance of XmlParser to "parse" a File instance based on a provided argument file name (line 16), instantiating an XmlNodePrinter (line 17), and using that XmlNodePrinter instance to "print" the parsed XML to standard output (line 18).

Although writing XML to standard output can be useful for a user to review or to redirect output to another script or tool, there are times when it is more useful to have access to the parsed XML as a String. The next code listing is just a bit more involved than the last one and demonstrates use of XmlNodePrinter to write the parsed XML contained in a Node instance as a Java String.

parseXmlToString.groovy : Writing XML to Java String
#!/usr/bin/env groovy

// parseXmlToString.groovy
//
// Uses Groovy's XmlParser to parse provided XML file and uses Groovy's
// XmlNodePrinter to write the contents of the Node parsed from the XML with
// XmlParser into a Java String.

if (args.length < 1)
{
   println "USAGE: groovy parseXmlToString.groovy <XMLFile>"
   System.exit(-1)
}

XmlParser xmlParser = new XmlParser()
Node xml = xmlParser.parse(new File(args[0]))
StringWriter stringWriter = new StringWriter()
XmlNodePrinter nodePrinter = new XmlNodePrinter(new PrintWriter(stringWriter))
nodePrinter.setPreserveWhitespace(true)
nodePrinter.print(xml)
String xmlString = stringWriter.toString()
println "XML as String:\n${xmlString}"

As the just-shown code listing demonstrates, one can instantiate an instance of XmlNodePrinter that writes to a PrintWriter that was instantiated with a StringWriter. This StringWriter ultimately makes the XML available as a Java String.

Writing XML from a groovy.util.Node to a File is very similar to writing it to a String with a FileWriter used instead of a StringWriter. This is demonstrated in the next code listing.

parseAndSaveXml.groovy : Write XML to File
#!/usr/bin/env groovy

// parseAndSaveXml.groovy
//
// Uses Groovy's XmlParser to parse provided XML file and uses Groovy's
// XmlNodePrinter to write the contents of the Node parsed from the XML with
// XmlParser to file with provided name.

if (args.length < 2)
{
   println "USAGE: groovy parseAndSaveXml.groovy <sourceXMLFile> <targetXMLFile>"
   System.exit(-1)
}

XmlParser xmlParser = new XmlParser()
Node xml = xmlParser.parse(new File(args[0]))
FileWriter fileWriter = new FileWriter(args[1])
XmlNodePrinter nodePrinter = new XmlNodePrinter(new PrintWriter(fileWriter))
nodePrinter.setPreserveWhitespace(true)
nodePrinter.print(xml)

I don't show it in this post, but the value of being able to write a Node back out as XML often comes after modifying that Node instance. Updating XML with XmlParser demonstrates the type of functionality that can be performed on a Node before serializing the modified instance back out.

Monday, December 16, 2013

Searching Subversion Logs with Groovy

There are times when I want to quickly search a Subversion repository by author, by range of revisions, and/or by commit messages. Krzysztof Kotowicz has posted the blog post Grep Subversion log messages with svn-grep that introduces svn-grep, a bash script making use of the command line XML toolkit called xmlstarlet (xmlstarlet is also available on Windows). This is a pretty useful script in and of itself, but it gave me an idea for a Groovy-based script that could run on multiple (all JVM-supported) platforms.

searchSvnLog.groovy
#!/usr/bin/env groovy
//
// searchSvnLog.groovy
//
def cli = new CliBuilder(
   usage: 'searchSvnLog.groovy -r <revision1> -p <revision2> -a <author> -s <stringInMessage>')
import org.apache.commons.cli.Option
cli.with
{
   h(longOpt: 'help', 'Usage Information', required: false)
   r(longOpt: 'revision1', 'First SVN Revision', args: 1, required: false)
   p(longOpt: 'revision2', 'Last SVN Revision', args: 1, required: false)
   a(longOpt: 'author', 'Revision Author', args: 1, required: false)
   s(longOpt: 'search', 'Search String', args: 1, required: false)
   t(longOpt: 'target', 'SVN target directory/URL', args: 1, required: true)
}
def opt = cli.parse(args)

if (!opt) return
if (opt.h) cli.usage()

Integer revision1 = opt.r ? (opt.r as int) : null
Integer revision2 = opt.p ? (opt.p as int) : null
if (revision1 != null && revision2 != null && revision1 > revision2)
{
   println "It makes no sense to search for revisions ${revision1} through ${revision2}."
   System.exit(-1)
}
String author = opt.a ? (opt.a as String) : null
String search = opt.s ? (opt.s as String) : null
String logTarget = opt.t

String command = "svn log -r ${revision1 ?: 1}:${revision2 ?: 'HEAD'} ${logTarget} --xml"
def proc = command.execute()
StringBuilder standard = new StringBuilder()
StringBuilder error = new StringBuilder()
proc.waitForProcessOutput(standard, error)
def returnedCode = proc.exitValue() 
if (returnedCode != 0)
{
   println "ERROR: Returned code ${returnedCode}"
}

def xmlLogOutput = standard.toString()
def log = new XmlSlurper().parseText(xmlLogOutput)
def logEntries = new TreeMap<Integer, LogEntry>()
log.logentry.each
{ svnLogEntry ->
   Integer logRevision = Integer.valueOf(svnLogEntry.@revision as String)
   String message = svnLogEntry.msg as String
   String entryAuthor = svnLogEntry.author as String
   if (   (!revision1 || revision1 <= logRevision)
       && (!revision2 || revision2 >= logRevision)
       && (!author    || author == entryAuthor)
       && (!search    || message.toLowerCase().contains(search.toLowerCase()))
      )
   {
      def logEntry =
         new LogEntry(logRevision, svnLogEntry.author as String,
                      svnLogEntry.date as String, message)
      logEntries.put(logRevision, logEntry)
   }
}
logEntries.each
{ logEntryRevisionId, logEntry ->
   println "${logEntryRevisionId} : ${logEntry.author}/${logEntry.date} : ${logEntry.message}"
}

One thing that makes this script much easier to write is the ability of Subversion's log command to write its output in XML format with the --xml flag. Although XML has been the subject of significant criticism in recent years, one of the things I've liked about its availability is the widespread tool support for writing and reading XML. Subversion's ability to write certain types of output in XML is a good example of this. Without XML, the script would have required custom parsing code to be written to parse the non-standard SVN log output. Because Subversion supports writing to the standard XML format for its output, any XML-aware tool can read it. In this case, I leveraged Groovy's incredibly easy XML slurping (XML parsing) capability.
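
To illustrate the point that any XML-aware tool can read this output, the following is a minimal plain-Java sketch that filters log entries by author using the JDK's DOM parser. The logentry, author, and msg element names and the revision attribute are the ones the Groovy script above reads; the class name SvnLogXmlSketch, the entriesByAuthor method, and the sample log content are hypothetical and made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical illustration class; not a port of searchSvnLog.groovy.
final class SvnLogXmlSketch
{
   /** Returns "revision : author : message" for each entry by the author. */
   static List<String> entriesByAuthor(final String svnLogXml, final String author)
   {
      final Document document;
      try
      {
         document = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(svnLogXml)));
      }
      catch (ParserConfigurationException | SAXException | IOException badXml)
      {
         throw new IllegalArgumentException(badXml);
      }
      final NodeList logEntries = document.getElementsByTagName("logentry");
      final List<String> matches = new ArrayList<>();
      for (int index = 0; index < logEntries.getLength(); index++)
      {
         final Element entry = (Element) logEntries.item(index);
         final String entryAuthor = childText(entry, "author");
         // A null author means "do not filter by author".
         if (author == null || author.equals(entryAuthor))
         {
            matches.add(entry.getAttribute("revision") + " : " + entryAuthor
               + " : " + childText(entry, "msg"));
         }
      }
      return matches;
   }

   /** Text of the first child element with the given name, or "" if absent. */
   private static String childText(final Element parent, final String name)
   {
      final NodeList children = parent.getElementsByTagName(name);
      return children.getLength() > 0 ? children.item(0).getTextContent() : "";
   }

   public static void main(final String[] arguments)
   {
      // Made-up sample shaped like the elements the Groovy script reads.
      final String sample =
           "<log>"
         + "<logentry revision='1'><author>dustin</author>"
         + "<date>2013-12-01T00:00:00.000000Z</date><msg>Initial import</msg></logentry>"
         + "<logentry revision='2'><author>pat</author>"
         + "<date>2013-12-02T00:00:00.000000Z</date><msg>Fix typo</msg></logentry>"
         + "</log>";
      for (final String line : entriesByAuthor(sample, "dustin"))
      {
         System.out.println(line);   // prints "1 : dustin : Initial import"
      }
   }
}
```

The Java version needs noticeably more ceremony than the XmlSlurper equivalent, which is part of why Groovy is attractive for this kind of script.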

The script also uses Groovy's enhanced (GDK) Process class as I briefly described in my recent post Sublime Simplicity of Scripting with Groovy.

Groovy's built-in command-line support (CliBuilder) is used in the script to accept parameters for narrowing the search (such as applicable revisions, authors who committed, or strings to search the commit comments for). The one required parameter is the "target" which can be a file, directory, or URL.

The script references a Groovy class called LogEntry and the code listing for that class is shown next.

LogEntry.groovy
@groovy.transform.Canonical
class LogEntry
{
   int revision
   String author
   String date
   String message
}

That simple-looking LogEntry class is much more powerful than it might first appear. Because it's Groovy, there are automatically setter/getter methods available for the four attributes. Thanks to the @Canonical annotation, it also supports a constructor, equals, hashCode, and toString methods. In other words, this class of under ten lines total has accessor and mutator methods as well as common class methods overridden appropriately for it.
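
To give a feel for how much the annotation saves, here is a hand-written Java approximation of the same class. It is only a sketch: the class name LogEntrySketch is hypothetical, the toString format is loosely modeled on the Groovy output rather than guaranteed to match it, and it does not reproduce everything @Canonical provides.

```java
import java.util.Objects;

// Hypothetical hand-written counterpart to the Groovy LogEntry class.
final class LogEntrySketch
{
   private int revision;
   private String author;
   private String date;
   private String message;

   public LogEntrySketch(
      final int revision, final String author,
      final String date, final String message)
   {
      this.revision = revision;
      this.author = author;
      this.date = date;
      this.message = message;
   }

   // Accessor and mutator methods that Groovy supplies automatically.
   public int getRevision() { return this.revision; }
   public void setRevision(final int newRevision) { this.revision = newRevision; }
   public String getAuthor() { return this.author; }
   public void setAuthor(final String newAuthor) { this.author = newAuthor; }
   public String getDate() { return this.date; }
   public void setDate(final String newDate) { this.date = newDate; }
   public String getMessage() { return this.message; }
   public void setMessage(final String newMessage) { this.message = newMessage; }

   @Override
   public boolean equals(final Object other)
   {
      if (this == other) { return true; }
      if (!(other instanceof LogEntrySketch)) { return false; }
      final LogEntrySketch that = (LogEntrySketch) other;
      return this.revision == that.revision
          && Objects.equals(this.author, that.author)
          && Objects.equals(this.date, that.date)
          && Objects.equals(this.message, that.message);
   }

   @Override
   public int hashCode()
   {
      return Objects.hash(this.revision, this.author, this.date, this.message);
   }

   @Override
   public String toString()
   {
      return "LogEntrySketch(" + this.revision + ", " + this.author + ", "
         + this.date + ", " + this.message + ")";
   }
}
```

Even written compactly, the Java version runs to several dozen lines for what the Groovy class expresses in eight.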

Conclusion

Groovy offers numerous features to make script writing easier. In this post, I used an example of "searching" Subversion commits via the Subversion log command (and its --xml option) to demonstrate some of these useful Groovy scripting features (command line parameter parsing, native operating system integration, and easy XML parsing). Along the way, some of Groovy's nice syntax advantages (closures, dynamic typing, GString value placeholders) were also used.

Saturday, July 20, 2013

Escaping XML with Groovy 2.1

When posting source code to my blog, I often need to convert less-than signs (<) and greater-than signs (>) to their respective entity references so that they are not confused with HTML tags when the browser renders the output. I have often done this using quick search-and-replace syntax like %s/</\&lt;/g and %s/>/\&gt;/g with vim or Perl. However, Groovy 2.1 introduced a method to do this and in this post I demonstrate a Groovy script that makes use of that groovy.xml.XmlUtil.escapeXml(String) method.

escapeXml.groovy
#!/usr/bin/env groovy
/*
 * escapeXml.groovy
 *
 * Requires Groovy 2.1 or later.
 */
if (args.length < 1)
{
   println "USAGE: groovy escapeXml.groovy <xmlFileToBeProcessed>"
   System.exit(-1)
}
def inputFileName = args[0]
println "Processing ${inputFileName}..."
def inputFile = new File(inputFileName)
String outputFileName = inputFileName + ".escaped"
def outputFile = new File(outputFileName)
if (outputFile.createNewFile())
{
   outputFile.text = groovy.xml.XmlUtil.escapeXml(inputFile.text)
}
else
{
   println "Unable to create file ${outputFileName}"
}

The XmlUtil.escapeXml method is intended to, as its Groovydoc states, "escape the following characters " ' & < > with their XML entities." Running source code through it helps to convert symbols to XML entity references that will be rendered properly by the browser. This is particularly helpful with Java code that uses generics, for example.

The Groovydoc states that the following transformations from symbols to corresponding entity references are supported:

Symbol   Entity Reference
"        &quot;
'        &apos;
&        &amp;
<        &lt;
>        &gt;

One of the advantages of this approach is that I can escape all five of these special symbols in an entire String or file with a single command rather than one symbol at a time.
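
The five replacements listed above can be sketched in plain Java as follows. This is an illustration of the mapping only and is not Groovy's implementation; the class name XmlEscapeSketch and its escapeXml method are hypothetical.

```java
// Hypothetical illustration class; it does not call groovy.xml.XmlUtil.
final class XmlEscapeSketch
{
   /** Replaces " ' & < > with their XML entity references in one pass. */
   static String escapeXml(final String original)
   {
      final StringBuilder escaped = new StringBuilder(original.length());
      for (int index = 0; index < original.length(); index++)
      {
         final char character = original.charAt(index);
         switch (character)
         {
            case '"':  escaped.append("&quot;"); break;
            case '\'': escaped.append("&apos;"); break;
            case '&':  escaped.append("&amp;");  break;
            case '<':  escaped.append("&lt;");   break;
            case '>':  escaped.append("&gt;");   break;
            default:   escaped.append(character);
         }
      }
      return escaped.toString();
   }

   public static void main(final String[] arguments)
   {
      // prints: List&lt;String&gt; names = new ArrayList&lt;&gt;();
      System.out.println(escapeXml("List<String> names = new ArrayList<>();"));
   }
}
```

Working through the text one character at a time avoids the classic mistake of chained replacements, where escaping the ampersand last would corrupt the entities already produced for the other four symbols.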

The Groovydoc for this XmlUtil.escapeXml method also states things that this method does not do:

  • "Does not escape control characters" [use XmlUtil.escapeControlCharacters(String) for this]
  • "Does not support DTDs or external entities"
  • "Does not treat surrogate pairs specially"
  • "Does not perform Unicode validation on its input"

My example above showed a Groovy script file that makes use of XmlUtil.escapeXml(String), but it can also be run inline on the command-line. This is done in DOS, for example, as shown here:

type escapeXml.groovy | groovy -e "println groovy.xml.XmlUtil.escapeXml(System.in.text)"

That command just shown will take the provided file (escapeXml.groovy itself in this case) and render output with the specific symbols replaced with entity references. It could be handled the same way in Linux/Unix with "cat" rather than "type." This is shown in the next screen snapshot.

This blog post has shown how XmlUtil.escapeXml(String) can be used within a script or on the command-line to escape certain commonly problematic XML characters to their entity references. Although not shown here, one could embed such code within a Java application as well.

Thursday, August 25, 2011

NetBeans (7.0.1) Has An XML Schema Editor!

Just after clicking on the "Publish Post" button to publish my latest blog post (Adding Common Methods to JAXB-Generated Java Classes (JAXB2 Basics Plugins)), I saw Geertjan Wielenga's post XML Schema Editor in NetBeans IDE 7.0.1. The irony is that I had thought about looking for a NetBeans XSD editor plugin when writing my post on generating Java classes with common methods from XSD files using JAXB and JAXB2 Basics Plugins. However, because my XSD for the example was trivially simple, I simply used NetBeans's general XML-completion capabilities to help me generate the XSD for my example. In this blog post, I look at using the XML Schema Editor plugin mentioned in Geertjan's post with the XSD from my previous post.


Although I've been using NetBeans 7.0 for months now, I ran the update tool on it to start with and am now using NetBeans 7.0.1. I then followed Geertjan's instructions and registered the update center with the URI he provides (http://deadlock.netbeans.org/hudson/job/xml/lastSuccessfulBuild/artifact/build/updates/updates.xml). This is shown in the next screen snapshot.


Once Geertjan's specified update center is registered (I registered it under the name "NetBeans Deadlock"), the "XML Tools" plugin is available in the "Available Plugins" tab as shown in the next screen snapshot.


When I click on the "Install" button in the lower left corner, the NetBeans IDE Installer comes up. What's interesting about the "License Agreement" is that it lists a whole set of useful XML-related functions apparently supported by the plugin, including "XML Schema Support." This is shown in the next screen snapshot.


With the XML Tools plugin installed, it's now time to see how it looks with the XSD file used in my previous post. With the plugin installed, clicking on the XSD file's name in the "Files" window opens up three possible views ("Source", "Schema" and "Design"). The "Design" view is shown in the next screen snapshot.


There is a palette available for graphically designing an XML Schema Definition. This is much nicer than hand-typing it like I did. You can drag an attribute or other element from the palette over onto the design and then type in the appropriate name.

The "Tree View" of the Schema tab of the XSD file is shown in the next screen snapshot.


I like the "Tree View" for quickly ascertaining the hierarchical nature of the XSD. In the same "Schema" tab, the "Columns" view is also available as indicated in the next screen snapshot.


The "Validate XML" feature is also useful in the "Schema" tab. The results of clicking on the icon with two arrows pointing down is shown next.


I don't show the "Source" tab here because it's the standard source code editor window one has for XSDs without the plugin.

I probably would not have found this plugin even if I had looked for it, because I needed the recommendation to register the particular update center called out in Geertjan's post.

It is nice to have an XML Schema Editor in NetBeans. I don't manipulate XSD files very often, but this will make it even easier and quicker to create and maintain them in the future when I need XSD files. This plugin for handling XML Schema Definitions is a welcome addition to NetBeans's XML support. Thanks, Geertjan, for the tip!

Tuesday, March 22, 2011

The New XML Stack in JDK 7

In the summary of new JDK 7 features, one of the categories is called Web and its major subcategory is Update the XML Stack. This support is available as of Milestone 12 (M12 AKA "Developer Preview" AKA "beta release") and is described as "Upgrade the JAXP, JAXB, and JAX-WS APIs to the most recent stable versions." In this post, I look at the versions of JAXP, JAXB, and JAX-WS associated with JDK 7 preview release (build 1.7.0-ea-b134).

In the article Better JPA, Better JAXB, and Better Annotations Processing with Java SE 6, I wrote about some of the advantages of Java SE 6 having JAXB 2.0 baked into it. It is not uncommon for future versions of Java to include newer versions of dependent libraries and JDK 7 updates JAXB from JAXB 2.1.10 (version since Java SE 6 Update 14) to JAXB 2.2.3 as shown in the next screen snapshot.


As the above screen snapshot shows, the xjc compiler is an easy command-line approach to determining the version of JAXB associated with a particular Java distribution (assuming that xjc is not on the path from a different location). The schemagen tool can also be used to determine the JAXB version.

It is similarly easy to determine the version of JAX-WS APIs supported in the Java SE 7 release by asking associated command line tools for their versions. The following screen snapshot demonstrates doing this with the tools wsgen and wsimport.


As indicated in the above screen snapshot, the JDK 1.7.0 b134 release has JAX-WS 2.2.2 associated with it (JAX-WS 2.1.6 was associated with Java 6 as of Java SE 6 Update 14).

I don't know of an equivalent method to those shown above to determine from the command line what version of JAXP is included with a particular Java distribution. Fortunately, the JDK 7 Documentation includes the JAXP page that states that "the Java Platform, Standard Edition version 7.0 includes JAXP 1.4" and explains that "JAXP 1.4 is a maintenance release of JAXP 1.3 with support for the Streaming API for XML (StAX)."

I expect that the anticipated Release Notes for JDK 7 will formally and conveniently list the versions of these XML-related products included with the JDK distribution.

Saturday, February 19, 2011

Recent Posts of Significant Interest (Java Security, XML, Cloud Computing)

I maintain a list of topics that I would like to blog on sometime in the future. This list continues to grow as I cannot blog quickly enough to keep up with the ideas. In some cases, I simply cannot write the full blog post I'd like in response to a really good blog post and, when enough of them are gathered up, I publish a post like this one that covers multiple blog posts at the same time. In this post, I reference recent online posts that I have really enjoyed on topics such as Java security issues, Javadoc, XML, and cloud computing.


JavaDoc: The unloved child. A pragmatic approach.

One of the things I think is done better in Java than in just about any other language I have used is documentation via Javadoc. I frequently use Javadoc-generated documentation for the Java SE, for the Java EE, and for other products in the Java ecosystem such as JFreeChart and Groovy (which has three!: GDK, Javadoc for Java Classes, and Groovydoc for Groovy and Java Classes). I find that others' Javadoc generally makes it easier to use their APIs (assuming correct documentation!) and I like the ability to have ready access to API documentation online and in my favorite IDE. I also enjoy being able to document how to use my APIs, packages, and classes in package descriptions. This allows the clients of my APIs to see examples of how to use my API and nicely includes the documentation in the same area/file as the source code itself. In the post JavaDoc: The unloved child. A pragmatic approach., Markus Eisele concisely describes why Javadoc is useful and suggests some tips for making the writing and maintaining of Javadoc comments more effective.


XML: Contrary to popular belief, it doesn't always kill babies

We software developers (as a whole) seem to be prone to violent swings between positions on things. Derek Thurn's post XML: Contrary to popular belief, it doesn't always kill babies does a nice job of pointing out how this happened with XML (loved and revered for a while and now not discussed in polite company). As with most of these extreme shifts, neither extreme was appropriate. XML was overhyped for a while and was used in many unnatural ways, but now developers in general (if the blogosphere is at all indicative) seem to wish to avoid XML without regard to the problem. I believe the appropriate position is somewhere in between. Thurn's post briefly discusses situations where alternatives like JSON and YAML are preferable to XML and then outlines two situations where he believes XML is appropriate (and I agree). I also like that Thurn states "XPath turned out to be a godsend" (something I have found as well in my work with XQuery and other areas where XPath support is useful). It's also very difficult to argue with Thurn's use of SOAP as an example of "terrible things" people have done with XML.


10 Reasons to Say “No” to Cloud Computing?

I like posts that challenge the Lemming Effect. The previously discussed XML post challenges the potential lemming behavior of avoiding XML regardless of whether the situation warrants it and is an example of bucking the lemmings' general direction of avoiding something (negative reaction). On the other side, it can be just as useful to avoid thoughtlessly following the lemmings in adopting something (positive reaction). I like 10 Reasons to Say “No” to Cloud Computing? because the author starts out with this:
I have been writing about the benefits of migrating to the Cloud in previous articles but it is also important to highlight in which circumstances the Cloud Computing route may not be the appropriate one.

When I first read this paragraph, I was afraid this post was going to be another one of those that would have ten reasons such as "you want to do things the hard way" and "you like a good challenge." Fortunately, this post turned out to be what was really advertised. The ten reasons are good ones to think about and I also appreciate the author's pointing out that "Cloud Computing is NOT an all-or-nothing decision." No tool or methodology can be all things to all people all the time and I am suspicious of anyone who claims their favorite does just that. It is much more useful to read from an evangelist of an approach about where that approach fits or does not fit and this post fits into that more useful category. Some people think cloud computing is not a good idea, but it (or portions of it) can be useful when applied correctly in the appropriate situations.


Java Security

Two recent articles of interest related to Java security are RSA: Java is the Most Vulnerable Browser Plug-in and Google extensions could aid Java security.

In "RSA: Java is the Most Vulnerable Browser Plug-in," Sean Michael Kerner reports that Qualys CTO Wolfgang Kandek stated that 42% of monitored web users had "vulnerable out-of-date" Java plug-ins and that this plug-in was the most frequently out-of-date and vulnerable of those measured. Other plug-ins that were vulnerable due to being frequently out of date include Adobe Reader, Apple QuickTime, and Adobe Flash. As I read this, the ranking is not of how vulnerable one plug-in is as compared to another, but how vulnerable they are because they are out-of-date. Kerner also points out that Cisco had reported that Java vulnerabilities are now more exploited than those in Adobe Acrobat and Reader.

Joab Jackson writes that Google Contracts (Contracts for Java or cofoja), which is often advertised as and thought of as an approach for making it easier to appropriately use methods (above and beyond what the previously mentioned Javadoc provides), can also help make Java code more secure. He states that this project based off of Modern Jass can provide some of the same security benefits to Java that Eiffel developers claim are inherent in Eiffel's support for Design by Contract (DbC).


Conclusion

There are far too many interesting and insightful posts about software development to keep up with all of them. In this post, I've tried to summarize and publicize some recent posts that I believe are worth a look if you have not read them already.

Saturday, February 5, 2011

Generating XML Schema with schemagen and Groovy

I have previously blogged on several utilitarian tools that are provided with the Java SE 6 HotSpot SDK such as jstack, javap, and so forth. In this post, I focus on another tool in the same $JAVA_HOME/bin (or %JAVA_HOME%\bin) directory: schemagen. Although schemagen is typically used in conjunction with web services and/or JAXB, it can be useful in other contexts as well. Specifically, it can be used as an easy way to create a starting point XML Schema Definition (XSD) for someone who is more comfortable with Java than with XML Schema.

We'll begin with a simple Java class called Person to demonstrate the utility of schemagen. This is shown in the next code listing.

package dustin.examples;

public class Person
{
   private String lastName;

   private String firstName;

   private char middleInitial;

   private String identifier;

   /**
    * No-arguments constructor required for 'schemagen' to create XSD from
    * this Java class.  Without this "no-arg default constructor," this error
    * message will be displayed when 'schemagen' is attempted against it:
    *
    *      error: dustin.examples.Person does not have a no-arg default
    *      constructor.
    */
   public Person() {}

   public Person(final String newLastName, final String newFirstName)
   {
      this.lastName = newLastName;
      this.firstName = newFirstName;
   }

   public Person(
      final String newLastName,
      final String newFirstName,
      final char newMiddleInitial)
   {
      this.lastName = newLastName;
      this.firstName = newFirstName;
      this.middleInitial = newMiddleInitial;
   }

   public String getLastName()
   {
      return this.lastName;
   }

   public void setLastName(final String newLastName)
   {
      this.lastName = newLastName;
   }

   public String getFirstName()
   {
      return this.firstName;
   }

   public void setFirstName(final String newFirstName)
   {
      this.firstName = newFirstName;
   }

   public char getMiddleInitial()
   {
      return this.middleInitial;
   }
}

The class above is very simple, but it is adequate for a first example of employing schemagen. As the comment on the no-arguments constructor in the above code states, a constructor without arguments (sometimes called a "default constructor") must be available in the class. Because this class declares other constructors, the compiler does not supply one implicitly, so the no-args constructor must be specified explicitly. I also intentionally provided get/set (accessor/mutator) methods for two of the fields, only an accessor for one of the fields, and neither for the remaining field. This demonstrates that, by default, schemagen includes an attribute in the generated schema only when both get and set methods are provided for it.
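The get/set rule just described can be previewed without running schemagen at all. The following is a minimal sketch and not a description of how schemagen works internally: it uses the standard java.beans.Introspector to list the properties that have both a get method and a set method, which approximates the default binding behavior. The names ReadWriteProperties and PersonSketch are hypothetical, and PersonSketch is only a trimmed stand-in for Person.

```java
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class ReadWriteProperties
{
   /** Trimmed stand-in for Person: same fields, same mix of get/set methods. */
   public static class PersonSketch
   {
      private String lastName;
      private String firstName;
      private char middleInitial;
      private String identifier;   // neither get nor set method

      public String getLastName() { return this.lastName; }
      public void setLastName(final String newLastName) { this.lastName = newLastName; }

      public String getFirstName() { return this.firstName; }
      public void setFirstName(final String newFirstName) { this.firstName = newFirstName; }

      public char getMiddleInitial() { return this.middleInitial; }   // get only
   }

   /** Returns the sorted names of properties that have both get and set methods. */
   static List<String> readWriteProperties(final Class<?> type)
   {
      final List<String> names = new ArrayList<String>();
      try
      {
         // Object.class as the stop class excludes the inherited "class" property.
         for (final PropertyDescriptor property
                 : Introspector.getBeanInfo(type, Object.class).getPropertyDescriptors())
         {
            if (property.getReadMethod() != null && property.getWriteMethod() != null)
            {
               names.add(property.getName());
            }
         }
      }
      catch (IntrospectionException introspectionEx)
      {
         throw new IllegalStateException("Unable to introspect " + type, introspectionEx);
      }
      Collections.sort(names);
      return names;
   }

   public static void main(final String[] arguments)
   {
      System.out.println(readWriteProperties(PersonSketch.class));
   }
}
```

Run against PersonSketch, this prints [firstName, lastName]. The middleInitial property is skipped because it has no set method, and identifier is skipped because it has neither method.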

The next screen snapshot demonstrates the simplest use of schemagen. The XML schema file (.xsd) is generated with the default name of schema1.xsd (there is currently no way to control this name directly with schemagen) and is placed in the directory from which the schemagen command is run (the output location can be dictated with the -d option).


The generated XSD is shown next.

schema1.xsd
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<xs:schema version="1.0" xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:complexType name="person">
    <xs:sequence>
      <xs:element name="firstName" type="xs:string" minOccurs="0"/>
      <xs:element name="lastName" type="xs:string" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>

This is pretty convenient, but it is even easier with Groovy. Suppose that one wanted to generate an XSD using schemagen and did not care about or need the original Java class. The following very simple Groovy class could be used. Very little effort is required to write it, but its compiled .class file can be used with schemagen.

package dustin.examples;

public class Person2
{
   String lastName;

   String firstName;

   char middleInitial;

   String identifier;
}

When the above Groovy class is compiled with groovyc, its resulting Person2.class file can be viewed through another useful tool (javap) located in the same directory as schemagen. This is shown in the next screen snapshot. The most important observation is that get/set methods have been automatically generated by Groovy.


When the groovyc-generated .class file is run through schemagen, the XSD is generated and is shown next.

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<xs:schema version="1.0" xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:complexType name="person2">
    <xs:sequence>
      <xs:element name="firstName" type="xs:string" minOccurs="0"/>
      <xs:element name="identifier" type="xs:string" minOccurs="0"/>
      <xs:element name="lastName" type="xs:string" minOccurs="0"/>
      <xs:element name="middleInitial" type="xs:unsignedShort"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>

Because I did not explicitly state that Groovy's automatic get/set methods should not be applied, all attributes are represented in the XML. Very little Groovy, but XSD nonetheless.
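One detail in the XSD above is worth a second look: middleInitial is typed as xs:unsignedShort rather than xs:string. A Java char is an unsigned 16-bit value, so its range (0 to 65535) is the same as that of xs:unsignedShort. The following minimal sketch shows the numeric view of a char. The class name CharAsUnsignedShort and the sample initial 'Q' are made up for illustration, and the code does not invoke JAXB or schemagen.

```java
class CharAsUnsignedShort
{
   /** Returns the unsigned 16-bit numeric value of the provided character. */
   static int numericValue(final char character)
   {
      return character;   // widening conversion; the result is never negative
   }

   public static void main(final String[] arguments)
   {
      System.out.println(numericValue(Character.MIN_VALUE));   // 0
      System.out.println(numericValue(Character.MAX_VALUE));   // 65535
      System.out.println(numericValue('Q'));                    // 81
   }
}
```

Because char is a primitive and cannot be null, that is presumably also why the middleInitial element lacks the minOccurs="0" that the String elements carry.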

It is interesting to see what happens when the attributes of the Groovy class are untyped. The next Groovy class listing does not explicitly type the class attributes.

package dustin.examples;

public class Person2
{
   def lastName;

   def firstName;

   def middleInitial;

   def identifier;
}

When schemagen is run against the above class with untyped attributes, the output XSD looks like this:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<xs:schema version="1.0" xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:complexType name="person2">
    <xs:sequence>
      <xs:element name="firstName" type="xs:anyType" minOccurs="0"/>
      <xs:element name="identifier" type="xs:anyType" minOccurs="0"/>
      <xs:element name="lastName" type="xs:anyType" minOccurs="0"/>
      <xs:element name="middleInitial" type="xs:anyType" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>

Not surprisingly, the Groovy class with the untyped attributes leads to an XSD with elements of anyType. It is remarkably easy to generate a schema with schemagen from a Groovy class, but what if I don't want an attribute of the class to be part of the generated schema? Explicitly specifying an attribute as private tells Groovy not to generate get/set methods for it automatically, and hence schemagen will not generate XSD elements for those attributes. The next Groovy class shows two attributes explicitly defined as private, followed by the XSD that results from running schemagen against the compiled Groovy class.

package dustin.examples;

public class Person2
{
   String lastName;

   String firstName;

   /** private modifier prevents auto Groovy set/get methods */
   private String middleInitial;

   /** private modifier prevents auto Groovy set/get methods */
   private String identifier;
}

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<xs:schema version="1.0" xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:complexType name="person2">
    <xs:sequence>
      <xs:element name="firstName" type="xs:string" minOccurs="0"/>
      <xs:element name="lastName" type="xs:string" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>

Groovy makes it really easy to generate an XSD. The Groovy code required to do so is barely more than a list of attributes and their data types.


Conclusion

The schemagen tool is highly useful. It is most commonly used in conjunction with web services and with JAXB, but I have found several instances where I have needed to create a "quick and dirty" XSD file for a variety of purposes. Taking advantage of Groovy's automatically generated set/get methods and other Groovy conciseness makes it really easy to generate a simple XSD.