
Monday, February 11, 2008

Using OpenXML4J to Access Office Open XML Parts

In a previous blog entry, I wrote about accessing Office 2007 file properties using OpenXML4J. In this blog entry, I intend to demonstrate how easy it is to access the parts that make up an Office Open XML document in Java using OpenXML4J.

The code that uses OpenXML4J to list the parts of an Office 2007 document is straightforward and is shown next.


package marx.openxml4j;

import java.util.List;

import org.openxml4j.exceptions.InvalidFormatException;
import org.openxml4j.opc.Package;
import org.openxml4j.opc.PackageAccess;
import org.openxml4j.opc.PackagePart;

/**
 * Example demonstrating how to use OpenXML4J to extract parts from Office
 * 2007 formatted files (Excel, Word, and PowerPoint).
 *
 * For other examples and for additional background on the Office Open XML
 * format see:
 * <ul>
 * <li><a href="http://www.openxml4j.org/Documentation/Tutorials/OPCSamples.html"
 *        target="_blank">OpenXML4J Tutorial</a></li>
 * <li><a href="http://openxmldeveloper.org/articles/OpenXMLandJava.aspx"
 *        target="_blank">OpenXML and Java</a></li>
 * <li><a href="http://www.ecma-international.org/news/TC45_current_work/OpenXML%20White%20Paper.pdf"
 *        target="_blank">Office Open XML Overview</a></li>
 * </ul>
 */
public class OpenXML4JExample
{
   /**
    * Display package parts that make up the provided Office 2007 compatible file.
    *
    * @param aFilePathAndName Path and name of Office 2007 compatible file.
    */
   public void displayPackageParts(final String aFilePathAndName)
   {
      try
      {
         // Open the package read-only and list every part it contains.
         Package pkg = Package.open(aFilePathAndName, PackageAccess.READ);
         List<PackagePart> parts = pkg.getParts();
         for ( PackagePart packagePart : parts )
         {
            System.out.println(" PackagePart: " + packagePart.toString() );
         }
      }
      catch (InvalidFormatException invalidFormatEx)
      {
         System.err.println("Invalid format: " + invalidFormatEx.getMessage());
      }
   }

   /**
    * Print a separator to stdout to delineate example file being run.
    *
    * @param aSeparatorTitle Title to place in separator header.
    */
   public static void printSeparatorHeader(final String aSeparatorTitle)
   {
      System.out.println(
         "-------------------------------------------------------------------");
      System.out.println(
         "-- " + aSeparatorTitle );
      System.out.println(
         "-------------------------------------------------------------------");
   }

   /**
    * Run test showing how to list the package parts of a PowerPoint
    * presentation, a Word document, and an Excel spreadsheet in Office 2007
    * format using the OpenXML4J library (currently in Alpha).
    *
    * @param aCommandLineArgs Command-line arguments.
    */
   public static void main(final String[] aCommandLineArgs)
   {
      String officeFilePathAndName;
      OpenXML4JExample me = new OpenXML4JExample();

      // Example Microsoft Office 2007 PowerPoint presentation
      officeFilePathAndName = "C:\\sample2007\\marx-poi.pptx";
      printSeparatorHeader("PowerPoint Example: " + officeFilePathAndName);
      me.displayPackageParts(officeFilePathAndName);

      // Example Microsoft Office 2007 Word document
      officeFilePathAndName = "C:\\sample2007\\marx-poi.docx";
      printSeparatorHeader("Word Example: " + officeFilePathAndName);
      me.displayPackageParts(officeFilePathAndName);

      // Example Microsoft Office 2007 Excel spreadsheet
      officeFilePathAndName = "C:\\sample2007\\2008Conferences.xlsx";
      printSeparatorHeader("Excel Example: " + officeFilePathAndName);
      me.displayPackageParts(officeFilePathAndName);
   }
}


The results from running this code on the files hard-coded in the source code are shown next:


-------------------------------------------------------------------
-- PowerPoint Example: C:\sample2007\marx-poi.pptx
-------------------------------------------------------------------
PackagePart: Name: /_rels/.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /docProps/app.xml - Content Type: application/vnd.openxmlformats-officedocument.extended-properties+xml
PackagePart: Name: /docProps/core.xml - Content Type: application/vnd.openxmlformats-package.core-properties+xml
PackagePart: Name: /docProps/thumbnail.wmf - Content Type: image/x-wmf
PackagePart: Name: /ppt/_rels/presentation.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/media/image1.jpeg - Content Type: image/jpeg
PackagePart: Name: /ppt/media/image2.jpeg - Content Type: image/jpeg
PackagePart: Name: /ppt/media/image3.jpeg - Content Type: image/jpeg
PackagePart: Name: /ppt/media/image4.jpeg - Content Type: image/jpeg
PackagePart: Name: /ppt/media/image5.png - Content Type: image/png
PackagePart: Name: /ppt/media/image6.png - Content Type: image/png
PackagePart: Name: /ppt/media/image7.png - Content Type: image/png
PackagePart: Name: /ppt/media/image8.wmf - Content Type: image/x-wmf
PackagePart: Name: /ppt/notesMasters/_rels/notesMaster1.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/notesMasters/notesMaster1.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.notesMaster+xml
PackagePart: Name: /ppt/notesSlides/_rels/notesSlide1.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/notesSlides/_rels/notesSlide10.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/notesSlides/_rels/notesSlide11.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/notesSlides/_rels/notesSlide12.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/notesSlides/_rels/notesSlide13.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/notesSlides/_rels/notesSlide14.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/notesSlides/_rels/notesSlide15.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/notesSlides/_rels/notesSlide16.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/notesSlides/_rels/notesSlide17.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/notesSlides/_rels/notesSlide18.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/notesSlides/_rels/notesSlide19.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/notesSlides/_rels/notesSlide2.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/notesSlides/_rels/notesSlide20.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/notesSlides/_rels/notesSlide21.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/notesSlides/_rels/notesSlide22.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/notesSlides/_rels/notesSlide23.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/notesSlides/_rels/notesSlide24.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/notesSlides/_rels/notesSlide25.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/notesSlides/_rels/notesSlide26.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/notesSlides/_rels/notesSlide27.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/notesSlides/_rels/notesSlide28.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/notesSlides/_rels/notesSlide29.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/notesSlides/_rels/notesSlide3.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/notesSlides/_rels/notesSlide30.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/notesSlides/_rels/notesSlide31.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/notesSlides/_rels/notesSlide32.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/notesSlides/_rels/notesSlide33.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/notesSlides/_rels/notesSlide34.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/notesSlides/_rels/notesSlide35.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/notesSlides/_rels/notesSlide36.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/notesSlides/_rels/notesSlide37.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/notesSlides/_rels/notesSlide4.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/notesSlides/_rels/notesSlide5.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/notesSlides/_rels/notesSlide6.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/notesSlides/_rels/notesSlide7.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/notesSlides/_rels/notesSlide8.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/notesSlides/_rels/notesSlide9.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/notesSlides/notesSlide1.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.notesSlide+xml
PackagePart: Name: /ppt/notesSlides/notesSlide10.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.notesSlide+xml
PackagePart: Name: /ppt/notesSlides/notesSlide11.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.notesSlide+xml
PackagePart: Name: /ppt/notesSlides/notesSlide12.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.notesSlide+xml
PackagePart: Name: /ppt/notesSlides/notesSlide13.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.notesSlide+xml
PackagePart: Name: /ppt/notesSlides/notesSlide14.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.notesSlide+xml
PackagePart: Name: /ppt/notesSlides/notesSlide15.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.notesSlide+xml
PackagePart: Name: /ppt/notesSlides/notesSlide16.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.notesSlide+xml
PackagePart: Name: /ppt/notesSlides/notesSlide17.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.notesSlide+xml
PackagePart: Name: /ppt/notesSlides/notesSlide18.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.notesSlide+xml
PackagePart: Name: /ppt/notesSlides/notesSlide19.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.notesSlide+xml
PackagePart: Name: /ppt/notesSlides/notesSlide2.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.notesSlide+xml
PackagePart: Name: /ppt/notesSlides/notesSlide20.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.notesSlide+xml
PackagePart: Name: /ppt/notesSlides/notesSlide21.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.notesSlide+xml
PackagePart: Name: /ppt/notesSlides/notesSlide22.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.notesSlide+xml
PackagePart: Name: /ppt/notesSlides/notesSlide23.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.notesSlide+xml
PackagePart: Name: /ppt/notesSlides/notesSlide24.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.notesSlide+xml
PackagePart: Name: /ppt/notesSlides/notesSlide25.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.notesSlide+xml
PackagePart: Name: /ppt/notesSlides/notesSlide26.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.notesSlide+xml
PackagePart: Name: /ppt/notesSlides/notesSlide27.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.notesSlide+xml
PackagePart: Name: /ppt/notesSlides/notesSlide28.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.notesSlide+xml
PackagePart: Name: /ppt/notesSlides/notesSlide29.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.notesSlide+xml
PackagePart: Name: /ppt/notesSlides/notesSlide3.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.notesSlide+xml
PackagePart: Name: /ppt/notesSlides/notesSlide30.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.notesSlide+xml
PackagePart: Name: /ppt/notesSlides/notesSlide31.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.notesSlide+xml
PackagePart: Name: /ppt/notesSlides/notesSlide32.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.notesSlide+xml
PackagePart: Name: /ppt/notesSlides/notesSlide33.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.notesSlide+xml
PackagePart: Name: /ppt/notesSlides/notesSlide34.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.notesSlide+xml
PackagePart: Name: /ppt/notesSlides/notesSlide35.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.notesSlide+xml
PackagePart: Name: /ppt/notesSlides/notesSlide36.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.notesSlide+xml
PackagePart: Name: /ppt/notesSlides/notesSlide37.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.notesSlide+xml
PackagePart: Name: /ppt/notesSlides/notesSlide4.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.notesSlide+xml
PackagePart: Name: /ppt/notesSlides/notesSlide5.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.notesSlide+xml
PackagePart: Name: /ppt/notesSlides/notesSlide6.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.notesSlide+xml
PackagePart: Name: /ppt/notesSlides/notesSlide7.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.notesSlide+xml
PackagePart: Name: /ppt/notesSlides/notesSlide8.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.notesSlide+xml
PackagePart: Name: /ppt/notesSlides/notesSlide9.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.notesSlide+xml
PackagePart: Name: /ppt/presentation.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.presentation.main+xml
PackagePart: Name: /ppt/presProps.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.presProps+xml
PackagePart: Name: /ppt/slideLayouts/_rels/slideLayout1.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/slideLayouts/_rels/slideLayout10.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/slideLayouts/_rels/slideLayout11.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/slideLayouts/_rels/slideLayout12.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/slideLayouts/_rels/slideLayout2.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/slideLayouts/_rels/slideLayout3.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/slideLayouts/_rels/slideLayout4.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/slideLayouts/_rels/slideLayout5.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/slideLayouts/_rels/slideLayout6.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/slideLayouts/_rels/slideLayout7.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/slideLayouts/_rels/slideLayout8.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/slideLayouts/_rels/slideLayout9.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/slideLayouts/slideLayout1.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.slideLayout+xml
PackagePart: Name: /ppt/slideLayouts/slideLayout10.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.slideLayout+xml
PackagePart: Name: /ppt/slideLayouts/slideLayout11.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.slideLayout+xml
PackagePart: Name: /ppt/slideLayouts/slideLayout12.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.slideLayout+xml
PackagePart: Name: /ppt/slideLayouts/slideLayout2.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.slideLayout+xml
PackagePart: Name: /ppt/slideLayouts/slideLayout3.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.slideLayout+xml
PackagePart: Name: /ppt/slideLayouts/slideLayout4.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.slideLayout+xml
PackagePart: Name: /ppt/slideLayouts/slideLayout5.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.slideLayout+xml
PackagePart: Name: /ppt/slideLayouts/slideLayout6.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.slideLayout+xml
PackagePart: Name: /ppt/slideLayouts/slideLayout7.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.slideLayout+xml
PackagePart: Name: /ppt/slideLayouts/slideLayout8.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.slideLayout+xml
PackagePart: Name: /ppt/slideLayouts/slideLayout9.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.slideLayout+xml
PackagePart: Name: /ppt/slideMasters/_rels/slideMaster1.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/slideMasters/slideMaster1.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.slideMaster+xml
PackagePart: Name: /ppt/slides/_rels/slide1.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/slides/_rels/slide10.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/slides/_rels/slide11.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/slides/_rels/slide12.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/slides/_rels/slide13.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/slides/_rels/slide14.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/slides/_rels/slide15.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/slides/_rels/slide16.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/slides/_rels/slide17.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/slides/_rels/slide18.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/slides/_rels/slide19.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/slides/_rels/slide2.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/slides/_rels/slide20.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/slides/_rels/slide21.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/slides/_rels/slide22.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/slides/_rels/slide23.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/slides/_rels/slide24.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/slides/_rels/slide25.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/slides/_rels/slide26.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/slides/_rels/slide27.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/slides/_rels/slide28.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/slides/_rels/slide29.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/slides/_rels/slide3.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/slides/_rels/slide30.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/slides/_rels/slide31.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/slides/_rels/slide32.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/slides/_rels/slide33.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/slides/_rels/slide34.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/slides/_rels/slide35.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/slides/_rels/slide36.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/slides/_rels/slide37.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/slides/_rels/slide4.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/slides/_rels/slide5.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/slides/_rels/slide6.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/slides/_rels/slide7.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/slides/_rels/slide8.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/slides/_rels/slide9.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /ppt/slides/slide1.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.slide+xml
PackagePart: Name: /ppt/slides/slide10.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.slide+xml
PackagePart: Name: /ppt/slides/slide11.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.slide+xml
PackagePart: Name: /ppt/slides/slide12.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.slide+xml
PackagePart: Name: /ppt/slides/slide13.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.slide+xml
PackagePart: Name: /ppt/slides/slide14.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.slide+xml
PackagePart: Name: /ppt/slides/slide15.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.slide+xml
PackagePart: Name: /ppt/slides/slide16.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.slide+xml
PackagePart: Name: /ppt/slides/slide17.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.slide+xml
PackagePart: Name: /ppt/slides/slide18.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.slide+xml
PackagePart: Name: /ppt/slides/slide19.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.slide+xml
PackagePart: Name: /ppt/slides/slide2.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.slide+xml
PackagePart: Name: /ppt/slides/slide20.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.slide+xml
PackagePart: Name: /ppt/slides/slide21.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.slide+xml
PackagePart: Name: /ppt/slides/slide22.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.slide+xml
PackagePart: Name: /ppt/slides/slide23.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.slide+xml
PackagePart: Name: /ppt/slides/slide24.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.slide+xml
PackagePart: Name: /ppt/slides/slide25.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.slide+xml
PackagePart: Name: /ppt/slides/slide26.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.slide+xml
PackagePart: Name: /ppt/slides/slide27.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.slide+xml
PackagePart: Name: /ppt/slides/slide28.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.slide+xml
PackagePart: Name: /ppt/slides/slide29.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.slide+xml
PackagePart: Name: /ppt/slides/slide3.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.slide+xml
PackagePart: Name: /ppt/slides/slide30.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.slide+xml
PackagePart: Name: /ppt/slides/slide31.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.slide+xml
PackagePart: Name: /ppt/slides/slide32.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.slide+xml
PackagePart: Name: /ppt/slides/slide33.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.slide+xml
PackagePart: Name: /ppt/slides/slide34.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.slide+xml
PackagePart: Name: /ppt/slides/slide35.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.slide+xml
PackagePart: Name: /ppt/slides/slide36.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.slide+xml
PackagePart: Name: /ppt/slides/slide37.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.slide+xml
PackagePart: Name: /ppt/slides/slide4.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.slide+xml
PackagePart: Name: /ppt/slides/slide5.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.slide+xml
PackagePart: Name: /ppt/slides/slide6.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.slide+xml
PackagePart: Name: /ppt/slides/slide7.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.slide+xml
PackagePart: Name: /ppt/slides/slide8.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.slide+xml
PackagePart: Name: /ppt/slides/slide9.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.slide+xml
PackagePart: Name: /ppt/tableStyles.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.tableStyles+xml
PackagePart: Name: /ppt/theme/theme1.xml - Content Type: application/vnd.openxmlformats-officedocument.theme+xml
PackagePart: Name: /ppt/theme/theme2.xml - Content Type: application/vnd.openxmlformats-officedocument.theme+xml
PackagePart: Name: /ppt/viewProps.xml - Content Type: application/vnd.openxmlformats-officedocument.presentationml.viewProps+xml
-------------------------------------------------------------------
-- Word Example: C:\sample2007\marx-poi.docx
-------------------------------------------------------------------
PackagePart: Name: /_rels/.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /docProps/app.xml - Content Type: application/vnd.openxmlformats-officedocument.extended-properties+xml
PackagePart: Name: /docProps/core.xml - Content Type: application/vnd.openxmlformats-package.core-properties+xml
PackagePart: Name: /word/_rels/document.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /word/_rels/footer1.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /word/_rels/footer2.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /word/_rels/settings.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /word/document.xml - Content Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml
PackagePart: Name: /word/endnotes.xml - Content Type: application/vnd.openxmlformats-officedocument.wordprocessingml.endnotes+xml
PackagePart: Name: /word/fontTable.xml - Content Type: application/vnd.openxmlformats-officedocument.wordprocessingml.fontTable+xml
PackagePart: Name: /word/footer1.xml - Content Type: application/vnd.openxmlformats-officedocument.wordprocessingml.footer+xml
PackagePart: Name: /word/footer2.xml - Content Type: application/vnd.openxmlformats-officedocument.wordprocessingml.footer+xml
PackagePart: Name: /word/footnotes.xml - Content Type: application/vnd.openxmlformats-officedocument.wordprocessingml.footnotes+xml
PackagePart: Name: /word/header1.xml - Content Type: application/vnd.openxmlformats-officedocument.wordprocessingml.header+xml
PackagePart: Name: /word/media/image1.png - Content Type: image/png
PackagePart: Name: /word/media/image2.png - Content Type: image/png
PackagePart: Name: /word/numbering.xml - Content Type: application/vnd.openxmlformats-officedocument.wordprocessingml.numbering+xml
PackagePart: Name: /word/settings.xml - Content Type: application/vnd.openxmlformats-officedocument.wordprocessingml.settings+xml
PackagePart: Name: /word/styles.xml - Content Type: application/vnd.openxmlformats-officedocument.wordprocessingml.styles+xml
PackagePart: Name: /word/theme/theme1.xml - Content Type: application/vnd.openxmlformats-officedocument.theme+xml
PackagePart: Name: /word/webSettings.xml - Content Type: application/vnd.openxmlformats-officedocument.wordprocessingml.webSettings+xml
-------------------------------------------------------------------
-- Excel Example: C:\sample2007\2008Conferences.xlsx
-------------------------------------------------------------------
PackagePart: Name: /_rels/.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /docProps/app.xml - Content Type: application/vnd.openxmlformats-officedocument.extended-properties+xml
PackagePart: Name: /docProps/core.xml - Content Type: application/vnd.openxmlformats-package.core-properties+xml
PackagePart: Name: /xl/_rels/workbook.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /xl/comments1.xml - Content Type: application/vnd.openxmlformats-officedocument.spreadsheetml.comments+xml
PackagePart: Name: /xl/drawings/vmlDrawing1.vml - Content Type: application/vnd.openxmlformats-officedocument.vmldrawing
PackagePart: Name: /xl/printerSettings/printerSettings1.bin - Content Type: application/vnd.openxmlformats-officedocument.spreadsheetml.printersettings
PackagePart: Name: /xl/sharedStrings.xml - Content Type: application/vnd.openxmlformats-officedocument.spreadsheetml.sharedStrings+xml
PackagePart: Name: /xl/styles.xml - Content Type: application/vnd.openxmlformats-officedocument.spreadsheetml.styles+xml
PackagePart: Name: /xl/theme/theme1.xml - Content Type: application/vnd.openxmlformats-officedocument.theme+xml
PackagePart: Name: /xl/workbook.xml - Content Type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet.main+xml
PackagePart: Name: /xl/worksheets/_rels/sheet1.xml.rels - Content Type: application/vnd.openxmlformats-package.relationships+xml
PackagePart: Name: /xl/worksheets/sheet1.xml - Content Type: application/vnd.openxmlformats-officedocument.spreadsheetml.worksheet+xml
PackagePart: Name: /xl/worksheets/sheet2.xml - Content Type: application/vnd.openxmlformats-officedocument.spreadsheetml.worksheet+xml
PackagePart: Name: /xl/worksheets/sheet3.xml - Content Type: application/vnd.openxmlformats-officedocument.spreadsheetml.worksheet+xml


The parts returned by OpenXML4J closely match the entries of the same Office 2007 Excel document when that file's contents are listed with the command jar tvf, as shown in the next image (click on image to see larger version).
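For readers who want the jar tvf view without leaving Java, here is a minimal sketch that lists the entries of an Office Open XML file using only java.util.zip from the standard library. The class and method names (ZipEntryLister, listEntryNames) are my own for illustration and are not part of OpenXML4J. The sketch assumes Java 7 or later for try-with-resources.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

/**
 * Illustrative helper (hypothetical name) that lists the raw ZIP entries of an
 * Office Open XML file, similar to what "jar tvf" prints.
 */
final class ZipEntryLister
{
   /**
    * Return the names of all entries in the provided ZIP-based file, sorted
    * alphabetically.
    *
    * @param aFilePathAndName Path and name of Office 2007 compatible file.
    */
   static List<String> listEntryNames(final String aFilePathAndName)
      throws IOException
   {
      final List<String> names = new ArrayList<>();
      // try-with-resources closes the ZIP file even if reading fails
      try (ZipFile zipFile = new ZipFile(aFilePathAndName))
      {
         for (ZipEntry entry : Collections.list(zipFile.entries()))
         {
            names.add(entry.getName());
         }
      }
      Collections.sort(names);
      return names;
   }

   /** Print each entry name of the file given as the first argument. */
   public static void main(final String[] aCommandLineArgs) throws IOException
   {
      for (String name : listEntryNames(aCommandLineArgs[0]))
      {
         System.out.println(name);
      }
   }
}
```

One difference is worth noting when comparing the two views. The raw ZIP listing includes [Content_Types].xml, which does not appear among the parts in the OpenXML4J output above, and OpenXML4J reports each part's content type while the ZIP listing carries only names, sizes, and timestamps.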



OpenXML4J provides an easy-to-use Java API for obtaining the parts that constitute an Office 2007 file.

Monday, February 4, 2008

Will Apache POI Support Office Open XML Format?

As I recorded in my last blog entry, OpenXML4J is a Java library under development to open, access, and modify Office 2007 applications' data files in Office Open XML format. I also mentioned that Apache POI has provided useful Excel manipulation support in Java for versions of Excel prior to Excel 2007, but its HSSF library does not support Office Open XML formats (ECMA-376).

It looks like the Apache POI team has started working on support for Office 2007 applications. This work is often referenced as OOXML support for Apache POI. It sounds like they have a nice initial design involving a single interface with two different implementations, one for Office 2007 and one for Office applications pre-dating Office 2007. Another potentially promising development is the talk of the Apache POI team and the OpenXML4J team working together. It would be nice to have one area of Java where we don't have multiple and completely disparate APIs and frameworks for doing the same thing.
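To make the "single interface with two implementations" idea concrete, here is a small sketch of how such a design might look. Every name in it (SpreadsheetDocument, BinarySpreadsheetDocument, OoxmlSpreadsheetDocument, SpreadsheetDocuments) is hypothetical and chosen only for illustration; this is not the Apache POI team's actual design or API, and choosing the implementation by file extension is just a simplifying assumption.

```java
import java.util.Locale;

/** Hypothetical common interface; not Apache POI's actual API. */
interface SpreadsheetDocument
{
   /** Human-readable name of the underlying file format. */
   String formatName();
}

/** Hypothetical implementation for pre-2007 binary files (.xls). */
final class BinarySpreadsheetDocument implements SpreadsheetDocument
{
   @Override
   public String formatName()
   {
      return "Excel 97-2003 binary";
   }
}

/** Hypothetical implementation for Office Open XML files (.xlsx). */
final class OoxmlSpreadsheetDocument implements SpreadsheetDocument
{
   @Override
   public String formatName()
   {
      return "Office Open XML";
   }
}

/** Hypothetical factory that hides the choice of implementation from callers. */
final class SpreadsheetDocuments
{
   /**
    * Pick an implementation based on the file extension (an assumption made
    * for brevity; a real library could inspect the file contents instead).
    */
   static SpreadsheetDocument forFile(final String aFileName)
   {
      final String lowerCaseName = aFileName.toLowerCase(Locale.ROOT);
      if (lowerCaseName.endsWith(".xlsx"))
      {
         return new OoxmlSpreadsheetDocument();
      }
      if (lowerCaseName.endsWith(".xls"))
      {
         return new BinarySpreadsheetDocument();
      }
      throw new IllegalArgumentException(
         "Unsupported spreadsheet file: " + aFileName);
   }
}
```

The appeal of this arrangement is that client code depends only on the interface, so the same calling code could work with either file format.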

The article Using Java to Crack Office 2007 shows how one might use Java directly to access Office 2007 files. While Office Open XML format is significantly more approachable than the previous formats used in Office applications, this article demonstrates that it would still be nice to have a library to abstract away some of the details. I hope that future versions of Apache POI or OpenXML4J, or preferably both behind the same interchangeable API, will offer this in the near future.

Here are some other interesting web links to additional information regarding Apache POI support for Microsoft Office 2007 applications.

Design of OOXML Support Code (16 January 2008)

Initial OOXML Support (30 December 2007)

Using POI with Excel 2007 (30 May 2007)

Fun with Interfaces (Problems in pre-J2SE 5)

Saturday, February 2, 2008

OpenXML4J: Reading Office 2007 File Properties

Happy Groundhog Day (movie)!

Apache POI is an excellent library for reading, writing, and manipulating Excel spreadsheets. Apache POI's HSSF ("Horrible Spreadsheet Format") support is particularly good and useful. The most significant disadvantage of Apache POI is that it does not support the new format used with Microsoft Office 2007 applications.

For a while, Office 2007's new file format will limit the usefulness of Apache POI less than it otherwise might, for the following reasons:

  • Not everyone has migrated to Office 2007 and so there are still many earlier versions of Office products in existence and new files are still being created in the older POI-supported format.

  • Those who have migrated to Office 2007 can still save their files in the old format when necessary to share with others or for POI's use. Likewise, they can also read the old format and thus read POI-generated files.

  • Files that are created with POI can be opened up in an earlier version of Office products and converted to Office 2007 format using the freely provided converter.



However, as time goes on, the ability to read, write, and manipulate Office 2007 files in Java will become more desirable and eventually a necessity. While the new format used in Microsoft Office 2007 applications (Office Open XML - ECMA-376) is essentially a set of XML files packaged in a zip file, any Java application accessing these file types would need to extract or build the zip file and would also need to understand the underlying XML format. To answer this need, work has begun on an open source library called OpenXML4J. This library recently transitioned from pre-Alpha to Alpha status (on 21 January 2008). The license for this library is currently your choice of BSD license or Apache v2 license.
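To give a feel for the do-it-yourself route, here is a rough sketch that reads the document title using only the JDK: it opens the zip file, pulls out the core properties part, and parses the XML. It assumes the core properties live at the conventional location docProps/core.xml and that the title is stored in a dc:title element (Dublin Core namespace). A more robust reader would locate the part through the package relationships rather than a fixed path. The class name CoreTitleReader is hypothetical, and the sketch assumes Java 7 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

/** Illustrative reader (hypothetical name) that uses no third-party library. */
final class CoreTitleReader
{
   /** Assumed location of the core properties part inside the zip file. */
   private static final String CORE_PART = "docProps/core.xml";

   /** Dublin Core namespace used by the title element. */
   private static final String DC_NAMESPACE = "http://purl.org/dc/elements/1.1/";

   /**
    * Return the document title, or null if the file has no core properties
    * part at the assumed location or no title element.
    *
    * @param aFilePathAndName Path and name of Office 2007 compatible file.
    */
   static String readTitle(final String aFilePathAndName)
      throws IOException, ParserConfigurationException, SAXException
   {
      try (ZipFile zipFile = new ZipFile(aFilePathAndName))
      {
         final ZipEntry coreEntry = zipFile.getEntry(CORE_PART);
         if (coreEntry == null)
         {
            return null;
         }
         // Namespace awareness is required to look up dc:title by namespace
         final DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
         factory.setNamespaceAware(true);
         final DocumentBuilder builder = factory.newDocumentBuilder();
         try (InputStream in = zipFile.getInputStream(coreEntry))
         {
            final Document document = builder.parse(in);
            final NodeList titles =
               document.getElementsByTagNameNS(DC_NAMESPACE, "title");
            return titles.getLength() > 0
               ? titles.item(0).getTextContent()
               : null;
         }
      }
   }
}
```

Even this small task needs zip handling, XML parsing, and knowledge of part names and namespaces, which is the kind of detail a dedicated library can hide.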

In this blog entry, I will focus on using OpenXML4J to extract properties from Office 2007 application files. The code sample shown below extracts properties from three types of Office 2007 files (PowerPoint, Word, and Excel files). As a lazy developer, I chose to use some existing files for this. I use Office 2003, but I was able to save these three files in the Office Open XML format in PowerPoint, Word, and Excel using the downloaded Microsoft conversion pack. For the PowerPoint and Word examples, I saved my presentation and paper (respectively) for RMOUG Training Days 2008 ("Excel with Apache POI and Oracle Database"). For the Excel example, I am using a simple spreadsheet containing important dates for submitting papers, submitting presentations, and actually presenting the materials for various conferences this spring. For this example, the contents of the files matter less than the Microsoft Office properties associated with each.

The next three screen shots show the properties window for the three application data files. The contents of these windows are important because they are what we will be extracting directly from the file using Java and OpenXML4J. As with all images in this blog, please click on any image to see a larger version of the image.

PowerPoint: marx-poi.pptx Properties



Word: marx-poi.docx Properties



Excel: 2008Conferences.xlsx Properties



Before moving on to the example that demonstrates use of OpenXML4J to extract properties from Office 2007 application data files, I will first show what the new Office Open XML files contain. In the following three examples, I have used Java's jar command (jar tvf) to display the contents of these files.

Contents of marx-poi.pptx

13756 Tue Jan 01 00:00:00 MST 1980 [Content_Types].xml
737 Tue Jan 01 00:00:00 MST 1980 _rels/.rels
598 Tue Jan 01 00:00:00 MST 1980 ppt/slides/_rels/slide11.xml.rels
463 Tue Jan 01 00:00:00 MST 1980 ppt/slides/_rels/slide12.xml.rels
598 Tue Jan 01 00:00:00 MST 1980 ppt/slides/_rels/slide13.xml.rels
463 Tue Jan 01 00:00:00 MST 1980 ppt/slides/_rels/slide15.xml.rels
463 Tue Jan 01 00:00:00 MST 1980 ppt/slides/_rels/slide10.xml.rels
598 Tue Jan 01 00:00:00 MST 1980 ppt/slides/_rels/slide16.xml.rels
463 Tue Jan 01 00:00:00 MST 1980 ppt/slides/_rels/slide17.xml.rels
463 Tue Jan 01 00:00:00 MST 1980 ppt/slides/_rels/slide14.xml.rels
463 Tue Jan 01 00:00:00 MST 1980 ppt/slides/_rels/slide8.xml.rels
463 Tue Jan 01 00:00:00 MST 1980 ppt/slides/_rels/slide18.xml.rels
909 Tue Jan 01 00:00:00 MST 1980 ppt/slides/_rels/slide1.xml.rels
462 Tue Jan 01 00:00:00 MST 1980 ppt/slides/_rels/slide2.xml.rels
462 Tue Jan 01 00:00:00 MST 1980 ppt/slides/_rels/slide3.xml.rels
598 Tue Jan 01 00:00:00 MST 1980 ppt/slides/_rels/slide4.xml.rels
462 Tue Jan 01 00:00:00 MST 1980 ppt/slides/_rels/slide5.xml.rels
462 Tue Jan 01 00:00:00 MST 1980 ppt/slides/_rels/slide6.xml.rels
641 Tue Jan 01 00:00:00 MST 1980 ppt/slides/_rels/slide7.xml.rels
462 Tue Jan 01 00:00:00 MST 1980 ppt/slides/_rels/slide9.xml.rels
463 Tue Jan 01 00:00:00 MST 1980 ppt/slides/_rels/slide20.xml.rels
5977 Tue Jan 01 00:00:00 MST 1980 ppt/_rels/presentation.xml.rels
829 Tue Jan 01 00:00:00 MST 1980 ppt/slides/_rels/slide31.xml.rels
839 Tue Jan 01 00:00:00 MST 1980 ppt/slides/_rels/slide32.xml.rels
882 Tue Jan 01 00:00:00 MST 1980 ppt/slides/_rels/slide33.xml.rels
1263 Tue Jan 01 00:00:00 MST 1980 ppt/slides/_rels/slide34.xml.rels
1080 Tue Jan 01 00:00:00 MST 1980 ppt/slides/_rels/slide35.xml.rels
893 Tue Jan 01 00:00:00 MST 1980 ppt/slides/_rels/slide36.xml.rels
1237 Tue Jan 01 00:00:00 MST 1980 ppt/slides/_rels/slide37.xml.rels
463 Tue Jan 01 00:00:00 MST 1980 ppt/slides/_rels/slide30.xml.rels
1025 Tue Jan 01 00:00:00 MST 1980 ppt/slides/_rels/slide29.xml.rels
951 Tue Jan 01 00:00:00 MST 1980 ppt/slides/_rels/slide28.xml.rels
463 Tue Jan 01 00:00:00 MST 1980 ppt/slides/_rels/slide21.xml.rels
463 Tue Jan 01 00:00:00 MST 1980 ppt/slides/_rels/slide22.xml.rels
463 Tue Jan 01 00:00:00 MST 1980 ppt/slides/_rels/slide23.xml.rels
648 Tue Jan 01 00:00:00 MST 1980 ppt/slides/_rels/slide24.xml.rels
688 Tue Jan 01 00:00:00 MST 1980 ppt/slides/_rels/slide25.xml.rels
659 Tue Jan 01 00:00:00 MST 1980 ppt/slides/_rels/slide26.xml.rels
647 Tue Jan 01 00:00:00 MST 1980 ppt/slides/_rels/slide27.xml.rels
463 Tue Jan 01 00:00:00 MST 1980 ppt/slides/_rels/slide19.xml.rels
4714 Tue Jan 01 00:00:00 MST 1980 ppt/presentation.xml
2750 Tue Jan 01 00:00:00 MST 1980 ppt/slides/slide26.xml
3103 Tue Jan 01 00:00:00 MST 1980 ppt/slides/slide30.xml
2200 Tue Jan 01 00:00:00 MST 1980 ppt/slides/slide16.xml
6513 Tue Jan 01 00:00:00 MST 1980 ppt/slides/slide15.xml
2682 Tue Jan 01 00:00:00 MST 1980 ppt/slides/slide31.xml
5865 Tue Jan 01 00:00:00 MST 1980 ppt/slides/slide14.xml
2206 Tue Jan 01 00:00:00 MST 1980 ppt/slides/slide13.xml
3046 Tue Jan 01 00:00:00 MST 1980 ppt/slides/slide25.xml
13352 Tue Jan 01 00:00:00 MST 1980 ppt/slides/slide12.xml
6083 Tue Jan 01 00:00:00 MST 1980 ppt/slides/slide17.xml
5929 Tue Jan 01 00:00:00 MST 1980 ppt/slides/slide18.xml
4080 Tue Jan 01 00:00:00 MST 1980 ppt/slides/slide29.xml
3007 Tue Jan 01 00:00:00 MST 1980 ppt/slides/slide24.xml
5206 Tue Jan 01 00:00:00 MST 1980 ppt/slides/slide23.xml
4207 Tue Jan 01 00:00:00 MST 1980 ppt/slides/slide27.xml
4579 Tue Jan 01 00:00:00 MST 1980 ppt/slides/slide22.xml
3483 Tue Jan 01 00:00:00 MST 1980 ppt/slides/slide21.xml
4177 Tue Jan 01 00:00:00 MST 1980 ppt/slides/slide28.xml
6277 Tue Jan 01 00:00:00 MST 1980 ppt/slides/slide20.xml
7659 Tue Jan 01 00:00:00 MST 1980 ppt/slides/slide19.xml
7534 Tue Jan 01 00:00:00 MST 1980 ppt/slides/slide11.xml
4180 Tue Jan 01 00:00:00 MST 1980 ppt/slides/slide32.xml
4283 Tue Jan 01 00:00:00 MST 1980 ppt/slides/slide4.xml
3439 Tue Jan 01 00:00:00 MST 1980 ppt/slides/slide35.xml
3378 Tue Jan 01 00:00:00 MST 1980 ppt/slides/slide6.xml
2118 Tue Jan 01 00:00:00 MST 1980 ppt/slides/slide2.xml
3049 Tue Jan 01 00:00:00 MST 1980 ppt/slides/slide33.xml
3335 Tue Jan 01 00:00:00 MST 1980 ppt/slides/slide5.xml
2754 Tue Jan 01 00:00:00 MST 1980 ppt/slides/slide36.xml
3992 Tue Jan 01 00:00:00 MST 1980 ppt/slides/slide37.xml
2806 Tue Jan 01 00:00:00 MST 1980 ppt/slides/slide3.xml
2955 Tue Jan 01 00:00:00 MST 1980 ppt/slides/slide7.xml
3515 Tue Jan 01 00:00:00 MST 1980 ppt/slides/slide9.xml
2836 Tue Jan 01 00:00:00 MST 1980 ppt/slides/slide10.xml
3605 Tue Jan 01 00:00:00 MST 1980 ppt/slides/slide1.xml
35794 Tue Jan 01 00:00:00 MST 1980 ppt/slides/slide8.xml
4031 Tue Jan 01 00:00:00 MST 1980 ppt/slides/slide34.xml
448 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/_rels/notesSlide11.xml.rels
448 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/_rels/notesSlide10.xml.rels
311 Tue Jan 01 00:00:00 MST 1980 ppt/slideLayouts/_rels/slideLayout12.xml.rels
448 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/_rels/notesSlide37.xml.rels
447 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/_rels/notesSlide5.xml.rels
447 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/_rels/notesSlide2.xml.rels
447 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/_rels/notesSlide9.xml.rels
447 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/_rels/notesSlide4.xml.rels
447 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/_rels/notesSlide6.xml.rels
447 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/_rels/notesSlide3.xml.rels
447 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/_rels/notesSlide7.xml.rels
447 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/_rels/notesSlide8.xml.rels
447 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/_rels/notesSlide1.xml.rels
20729 Tue Jan 01 00:00:00 MST 1980 ppt/slideMasters/slideMaster1.xml
448 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/_rels/notesSlide13.xml.rels
448 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/_rels/notesSlide30.xml.rels
448 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/_rels/notesSlide29.xml.rels
448 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/_rels/notesSlide28.xml.rels
448 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/_rels/notesSlide27.xml.rels
448 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/_rels/notesSlide31.xml.rels
448 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/_rels/notesSlide32.xml.rels
448 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/_rels/notesSlide33.xml.rels
448 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/_rels/notesSlide34.xml.rels
448 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/_rels/notesSlide35.xml.rels
448 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/_rels/notesSlide36.xml.rels
448 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/_rels/notesSlide26.xml.rels
448 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/_rels/notesSlide25.xml.rels
448 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/_rels/notesSlide24.xml.rels
448 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/_rels/notesSlide17.xml.rels
448 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/_rels/notesSlide16.xml.rels
448 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/_rels/notesSlide15.xml.rels
448 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/_rels/notesSlide14.xml.rels
448 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/_rels/notesSlide18.xml.rels
448 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/_rels/notesSlide19.xml.rels
448 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/_rels/notesSlide20.xml.rels
448 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/_rels/notesSlide21.xml.rels
448 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/_rels/notesSlide22.xml.rels
448 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/_rels/notesSlide23.xml.rels
448 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/_rels/notesSlide12.xml.rels
311 Tue Jan 01 00:00:00 MST 1980 ppt/slideLayouts/_rels/slideLayout11.xml.rels
311 Tue Jan 01 00:00:00 MST 1980 ppt/slideLayouts/_rels/slideLayout9.xml.rels
1617 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/notesSlide11.xml
1617 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/notesSlide10.xml
1616 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/notesSlide9.xml
2337 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/notesSlide8.xml
1616 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/notesSlide7.xml
1616 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/notesSlide6.xml
1617 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/notesSlide12.xml
1617 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/notesSlide13.xml
1617 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/notesSlide14.xml
1617 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/notesSlide19.xml
1617 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/notesSlide18.xml
1617 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/notesSlide17.xml
1617 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/notesSlide16.xml
1617 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/notesSlide15.xml
1696 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/notesSlide5.xml
1616 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/notesSlide4.xml
1616 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/notesSlide3.xml
1736 Tue Jan 01 00:00:00 MST 1980 ppt/slideLayouts/slideLayout6.xml
6603 Tue Jan 01 00:00:00 MST 1980 ppt/slideLayouts/slideLayout5.xml
4030 Tue Jan 01 00:00:00 MST 1980 ppt/slideLayouts/slideLayout4.xml
3076 Tue Jan 01 00:00:00 MST 1980 ppt/slideLayouts/slideLayout3.xml
2376 Tue Jan 01 00:00:00 MST 1980 ppt/slideLayouts/slideLayout2.xml
7618 Tue Jan 01 00:00:00 MST 1980 ppt/slideLayouts/slideLayout1.xml
1439 Tue Jan 01 00:00:00 MST 1980 ppt/slideLayouts/slideLayout7.xml
4188 Tue Jan 01 00:00:00 MST 1980 ppt/slideLayouts/slideLayout8.xml
4164 Tue Jan 01 00:00:00 MST 1980 ppt/slideLayouts/slideLayout9.xml
1616 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/notesSlide2.xml
1616 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/notesSlide1.xml
2203 Tue Jan 01 00:00:00 MST 1980 ppt/slideLayouts/slideLayout12.xml
2656 Tue Jan 01 00:00:00 MST 1980 ppt/slideLayouts/slideLayout11.xml
2431 Tue Jan 01 00:00:00 MST 1980 ppt/slideLayouts/slideLayout10.xml
1617 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/notesSlide20.xml
311 Tue Jan 01 00:00:00 MST 1980 ppt/slideLayouts/_rels/slideLayout10.xml.rels
1617 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/notesSlide22.xml
2284 Tue Jan 01 00:00:00 MST 1980 ppt/slideMasters/_rels/slideMaster1.xml.rels
1617 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/notesSlide37.xml
1617 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/notesSlide36.xml
1617 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/notesSlide35.xml
1617 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/notesSlide34.xml
447 Tue Jan 01 00:00:00 MST 1980 ppt/slideLayouts/_rels/slideLayout1.xml.rels
311 Tue Jan 01 00:00:00 MST 1980 ppt/slideLayouts/_rels/slideLayout2.xml.rels
311 Tue Jan 01 00:00:00 MST 1980 ppt/slideLayouts/_rels/slideLayout3.xml.rels
311 Tue Jan 01 00:00:00 MST 1980 ppt/slideLayouts/_rels/slideLayout8.xml.rels
311 Tue Jan 01 00:00:00 MST 1980 ppt/slideLayouts/_rels/slideLayout7.xml.rels
311 Tue Jan 01 00:00:00 MST 1980 ppt/slideLayouts/_rels/slideLayout6.xml.rels
311 Tue Jan 01 00:00:00 MST 1980 ppt/slideLayouts/_rels/slideLayout5.xml.rels
311 Tue Jan 01 00:00:00 MST 1980 ppt/slideLayouts/_rels/slideLayout4.xml.rels
1617 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/notesSlide33.xml
1617 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/notesSlide21.xml
1617 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/notesSlide31.xml
1734 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/notesSlide27.xml
1617 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/notesSlide26.xml
1617 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/notesSlide25.xml
1617 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/notesSlide24.xml
1617 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/notesSlide23.xml
1617 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/notesSlide28.xml
2685 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/notesSlide32.xml
1617 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/notesSlide29.xml
1617 Tue Jan 01 00:00:00 MST 1980 ppt/notesSlides/notesSlide30.xml
521299 Tue Jan 01 00:00:00 MST 1980 ppt/media/image2.jpeg
20941 Tue Jan 01 00:00:00 MST 1980 ppt/media/image7.png
1557708 Tue Jan 01 00:00:00 MST 1980 ppt/media/image4.jpeg
114888 Tue Jan 01 00:00:00 MST 1980 ppt/media/image5.png
292 Tue Jan 01 00:00:00 MST 1980 ppt/notesMasters/_rels/notesMaster1.xml.rels
9241 Tue Jan 01 00:00:00 MST 1980 ppt/theme/theme1.xml
69148 Tue Jan 01 00:00:00 MST 1980 ppt/media/image3.jpeg
43096 Tue Jan 01 00:00:00 MST 1980 docProps/thumbnail.wmf
25927 Tue Jan 01 00:00:00 MST 1980 ppt/media/image6.png
6962 Tue Jan 01 00:00:00 MST 1980 ppt/theme/theme2.xml
93711 Tue Jan 01 00:00:00 MST 1980 ppt/media/image1.jpeg
10986 Tue Jan 01 00:00:00 MST 1980 ppt/media/image8.wmf
9070 Tue Jan 01 00:00:00 MST 1980 ppt/notesMasters/notesMaster1.xml
376 Tue Jan 01 00:00:00 MST 1980 ppt/presProps.xml
1045 Tue Jan 01 00:00:00 MST 1980 ppt/viewProps.xml
182 Tue Jan 01 00:00:00 MST 1980 ppt/tableStyles.xml
986 Tue Jan 01 00:00:00 MST 1980 docProps/core.xml
3736 Tue Jan 01 00:00:00 MST 1980 docProps/app.xml


Contents of marx-poi.docx

2143 Tue Jan 01 00:00:00 MST 1980 [Content_Types].xml
590 Tue Jan 01 00:00:00 MST 1980 _rels/.rels
10765 Tue Jan 01 00:00:00 MST 1980 word/_rels/document.xml.rels
135202 Tue Jan 01 00:00:00 MST 1980 word/document.xml
319 Tue Jan 01 00:00:00 MST 1980 word/_rels/footer2.xml.rels
887 Tue Jan 01 00:00:00 MST 1980 word/footer2.xml
939 Tue Jan 01 00:00:00 MST 1980 word/footer1.xml
319 Tue Jan 01 00:00:00 MST 1980 word/_rels/footer1.xml.rels
822 Tue Jan 01 00:00:00 MST 1980 word/header1.xml
946 Tue Jan 01 00:00:00 MST 1980 word/endnotes.xml
952 Tue Jan 01 00:00:00 MST 1980 word/footnotes.xml
2751 Tue Jan 01 00:00:00 MST 1980 word/media/image1.png
6992 Tue Jan 01 00:00:00 MST 1980 word/theme/theme1.xml
122008 Tue Jan 01 00:00:00 MST 1980 word/media/image2.png
356 Tue Jan 01 00:00:00 MST 1980 word/_rels/settings.xml.rels
6519 Tue Jan 01 00:00:00 MST 1980 word/settings.xml
654 Tue Jan 01 00:00:00 MST 1980 word/webSettings.xml
1828 Tue Jan 01 00:00:00 MST 1980 word/fontTable.xml
750 Tue Jan 01 00:00:00 MST 1980 docProps/app.xml
1089 Tue Jan 01 00:00:00 MST 1980 docProps/core.xml
1680 Tue Jan 01 00:00:00 MST 1980 word/numbering.xml
23802 Tue Jan 01 00:00:00 MST 1980 word/styles.xml


Contents of 2008Conferences.xlsx

1780 Tue Jan 01 00:00:00 MST 1980 [Content_Types].xml
588 Tue Jan 01 00:00:00 MST 1980 _rels/.rels
980 Tue Jan 01 00:00:00 MST 1980 xl/_rels/workbook.xml.rels
625 Tue Jan 01 00:00:00 MST 1980 xl/workbook.xml
6995 Tue Jan 01 00:00:00 MST 1980 xl/theme/theme1.xml
605 Tue Jan 01 00:00:00 MST 1980 xl/worksheets/_rels/sheet1.xml.rels
518 Tue Jan 01 00:00:00 MST 1980 xl/worksheets/sheet2.xml
518 Tue Jan 01 00:00:00 MST 1980 xl/worksheets/sheet3.xml
1882 Tue Jan 01 00:00:00 MST 1980 xl/drawings/vmlDrawing1.vml
1748 Tue Jan 01 00:00:00 MST 1980 xl/sharedStrings.xml
3998 Tue Jan 01 00:00:00 MST 1980 xl/styles.xml
3530 Tue Jan 01 00:00:00 MST 1980 xl/worksheets/sheet1.xml
994 Tue Jan 01 00:00:00 MST 1980 xl/comments1.xml
364 Tue Jan 01 00:00:00 MST 1980 xl/printerSettings/printerSettings1.bin
1028 Tue Jan 01 00:00:00 MST 1980 docProps/core.xml
866 Tue Jan 01 00:00:00 MST 1980 docProps/app.xml


All of the above has been background information on the new Office Open XML format employed by Office 2007. Now, here is the Java code that uses OpenXML4J to extract the properties from a PowerPoint presentation, a Word document, and an Excel spreadsheet.


package marx.openxml4j;

import org.openxml4j.exceptions.InvalidFormatException;

import org.openxml4j.opc.Package;
import org.openxml4j.opc.PackageAccess;
import org.openxml4j.opc.PackageProperties;

import org.openxml4j.util.Nullable;

/**
* Example demonstrating how to use OpenXML4J to read properties from Office
* 2007 formatted files (Excel, Word, and PowerPoint).
*/
public class OpenXML4JExample
{
/**
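* Read and print the core properties of a document stored in the Office Open
* XML format used by Office 2007 products (PowerPoint, Word, or Excel).
*
* @param aFilePathAndName Path and name of Office 2007 compatible file.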
*/
public void readOfficeDocumentProperties(final String aFilePathAndName)
{
try
{
Package pkg = Package.open(aFilePathAndName, PackageAccess.READ);
PackageProperties properties = pkg.getPackageProperties();
final Nullable<String> title = properties.getTitleProperty();
final Nullable<String> language = properties.getLanguageProperty();
final Nullable<String> category = properties.getCategoryProperty();
final Nullable<String> keywords = properties.getKeywordsProperty();
final Nullable<String> subject = properties.getSubjectProperty();
final Nullable<String> description = properties.getDescriptionProperty();
final Nullable<String> version = properties.getVersionProperty();
final Nullable<String> creator = properties.getCreatorProperty();
if ( title.hasValue() )
{
System.out.println( aFilePathAndName + " has a title of '"
+ title.getValue() + "'");
}
if ( language.hasValue() )
{
System.out.println( aFilePathAndName + " uses the language "
+ language.getValue() );
}
if ( category.hasValue() )
{
System.out.println( aFilePathAndName + " is in the category of '"
+ category.getValue() + "'");
}
if ( keywords.hasValue() )
{
System.out.println( aFilePathAndName + " has the keywords: "
+ keywords.getValue() );
}
if ( subject.hasValue() )
{
System.out.println( aFilePathAndName + "'s subject is "
+ subject.getValue() );
}
if ( description.hasValue() )
{
System.out.println( aFilePathAndName + "'s description is "
+ description.getValue() );
}
if ( version.hasValue() )
{
System.out.println( aFilePathAndName + " is version "
+ version.getValue() );
}
if ( creator.hasValue() )
{
System.out.println( aFilePathAndName + "'s creator is "
+ creator.getValue() );
}
}
catch (InvalidFormatException invalidFormatEx) // checked exception
{
System.err.println( "Invalid format (looking for Office 2007 format): "
+ invalidFormatEx.getMessage() );
}
}

/**
* Print a separator to stdout to delineate example file being run.
*
* @param aSeparatorTitle Title to place in separator header.
*/
public static void printSeparatorHeader(final String aSeparatorTitle)
{
System.out.println(
"-------------------------------------------------------------------");
System.out.println(
"-- " + aSeparatorTitle );
System.out.println(
"-------------------------------------------------------------------");
}

/**
* Run test showing how to extract Office 2007 format properties from a
* PowerPoint presentation, a Word document, and an Excel spreadsheet using
* the OpenXML4J library (currently in Alpha).
*
* @param aCommandLineArgs Command-line arguments.
*/
public static void main(final String[] aCommandLineArgs)
{
String officeFilePathAndName;
OpenXML4JExample me = new OpenXML4JExample();

// Example Microsoft Office 2007 PowerPoint presentation
officeFilePathAndName = "C:\\sample2007\\marx-poi.pptx";
printSeparatorHeader("PowerPoint Example: " + officeFilePathAndName);
me.readOfficeDocumentProperties(officeFilePathAndName);

// Example Microsoft Office 2007 Word document
officeFilePathAndName = "C:\\sample2007\\marx-poi.docx";
printSeparatorHeader("Word Example: " + officeFilePathAndName);
me.readOfficeDocumentProperties(officeFilePathAndName);

// Example Microsoft Office 2007 Excel spreadsheet
officeFilePathAndName = "C:\\sample2007\\2008Conferences.xlsx";
printSeparatorHeader("Excel Example: " + officeFilePathAndName);
me.readOfficeDocumentProperties(officeFilePathAndName);
}
}


As you can see, the OpenXML4J API for accessing properties is extremely straightforward. I did not get all properties, but did show examples of obtaining the more important Office properties. The code sample also demonstrates the need to catch the checked exception org.openxml4j.exceptions.InvalidFormatException.

The next screen shot (click on it to see larger image) shows the output from running the code above against the three Office 2007-compatible (Office Open XML format) documents described earlier in this blog.



This output demonstrates some useful things to know when using OpenXML4J. For example, I needed to place a log4j JAR file and a dom4j JAR file on the classpath when using the OpenXML4J library. In this case, I knew that the Spring framework download with dependencies contains these libraries and so I simply used those versions. The screen shot also shows that I used the Alpha version of OpenXML4J released on 21 January 2008 (openxml4j-bin-alpha-080121.jar).

There is, of course, much more to OpenXML4J than the extraction of properties from Office 2007 files, and we can expect much more to come as it matures toward production quality.