Thursday, 12 January 2012

Approaches to XML - Part 4 - XMLBeans

If you remember from my previous blogs, I’m covering different approaches to parsing XML messages using the outrageously corny scenario of Pete’s Perfect Pizza, the pizza company with big ideas. In this story, you are an employee of Pete’s and have been asked to implement a system for sending orders from the front desk to the kitchen and you came up with the idea of using XML. You’ve just got your SAX Parser working, but Pete’s going global, opening kitchens around the world taking orders using the Internet.

But, hang on a minute... didn’t I say this in my last blog? Déjà vu? Today’s blog is the alternative reality version of my JAXB blog as the scenario remains the same, but the solution changes. Instead of demonstrating JAXB, I’ll be investigating XMLBeans.

So, Pete’s hired some consultants who’ve come up with a plan for extending your cosy XML message and they’ve specified it using a schema. They’ve also enhanced your message by adding in one of their own customer schemas. The result is that the following XSD files land in your inbox and you need to get busy...

<?xml version="1.0" encoding="UTF-8"?>
<!-- edited with XMLSpy v2011 sp1 (http://www.altova.com) by Roger Hughes (Marin Solutions Ltd) -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:ppp="http://www.petesperfectpizza.com" xmlns:cust="http://customer.dets" targetNamespace="http://www.petesperfectpizza.com" elementFormDefault="qualified" attributeFormDefault="unqualified" version="1.00">
 <!-- Import the Namespaces required -->
 <xs:import namespace="http://customer.dets" schemaLocation="customer.xsd"/>
 <!-- The Root Node -->
 <xs:element name="PizzaOrder">
  <xs:annotation>
   <xs:documentation>A wrapper around the customer and the pizza order</xs:documentation>
  </xs:annotation>
  <xs:complexType>
   <xs:sequence>
    <xs:element name="orderID" type="ppp:CorrelationIdentifierType"/>
    <xs:element name="date" type="ppp:DateType"/>
    <xs:element name="time" type="ppp:TimeType"/>
    <xs:element name="Customer" type="cust:CustomerType"/>
    <xs:element ref="ppp:pizzas"/>
   </xs:sequence>
  </xs:complexType>
 </xs:element>
 <!-- The Pizza Order-->
 <xs:element name="pizzas">
  <xs:annotation>
   <xs:documentation>This is a list of pizzas ordered by the customer</xs:documentation>
  </xs:annotation>
  <xs:complexType>
   <xs:sequence>
    <xs:element name="pizza" type="ppp:PizzaType"  minOccurs="1" maxOccurs="unbounded"/>
   </xs:sequence>
  </xs:complexType>
 </xs:element>
 <xs:complexType name="PizzaType">
  <xs:sequence>
   <xs:element name="name" type="ppp:PizzaNameType">
    <xs:annotation>
     <xs:documentation>The type of pizza on the menu</xs:documentation>
    </xs:annotation>
   </xs:element>
   <xs:element name="base" type="ppp:BaseType">
    <xs:annotation>
     <xs:documentation>type of base</xs:documentation>
    </xs:annotation>
   </xs:element>
   <xs:element name="quantity" type="ppp:QuantityType">
    <xs:annotation>
     <xs:documentation>quantity of pizzas</xs:documentation>
    </xs:annotation>
   </xs:element>
  </xs:sequence>
 </xs:complexType>
 <xs:simpleType name="PizzaNameType">
  <xs:restriction base="xs:token">
   <xs:enumeration value="Margherita">
    <xs:annotation>
     <xs:documentation>Plain and Simple</xs:documentation>
    </xs:annotation>
   </xs:enumeration>
   <xs:enumeration value="Marinara">
    <xs:annotation>
     <xs:documentation>Garlic Pizza...</xs:documentation>
    </xs:annotation>
   </xs:enumeration>
   <xs:enumeration value="Prosciutto e Funghi">
    <xs:annotation>
     <xs:documentation>Ham and Mushroom</xs:documentation>
    </xs:annotation>
   </xs:enumeration>
   <xs:enumeration value="Capricciosa">
    <xs:annotation>
     <xs:documentation>with an egg</xs:documentation>
    </xs:annotation>
   </xs:enumeration>
  </xs:restriction>
 </xs:simpleType>
 <xs:simpleType name="BaseType">
  <xs:restriction base="xs:token">
   <xs:enumeration value="thin">
    <xs:annotation>
     <xs:documentation>thin base traditional</xs:documentation>
    </xs:annotation>
   </xs:enumeration>
   <xs:enumeration value="thick">
    <xs:annotation>
     <xs:documentation>Thick base</xs:documentation>
    </xs:annotation>
   </xs:enumeration>
  </xs:restriction>
 </xs:simpleType>
 <xs:simpleType name="QuantityType">
  <xs:restriction base="xs:nonNegativeInteger"/>
 </xs:simpleType>
 <xs:simpleType name="CorrelationIdentifierType">
  <xs:restriction base="xs:token">
   <xs:maxLength value="44"/>
  </xs:restriction>
 </xs:simpleType>
 <xs:simpleType name="DateType">
  <xs:annotation>
   <xs:documentation>The date is in the Common Era (minus sign in years is not permitted)</xs:documentation>
  </xs:annotation>
  <xs:restriction base="xs:date">
   <xs:pattern value="\d{4}-\d{2}-\d{2}"/>
  </xs:restriction>
 </xs:simpleType>
 <xs:simpleType name="TimeType">
  <xs:annotation>
   <xs:documentation>The time zone although not included UTC is implied</xs:documentation>
  </xs:annotation>
  <xs:restriction base="xs:time">
   <xs:pattern value="\d{2}:\d{2}:\d{2}(\.\d+)?"/>
  </xs:restriction>
 </xs:simpleType>
</xs:schema>

<?xml version="1.0" encoding="UTF-8"?>
<!-- edited with XMLSpy v2011 sp1 (http://www.altova.com) by Roger Hughes (Marin Solutions Ltd) -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:cust="http://customer.dets" targetNamespace="http://customer.dets" elementFormDefault="qualified" attributeFormDefault="unqualified">
 <xs:element name="Customer" type="cust:CustomerType">
  <xs:annotation>
   <xs:documentation>Generic Customer Definition</xs:documentation>
  </xs:annotation>
 </xs:element>
 <xs:complexType name="CustomerType">
  <xs:sequence>
   <xs:element name="name" type="cust:NameType"/>
   <xs:element name="phone" type="cust:PhoneNumberType"/>
   <xs:element name="address" type="cust:AddressType"/>
  </xs:sequence>
 </xs:complexType>
 <xs:complexType name="NameType">
  <xs:sequence>
   <xs:element name="firstName" type="cust:FirstNameType"/>
   <xs:element name="lastName" type="cust:LastNameType"/>
  </xs:sequence>
 </xs:complexType>
 <xs:simpleType name="FirstNameType">
  <xs:annotation>
   <xs:documentation>The Customer's first name</xs:documentation>
  </xs:annotation>
  <xs:restriction base="xs:token">
   <xs:maxLength value="16"/>
   <xs:pattern value=".{1,16}"/>
  </xs:restriction>
 </xs:simpleType>
 <xs:simpleType name="LastNameType">
  <xs:annotation>
   <xs:documentation>The Customer's surname</xs:documentation>
  </xs:annotation>
  <xs:restriction base="xs:token">
   <xs:pattern value=".{1,48}"/>
  </xs:restriction>
 </xs:simpleType>
 <xs:complexType name="AddressType">
  <xs:sequence>
   <xs:element name="houseNumber" type="cust:HouseNumberType"/>
   <xs:element name="street" type="cust:AddressLineType"/>
   <xs:element name="town" type="cust:AddressLineType" minOccurs="0"/>
   <xs:element name="area" type="cust:AddressLineType" minOccurs="0"/>
   <xs:element name="postCode" type="cust:PostCodeType"/>
  </xs:sequence>
 </xs:complexType>
 <xs:simpleType name="HouseNumberType">
  <xs:annotation>
   <xs:documentation>The house number</xs:documentation>
  </xs:annotation>
  <xs:restriction base="xs:nonNegativeInteger"/>
 </xs:simpleType>
 <xs:simpleType name="AddressLineType">
  <xs:annotation>
   <xs:documentation>A line of an address</xs:documentation>
  </xs:annotation>
  <xs:restriction base="xs:token">
   <xs:pattern value=".{1,100}"/>
  </xs:restriction>
 </xs:simpleType>
 <xs:simpleType name="PhoneNumberType">
  <xs:restriction base="xs:token">
   <xs:maxLength value="18"/>
   <xs:pattern value=".{1,18}"/>
  </xs:restriction>
 </xs:simpleType>
 <xs:simpleType name="PostCodeType">
  <xs:restriction base="xs:token">
   <xs:maxLength value="10"/>
  </xs:restriction>
 </xs:simpleType>
</xs:schema>
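
To get a feel for what the facets in these schemas actually enforce, here is a minimal sketch that uses only the JDK's own validator rather than XMLBeans. It's an illustration under assumptions, not part of the pizza project: BaseType and DateType are copied from the pizza schema, but the check wrapper element and the FacetSketch class are made up for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class FacetSketch {

  private static final String NS = "http://www.petesperfectpizza.com";

  // Cut-down schema: BaseType and DateType are copied from the pizza XSD above;
  // the <check> wrapper element is hypothetical and exists only for this sketch.
  private static final String XSD =
      "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema' xmlns:ppp='" + NS + "'"
      + " targetNamespace='" + NS + "' elementFormDefault='qualified'>"
      + "<xs:element name='check'><xs:complexType><xs:sequence>"
      + "<xs:element name='date' type='ppp:DateType'/>"
      + "<xs:element name='base' type='ppp:BaseType'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "<xs:simpleType name='BaseType'><xs:restriction base='xs:token'>"
      + "<xs:enumeration value='thin'/><xs:enumeration value='thick'/>"
      + "</xs:restriction></xs:simpleType>"
      + "<xs:simpleType name='DateType'><xs:restriction base='xs:date'>"
      + "<xs:pattern value='\\d{4}-\\d{2}-\\d{2}'/>"
      + "</xs:restriction></xs:simpleType>"
      + "</xs:schema>";

  /** Builds a tiny instance document for the hypothetical <check> element. */
  static String message(String date, String base) {
    return "<check xmlns='" + NS + "'><date>" + date + "</date><base>" + base + "</base></check>";
  }

  /** Returns true if the XML validates against the cut-down schema. */
  static boolean isValid(String xml) throws IOException, SAXException {
    SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
    Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
    try {
      schema.newValidator().validate(new StreamSource(new StringReader(xml)));
      return true;
    } catch (SAXException e) {
      return false; // with no ErrorHandler set, a validation error is thrown
    }
  }

  public static void main(String[] args) throws Exception {
    System.out.println(isValid(message("2012-01-12", "thin")));  // matches both facets
    System.out.println(isValid(message("2012-01-12Z", "thin"))); // time zone suffix breaks the pattern
    System.out.println(isValid(message("2012-01-12", "deep")));  // "deep" is not an enumerated base
  }
}
```

The point is simply that the pattern on DateType rejects a date carrying a time zone, and the enumeration on BaseType rejects anything other than thin or thick. Hand-coding every one of these checks in a SAX handler is where the effort, and the mistakes, would come from.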

You realise that with this level of complexity, you’ll be messing around with SAX for a long time, and you could also make a few mistakes. There must be a better way, right? After all, XML has been around for some time, so there must be a few frameworks around that could be useful. After a bit more Googling you come across XMLBeans and realise that there are...

XMLBeans uses a special compiler to convert an XML schema into a bunch of related Java classes that define the types required to access the XML elements, attributes and other content in a type-safe way. This blog isn’t a tutorial covering the ins and outs of XMLBeans; that can be found in the Apache XMLBeans documentation. Suffice to say that the key idea for parsing, or unmarshalling, XML is that you compile your schemas into Java classes using XMLBeans and then use those classes in your application.

In using any XML schema to Java class compiler, the neatest approach is to put all your schemas and the compiler set-up in a separate project that builds its own JAR file. You can mix them in with your application’s source code, but that usually clouds the code base, making maintenance more difficult. In creating an XMLBeans JAR file, you may come up with a POM file that looks something like this:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.captaindebug</groupId>
    <artifactId>xml-tips-xmlbeans</artifactId>
    <packaging>jar</packaging>
    <version>1.0-SNAPSHOT</version>
    <name>XML Beans for Pete's Perfect Pizza</name>
    <dependencies>
        <dependency>
            <groupId>org.apache.xmlbeans</groupId>
            <artifactId>xmlbeans</artifactId>
            <version>2.4.0</version>
        </dependency>
    </dependencies>
    <build>
        <plugins>
            <plugin>
                <groupId>org.codehaus.mojo</groupId>
                <artifactId>xmlbeans-maven-plugin</artifactId>
                <executions>
                    <execution>
                        <goals>
                            <goal>xmlbeans</goal>
                        </goals>
                    </execution>
                </executions>
                <inherited>true</inherited>
                <configuration>
                    <schemaDirectory>src/main/resources</schemaDirectory>
                </configuration>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>2.3.2</version>
                <configuration>
                    <source>1.6</source>
                    <target>1.6</target>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>

...which is very straightforward. So, getting back to Pete’s Perfect Pizza, you’ve created your XMLBeans JAR file and all that’s left to do is to explore how it works, as demonstrated in the JUnit test below:

import static org.junit.Assert.assertEquals;
import static org.junit.Assert.fail;

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.math.BigInteger;

import org.apache.xmlbeans.XmlException;
import org.junit.Test;

// Also import the XMLBeans-generated classes (PizzaOrderDocument, PizzaOrder, Pizzas,
// PizzaType, PizzaNameType, BaseType, CustomerType, NameType and AddressType) from the
// packages that the compiler derives from the two schema namespaces.

public class PizzaXmlBeansTest {

  private PizzaOrderDocument instance;

  @Test
  public void testLoadPizzaOrderXml() throws IOException, XmlException {

    // Step 1: load the test message as a String
    String xml = loadResource("/pizza-order1.xml");

    // Step 2: parse it into a document object using the generated Factory
    instance = PizzaOrderDocument.Factory.parse(xml);

    // Step 3: walk the typed object graph and check the values
    PizzaOrder order = instance.getPizzaOrder();

    String orderId = order.getOrderID();
    assertEquals("123w3454r5", orderId);

    // Check the customer details...
    CustomerType customerType = order.getCustomer();

    NameType nameType = customerType.getName();
    String firstName = nameType.getFirstName();
    assertEquals("John", firstName);
    String lastName = nameType.getLastName();
    assertEquals("Miggins", lastName);

    AddressType address = customerType.getAddress();
    assertEquals(new BigInteger("15"), address.getHouseNumber());
    assertEquals("Credability Street", address.getStreet());
    assertEquals("Any Town", address.getTown());
    assertEquals("Any Where", address.getArea());
    assertEquals("AW12 3WS", address.getPostCode());

    Pizzas pizzas = order.getPizzas();
    PizzaType[] pizzasOrdered = pizzas.getPizzaArray();

    assertEquals(3, pizzasOrdered.length);

    // Check the pizza order...
    for (PizzaType pizza : pizzasOrdered) {

      PizzaNameType.Enum pizzaName = pizza.getName();
      if ((PizzaNameType.CAPRICCIOSA == pizzaName) || (PizzaNameType.MARINARA == pizzaName)) {
        assertEquals(BaseType.THICK, pizza.getBase());
        assertEquals(new BigInteger("1"), pizza.getQuantity());
      } else if (PizzaNameType.PROSCIUTTO_E_FUNGHI == pizzaName) {
        assertEquals(BaseType.THIN, pizza.getBase());
        assertEquals(new BigInteger("2"), pizza.getQuantity());
      } else {
        fail("Whoops, can't find pizza type");
      }
    }
  }

  // Loads a classpath resource and returns its contents as a String
  private String loadResource(String filename) throws IOException {

    InputStream is = getClass().getResourceAsStream(filename);
    if (is == null) {
      throw new IOException("Can't find the file: " + filename);
    }

    return toString(is);
  }

  // Note: bos.toString() decodes using the platform default charset
  private String toString(InputStream is) throws IOException {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    copyStreams(is, bos);

    return bos.toString();
  }

  private void copyStreams(InputStream is, OutputStream os) throws IOException {
    byte[] buf = new byte[1024];

    int c;
    while ((c = is.read(buf, 0, 1024)) != -1) {
      os.write(buf, 0, c);
      os.flush();
    }
  }

}

The code above may look long and complex, but it really only comprises three steps. Firstly, turn the test file into a suitable type such as a String or InputStream (XMLBeans can handle several different input types). Then use the nested Factory class to process your XML source, turning it into a document object. Finally, use the returned document object to test that your results are what you'd expected them to be (this is by far the largest step). Once you’re happy with the largely boilerplate usage of XMLBeans, you add it into your Pete's Perfect Pizza kitchen XML parser code and distribute it around the world to Pete’s many pizza kitchens.
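
One small wrinkle in that first step: converting bytes to a String without naming a character set uses whatever the platform default happens to be, which could mangle accented characters on some machines. Below is a minimal JDK-only sketch of a safer helper, assuming the order file is UTF-8 like the schemas; the ResourceText class and readUtf8 method are hypothetical names, not part of the project.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

class ResourceText {

  /** Reads the whole stream and decodes it as UTF-8, closing the stream afterwards. */
  static String readUtf8(InputStream is) throws IOException {
    try {
      ByteArrayOutputStream bos = new ByteArrayOutputStream();
      byte[] buf = new byte[1024];
      int count;
      while ((count = is.read(buf)) != -1) {
        bos.write(buf, 0, count);
      }
      // Name the charset explicitly rather than relying on the platform default
      return bos.toString("UTF-8");
    } finally {
      is.close();
    }
  }

  public static void main(String[] args) throws IOException {
    // Stand-in for a classpath resource: a fragment containing accented characters
    byte[] bytes = "<firstName>Ren\u00e9e</firstName>".getBytes("UTF-8");
    System.out.println(readUtf8(new ByteArrayInputStream(bytes)));
  }
}
```

Alternatively, since XMLBeans accepts several input types, you could probably skip the String altogether and hand the InputStream to the Factory, leaving the parser to work out the encoding from the XML declaration.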

One of the strong points of using a framework like XMLBeans is that if there’s ever a change to the schema, then all that’s required to incorporate those changes is to recompile, fixing up your client code accordingly. This may seem a bit of a headache, but it’s a much smaller headache than trying to rework a SAX parser. On the downside, XMLBeans has been criticised for being slow, but I’ve never had too many problems. It will theoretically use more memory than SAX, though this may or may not be true in practice: it does build sets of classes, but then again so do some SAX ContentHandler derived classes.

Finally, it should be noted that there hasn’t been a release of XMLBeans since 2009, which may or may not be a good thing depending upon your viewpoint, though I should emphasise that this code certainly isn’t redundant as, to my certain knowledge, it is still widely used on a number of large scale projects.


Other blogs in this series...
  1. XML is not a String
  2. What about Sax
  3. JAXB
  4. XMLBeans

The source code is available from GitHub at:

git://github.com/roghughe/captaindebug.git


2 comments:

gpol said...

I just wanted to tell you that your blog is perfect. It has a tutorial flavor but not for beginners. I am a software architect myself and very often direct my devs to your blog.
Cheers!

Roger Hughes said...

Thanks for the compliment.