Tuesday, 24 January 2012

Approaches to XML - A Review

A couple of weeks ago I wrote a short series of articles about XML, highlighting several possible approaches to parsing it in a couple of different situations. The blogs in the series were:

  1. XML is not a String
  2. What about Sax
  3. JAXB
  4. XMLBeans

The story I used was Pete’s Perfect Pizza, about a small high street pizza company that grows into an international business. The idea was that the fictitious programmer of the piece, our hero, is an absolute beginner who jumps straight in and chooses XML without considering any other design options.


The points I wanted to make using the articles were:
  1. XML is not a string; it’s a document object model that can be serialized into, and therefore represented by, a string.
  2. That the same XML document could be serialized into a string in different ways without changing its meaning.
  3. SAX is pretty good in some scenarios, for example where the documents are not particularly complex, or where you have a high throughput of large messages and constructing a DOM for each message would simply use up all the JVM’s memory.
  4. That there are very useful and rich frameworks available such as JAXB and XMLBeans that do all the hard work of parsing XML for you.
  5. That your project’s approach to XML can depend upon its scale and circumstances.
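
Points 1 and 2 can be sketched in a few lines of Java. This is only an illustration using the JDK's built-in DOM parser: the class name XmlEquivalence and its methods are made up for this example, and the attribute-based pizza markup is a hypothetical variant chosen because it makes the textual differences easy to see.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not taken from the original articles.
final class XmlEquivalence {

    // Parses an XML string into a DOM tree; checked exceptions are wrapped for brevity.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // True when both strings describe the same element tree.
    static boolean sameDocument(String first, String second) {
        return parse(first).getDocumentElement()
                .isEqualNode(parse(second).getDocumentElement());
    }

    public static void main(String[] args) {
        // Same content, different quoting, attribute order and empty-element style.
        String a = "<pizza name=\"Capricciosa\" base=\"thin\"/>";
        String b = "<pizza base='thin'   name='Capricciosa'></pizza>";
        System.out.println("Equal as strings:   " + a.equals(b));
        System.out.println("Equal as documents: " + sameDocument(a, b));
    }
}
```

The two strings differ character by character, yet they should parse to the same tree, which is why comparing or picking apart XML with string operations tends to go wrong.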

These blogs demonstrated a few XML concepts using a simplistic scenario, but in reality, shouldn't our hero have asked some pretty far-reaching and fundamental design questions? For example:

How many pizzas will Pete be selling, or in more technical terms, what’s the message rate between the front desk and the kitchen?

Is this a ‘push’ or a ‘pull’ system? i.e. does the front desk send messages to the kitchen or does the kitchen ask for them?

Do messages need to be queued? If ‘push’, then perhaps a queue at the kitchen end would be better; if ‘pull’, then perhaps a queue at the front desk.
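
As a rough sketch of the ‘push’ case, the queue at the kitchen end could look something like the class below. It assumes an in-process java.util.concurrent queue standing in for whatever transport the real system would use, and the name KitchenQueue and its methods are invented for illustration.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical kitchen-side queue; a real system would sit behind some transport.
final class KitchenQueue {

    // Orders wait here, oldest first, until the kitchen is ready for them.
    private final BlockingQueue<String> orders = new LinkedBlockingQueue<>();

    // 'Push': the front desk hands an order over and carries on.
    void push(String orderXml) {
        orders.add(orderXml);
    }

    // The kitchen takes the oldest waiting order, or null if there is none.
    String nextOrder() {
        return orders.poll();
    }

    // Number of orders still waiting to be cooked.
    int waiting() {
        return orders.size();
    }
}
```

In the ‘pull’ case the same structure would simply live at the front desk, with the kitchen calling nextOrder() remotely when it is ready.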

Is XML the best choice for transporting this type of data? Given the sample document structure below, probably not.

<?xml version="1.0" encoding="UTF-8"?>
<pizza>
    <name>Capricciosa</name>
    <base>thin</base>
    <quantity>2</quantity>
</pizza>

If you look at the simple structure of the XML, you may conclude that this is a data structure rather than a document structure, which may lead you to think that perhaps JSON would have been a better choice. But if the hero of the piece hadn’t chosen XML, then the blog would not have been about XML...
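
To see how little document structure there is, here is a minimal sketch that uses the JDK's SAX parser to flatten the sample document above into a plain name/value map, which is roughly what a JSON object would have given directly. The class name PizzaFields is hypothetical, and the handler assumes every child of <pizza> holds only text.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, assuming a flat <pizza> document with text-only children.
final class PizzaFields {

    // Collects the text of each child element of <pizza> into a name/value map.
    static Map<String, String> read(String xml) {
        Map<String, String> fields = new LinkedHashMap<>();
        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                text.setLength(0); // start collecting text for this element
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length); // may be called several times per element
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (!"pizza".equals(qName)) {
                    fields.put(qName, text.toString().trim());
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return fields;
    }
}
```

Calling PizzaFields.read(...) on the sample should yield three entries: name, base and quantity. There is no nesting, no mixed content and no attributes, which is the sense in which this is data rather than a document.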

What’s the best transport mechanism? The original high street scenario demonstrated a small scale, very cosy, in-house type project, whilst the international system tried to demonstrate a large scale system, so how should you connect the pieces? Simple RMI? EJBs, or JMS for queueing? A web service?

This blog isn’t about answering all these questions; it’s just a quick demonstration that they really should be asked before undertaking even a simple project like Pete’s Perfect Pizza. Finally, remember that this was a ‘green field’ project, a luxury which most of us don’t have that often, so our decisions and the questions we can ask are usually constrained by the code and systems that already exist.
