
February 14, 2008

Parsing JSON into XQuery

Doug Crockford stirred things up at XML 2007 with his comments on JSON and XML. He was wrong - mainly because he's only looking at structured data transfer, and ignoring the other things that XML is very good at, like documents and semi-structured data. However, he got me thinking about how easy it would be to process JSON in XQuery, which, as Data Direct has shown, is very good at manipulating all sorts of data formats.

I headed over to json.org to take a look at the work that had already been done on converting JSON to and from XML. Then I googled, and read around. I even had an email conversation with Dimitre Novatchev of FXSL fame, who's written a JSON parser entirely in XSLT!

All of the designs I looked at had at least one of the following problems:

  1. They could only convert a subset of JSON - things like map keys that weren't valid NCNames would cause problems.
  2. They didn't specify a 1-1 mapping - in other cases map keys were munged to be valid NCNames using a function with no inverse. This would mean that I wouldn't be able to convert back to JSON from its XML representation.
  3. They lost the JSON type information, like whether a value was null or an empty string - which is also a way in which the mapping is not 1-1.
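
To make problem 2 concrete, here is a small Python sketch. The munge_key function is hypothetical: it is not taken from any of the designs I looked at, it just stands in for the general "replace the illegal characters" approach, and it only handles ASCII names.

```python
import re

def munge_key(key: str) -> str:
    """Hypothetical lossy mapping from a JSON key to an NCName-like name."""
    # Simplification: keep only ASCII letters, digits, '.', '-' and '_'.
    name = re.sub(r"[^A-Za-z0-9._-]", "_", key)
    # An NCName cannot be empty or start with a digit, '.' or '-'.
    if name == "" or re.match(r"[0-9.-]", name):
        name = "_" + name
    return name

keys = ["first name", "first_name", "first/name"]
munged = {key: munge_key(key) for key in keys}

# Three distinct JSON keys collapse to one element name, so no inverse
# function can tell which key the XML originally came from.
for key, name in munged.items():
    print(repr(key), "->", name)
```

Keeping the key as an attribute value instead of an element name avoids the collision entirely, because attribute values can hold any string.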

So, given that nothing fitted, I chose to come up with my own mapping from JSON to XML - which is, after all, the JSON way. So here it is, in all its simplicity:

JSON                               | type(JSON) | toXML(JSON)
-----------------------------------+------------+-----------------------------------------------------------
JSON                               | N/A        | <json type="type(JSON)">toXML(JSON)</json>
{ "key1": value1, "key2": value2 } | object     | <pair name="key1" type="type(value1)">toXML(value1)</pair>
                                   |            | <pair name="key2" type="type(value2)">toXML(value2)</pair>
[ value1, value2 ]                 | array      | <item type="type(value1)">toXML(value1)</item>
                                   |            | <item type="type(value2)">toXML(value2)</item>
"value"                            | string     | value
number                             | number     | number
true / false                       | boolean    | true / false
null                               | null       | empty

The table defines two abstract functions, "type" and "toXML", which are recursively defined on the structure of the input JSON. The extension functions to parse and serialize JSON are called xqilla:parse-json() and xqilla:serialize-json(), and will be available in the next release of XQilla.

xqilla:parse-json($xml as xs:string?) as element()?
xqilla:serialize-json($json-xml as element()?) as xs:string?
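
For readers who want to see the recursion spelled out, here is a minimal Python sketch of the "type" and "toXML" functions from the table. It is only an illustration of the mapping using the standard json and xml.etree.ElementTree modules, not how XQilla implements it; the names json_type, fill and json_to_xml are made up for this example.

```python
import json
import xml.etree.ElementTree as ET

def json_type(value):
    """The type(JSON) column of the mapping table."""
    if value is None:
        return "null"
    # bool must be tested before numbers: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    raise TypeError(f"not a JSON value: {value!r}")

def fill(element, value):
    """The toXML(JSON) column: set the type attribute and content of element."""
    kind = json_type(value)
    element.set("type", kind)
    if kind == "object":
        for key, child in value.items():
            # The key goes in an attribute, so it never has to be an NCName.
            fill(ET.SubElement(element, "pair", {"name": key}), child)
    elif kind == "array":
        for child in value:
            fill(ET.SubElement(element, "item"), child)
    elif kind == "string":
        element.text = value
    elif kind != "null":
        # Numbers and booleans keep their JSON spelling; null stays empty.
        element.text = json.dumps(value)

def json_to_xml(text):
    """Parse a JSON string and wrap the result in a <json> element."""
    root = ET.Element("json")
    fill(root, json.loads(text))
    return root

root = json_to_xml('{"a": null, "b": "", "n": 10021, "t": true, "xs": ["x", 1.5]}')
print(ET.tostring(root, encoding="unicode"))
```

Note that null and the empty string both produce an empty element, and it is only the type attribute that tells them apart.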

The translation produces a simple generic XML document - as an example, here's a query to parse a sample of JSON (swiped from wikipedia):

xqilla:parse-json('{
     "firstName": "John",
     "lastName": "Smith",
     "address": {
         "streetAddress": "21 2nd Street",
         "city": "New York",
         "state": "NY",
         "postalCode": 10021
     },
     "phoneNumbers": [
         "212 732-1234",
         "646 123-4567"
     ]
 }')

And here's its translation, the result of the query:

<json type='object'>
  <pair name='firstName' type='string'>John</pair>
  <pair name='lastName' type='string'>Smith</pair>
  <pair name='address' type='object'>
    <pair name='streetAddress' type='string'>21 2nd Street</pair>
    <pair name='city' type='string'>New York</pair>
    <pair name='state' type='string'>NY</pair>
    <pair name='postalCode' type='number'>10021</pair>
  </pair>
  <pair name='phoneNumbers' type='array'>
    <item type='string'>212 732-1234</item>
    <item type='string'>646 123-4567</item>
  </pair>
</json>
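
Because the type attributes are kept, this document carries enough information to get back to the original JSON. Here is a rough Python sketch of the reverse direction, run against the result above. It uses only the standard library, the function name from_xml is invented for the example, and it is not the implementation of xqilla:serialize-json().

```python
import json
import xml.etree.ElementTree as ET

def from_xml(element):
    """Rebuild a JSON value from an element of the mapping, driven by @type."""
    kind = element.get("type")
    if kind == "object":
        return {pair.get("name"): from_xml(pair) for pair in element.findall("pair")}
    if kind == "array":
        return [from_xml(item) for item in element.findall("item")]
    if kind == "string":
        # An empty element of type string is the empty string, not null.
        return element.text or ""
    if kind == "number":
        return json.loads(element.text)
    if kind == "boolean":
        return element.text.strip() == "true"
    if kind == "null":
        return None
    raise ValueError(f"unknown type attribute: {kind!r}")

SAMPLE = """<json type='object'>
  <pair name='firstName' type='string'>John</pair>
  <pair name='lastName' type='string'>Smith</pair>
  <pair name='address' type='object'>
    <pair name='streetAddress' type='string'>21 2nd Street</pair>
    <pair name='city' type='string'>New York</pair>
    <pair name='state' type='string'>NY</pair>
    <pair name='postalCode' type='number'>10021</pair>
  </pair>
  <pair name='phoneNumbers' type='array'>
    <item type='string'>212 732-1234</item>
    <item type='string'>646 123-4567</item>
  </pair>
</json>"""

data = from_xml(ET.fromstring(SAMPLE))
print(json.dumps(data, indent=2))
```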

One of the nice things about the XML format for JSON is that it's easy to navigate. If I want to get the city from the JSON object above, the XQuery for it would be:

xqilla:parse-json("...")/pair[@name="address"]/pair[@name="city"]

Or if I want to get both the phone numbers I could use:

xqilla:parse-json("...")/pair[@name="phoneNumbers"]/item
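
The same two lookups work in any tool that understands simple XPath predicates, since the format is plain XML. As a sketch, here they are in Python with the limited XPath support of xml.etree.ElementTree, run over an excerpt of the result document above (the variable names are just for illustration):

```python
import xml.etree.ElementTree as ET

# Excerpt of the <json> document produced by the query above.
RESULT = """<json type='object'>
  <pair name='address' type='object'>
    <pair name='city' type='string'>New York</pair>
    <pair name='state' type='string'>NY</pair>
  </pair>
  <pair name='phoneNumbers' type='array'>
    <item type='string'>212 732-1234</item>
    <item type='string'>646 123-4567</item>
  </pair>
</json>"""

root = ET.fromstring(RESULT)

# Paths are relative to the <json> element, as in the XQuery versions.
city = root.find("pair[@name='address']/pair[@name='city']")
phones = root.findall("pair[@name='phoneNumbers']/item")

print(city.text)
print([item.text for item in phones])
```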

Now I probably have to do a follow up post about all the cool things you can do in XQuery when you can parse JSON...

Posted by john at 05:21 PM | Comments (6)

February 12, 2008

DB XML with Python

At Oxford Geek Night 5 last week I bumped into James Gardner, apparently now known as a Pylons Guru, and a friend of mine from university.

I encouraged him to take a look at Berkeley DB XML's Python bindings, which he has done. His initial experience is written up in his latest blog post, which is worth reading if you're looking to get DB XML working from Python.

I said the same thing to James as I did to Greg Pollack when I met him at XML 2007 - why don't agile web frameworks like Pylons and Ruby on Rails support XML databases as a back end, rather than a SQL database? Surely XML databases are a much better fit for most data in web frameworks?

Posted by john at 04:23 PM | Comments (0)