In the XQuery specification, XQuery is described as a language capable of expressing queries on XML data. This makes XQuery the perfect choice for a data processing language on the Web because XML is widely used there. However, there are also some other data formats present on the Web. For example, JSON (JavaScript Object Notation) is the most common data format for applications which are written in JavaScript.
Zorba implements a set of functions that let XQuery developers process other data formats, such as JSON. In this document, we describe these functions in detail.
JSON is a lightweight hierarchical data-interchange format. Like XML, it is easy for humans to read and write. Moreover, it is easy for machines to parse and generate.
In order to process JSON with XQuery, Zorba implements a mapping between JSON and XML that was proposed by John Snelson in his article Parsing JSON into XQuery.
In this article, he describes the following recursive mapping declarations.
JSON | type(JSON) | toXML(JSON) |
JSON | N/A | <json type="type(JSON)">toXML(JSON)</json> |
{ "key1": value1, "key2": value2 } | object | <pair name="key1" type="type(value1)">toXML(value1)</pair><pair name="key2" type="type(value2)">toXML(value2)</pair> |
[ value1, value2 ] | array | <item type="type(value1)">toXML(value1)</item><item type="type(value2)">toXML(value2)</item> |
"value" | string | value |
number | number | number |
true / false | boolean | true / false |
null | null | empty |
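The recursive mapping in the table above can be sketched in a few lines of Python. This is an illustration only, not Zorba's implementation: the names json_type, to_xml, and parse are hypothetical helpers introduced here, numbers are formatted the way Python formats them, and empty content is written as an explicit open/close tag pair rather than a self-closing tag.

```python
import json
from xml.sax.saxutils import escape, quoteattr


def json_type(value):
    """Return the type(JSON) column of the mapping for a decoded JSON value."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # checked before numbers: bool is an int subclass
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    raise TypeError(f"not a JSON value: {value!r}")


def to_xml(value):
    """Return the toXML(JSON) column of the mapping as XML text."""
    kind = json_type(value)
    if kind == "object":
        return "".join(
            f"<pair name={quoteattr(key)} type={quoteattr(json_type(item))}>"
            f"{to_xml(item)}</pair>"
            for key, item in value.items()
        )
    if kind == "array":
        return "".join(
            f"<item type={quoteattr(json_type(item))}>{to_xml(item)}</item>"
            for item in value
        )
    if kind == "null":
        return ""  # null maps to empty content
    if kind == "boolean":
        return "true" if value else "false"
    if kind == "number":
        return str(value)  # assumption: Python's number formatting
    return escape(value)  # string: the value itself, XML-escaped


def parse(text):
    """Wrap the mapped content in the top-level <json> element."""
    value = json.loads(text)
    return f'<json type="{json_type(value)}">{to_xml(value)}</json>'


print(parse('{"firstName": "John", "numbers": [1, 2], "state": null}'))
```

The type attribute is what makes the reverse direction possible: without it, the text "true" could not be told apart from the boolean true.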
Zorba implements this mapping in two functions. The parse function parses a sequence of JSON strings into a sequence of elements; the serialize function implements the reverse process, i.e. it serializes a sequence of elements into a sequence of valid JSON strings. In the following, we describe these functions and give some examples.
The parse and serialize functions are available in the JSON module with URI "http://www.zorba-xquery.com/zorba/json-functions". In order to use this functionality, you have to import this module in the prolog of your XQuery module as follows:
import module namespace zorba-json = "http://www.zorba-xquery.com/zorba/json-functions";
The parse function can be used for parsing a sequence of valid JSON strings into a sequence of XDM elements.
declare function zorba-json:parse($text as xs:string*) as xs:element*
The function raises the API0060 error if any of the strings passed as a parameter is not valid JSON (see Appendix A: Error codes).
The following XQuery (taken from John Snelson's article mentioned above) demonstrates the usage of the parse function.
import module namespace json = "http://www.zorba-xquery.com/zorba/json-functions";
json:parse((
  '{ "firstName": "John",
     "lastName": "Smith",
     "address": { "streetAddress": "21 2nd Street",
                  "city": "New York",
                  "state": "NY",
                  "postalCode": 10021 },
     "phoneNumbers": [ "212 732-1234", "646 123-4567" ] }',
  '{ "firstName": "John",
     "state": null,
     "bool": true,
     "numbers": [1,2,3],
     "address": { "streetAddress": "21 2nd Street",
                  "state": null,
                  "postalCode": 10021,
                  "literals": [true,false,null],
                  "delivery": { "streetAddress": "StreetName",
                                "city": "CityName",
                                "state": "StateName" } },
     "strings": [ "one", "two", "three", "four" ] }'
))
Executing this query and serializing its result as XML yields the following:
<json type="object">
  <pair name="firstName" type="string">John</pair>
  <pair name="lastName" type="string">Smith</pair>
  <pair name="address" type="object">
    <pair name="streetAddress" type="string">21 2nd Street</pair>
    <pair name="city" type="string">New York</pair>
    <pair name="state" type="string">NY</pair>
    <pair name="postalCode" type="number">10021</pair>
  </pair>
  <pair name="phoneNumbers" type="array">
    <item type="string">212 732-1234</item>
    <item type="string">646 123-4567</item>
  </pair>
</json>
<json type="object">
  <pair name="firstName" type="string">John</pair>
  <pair name="state" type="null"/>
  <pair name="bool" type="boolean">true</pair>
  <pair name="numbers" type="array">
    <item type="number">1</item>
    <item type="number">2</item>
    <item type="number">3</item>
  </pair>
  <pair name="address" type="object">
    <pair name="streetAddress" type="string">21 2nd Street</pair>
    <pair name="state" type="null"/>
    <pair name="postalCode" type="number">10021</pair>
    <pair name="literals" type="array">
      <item type="boolean">true</item>
      <item type="boolean">false</item>
      <item type="null"/>
    </pair>
    <pair name="delivery" type="object">
      <pair name="streetAddress" type="string">StreetName</pair>
      <pair name="city" type="string">CityName</pair>
      <pair name="state" type="string">StateName</pair>
    </pair>
  </pair>
  <pair name="strings" type="array">
    <item type="string">one</item>
    <item type="string">two</item>
    <item type="string">three</item>
    <item type="string">four</item>
  </pair>
</json>
The serialize function takes a sequence of elements as a parameter and transforms each element into a valid JSON string according to the mapping depicted above. The function declaration is as follows:
declare function zorba-json:serialize($xml as xs:element*) as xs:string*
There are two error scenarios:
(1) If the passed elements do not have a valid JSON structure, the API0061 error is raised.
(2) If the passed parameter is not an element, the API0062 error is raised (see also Appendix A: Error codes).
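The reverse direction of the mapping can be sketched as follows. This is a Python illustration under stated assumptions, not Zorba's serializer: from_xml and serialize are hypothetical names, the input is XML text rather than XDM elements, and a plain ValueError stands in for the API0061 error.

```python
import json
import xml.etree.ElementTree as ET


def from_xml(element):
    """Rebuild the JSON value described by a <json>, <pair>, or <item> element."""
    kind = element.get("type")
    text = element.text or ""
    if kind == "object":
        return {pair.get("name"): from_xml(pair) for pair in element.findall("pair")}
    if kind == "array":
        return [from_xml(item) for item in element.findall("item")]
    if kind == "string":
        return text
    if kind == "number":
        # Keep integers as integers; anything with a fraction or exponent is a float.
        return float(text) if any(c in text for c in ".eE") else int(text)
    if kind == "boolean":
        if text not in ("true", "false"):
            raise ValueError(f"not a JSON boolean: {text!r}")
        return text == "true"
    if kind == "null":
        return None
    raise ValueError(f"not a JSON element: type={kind!r}")


def serialize(xml_text):
    """Turn one <json> element (given as XML text) into a JSON string."""
    root = ET.fromstring(xml_text)
    if root.tag != "json":
        # Stand-in for the API0061 case: the element has no valid JSON structure.
        raise ValueError("this is not a JSON element")
    return json.dumps(from_xml(root))


print(serialize(
    "<json type='object'><pair name='firstName' type='string'>John</pair></json>"
))
```

Because every element carries its type attribute, the sketch can restore numbers, booleans, and null exactly instead of emitting them as strings.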
In the following, we demonstrate the use of the JSON serialize function.
import module namespace json = "http://www.zorba-xquery.com/zorba/json-functions";
declare variable $str as xs:string := '{"firstName": "John","state": null,"bool": true,"numbers": [1,2,3],"address": {"streetAddress": "21 2nd Street","state": null,"postalCode": 10021,"literals": [true,false,null],"delivery": {"streetAddress": "StreetName","city": "CityName","state": "StateName"}},"strings": ["one","two","three","four"]}';
json:serialize(json:parse($str)) eq $str
The JSON serialization process is also implemented in the serializer component of Zorba, which is exposed through the external C/C++ API. This is similar to using any of the serialization methods defined in the "XSLT 2.0 and XQuery 1.0 Serialization" specification at http://www.w3.org/TR/xslt-xquery-serialization/. This method can be triggered by the Zorba command line utility or by any of the Zorba programming APIs.
In the following three examples, we use the Zorba command line utility to serialize the result of a query and generate JSON by using the --serialization-parameter (-z) option.
In contrast to the serialize function described above, the JSON serializer has to be passed a single element node adhering to the structure presented above (see 2 JSON). This is because a valid JSON document is required to have a single root object. If a query returns a sequence with more than one item, the API0066 error is raised (see Appendix A: Error codes and Example 3 below).
Example 1:
zorba -q "<json type='object'><pair name='firstName' type='string'>John</pair></json>" -z method=json
Result of Example 1:
{"firstName": "John"}
Example 2:
zorba -q "<ul>1</ul>" -z method=json
Output of Example 2:
[API0061] Could not serialize element with string representation {1}. Error: {This is not a Json element.}
Example 3:
zorba -q "(<json type='object'><pair name='firstName' type='string'>John</pair></json>, 2)" -z method=json
Output of Example 3:
{"firstName": "John"}
[API0066] Cannot serialize a sequence if 'json' or 'jsonml' method was selected.
JsonML (JSON Markup Language) is an application of the JSON (JavaScript Object Notation) format. The purpose of JsonML is to provide a compact format for transporting XML-based markup as JSON. In contrast to the JSON mapping depicted above (see 2 JSON), JsonML allows a lossless conversion back and forth.
Zorba implements the JsonML structure defined at http://www.ibm.com/developerworks/library/x-jsonml/#N10138. Analogously to the JSON conversion, this functionality is implemented in a parse function and a serialize function.
More details about JsonML can be found in the following article: Get to know JsonML.
The parse and serialize functions are available in the JsonML module with URI "http://www.zorba-xquery.com/zorba/json-ml-functions".
import module namespace zorba-json-ml = "http://www.zorba-xquery.com/zorba/json-ml-functions";
The parse function can be used to parse a sequence of valid JsonML strings into XDM elements. It is declared as follows:
declare function zorba-json-ml:parse($text as xs:string*) as xs:element*
The API0063 error is raised if one of the strings passed as a parameter is not valid JsonML (also see Appendix A: Error codes).
import module namespace jsonml = "http://www.zorba-xquery.com/zorba/json-ml-functions";
jsonml:parse((
  '[ "ul",
     [ "li", true],
     [ "li", {"href":"driving.html", "title":"Driving"}, "Second item"],
     [ "li", null],
     [ "li", -14e12] ]',
  '["table", {"class":"maintable"},
     ["tr", {"class":"odd"},
       ["th", {}, "Situation"],
       ["th", "Result"]],
     ["tr", {"class":"even"},
       ["td", ["a", {"href":"driving.html", "title":"Driving"}, "Driving"]],
       ["td", "Busy"] ] ]'
))
<ul>
  <li>true</li>
  <li href="driving.html" title="Driving">Second item</li>
  <li/>
  <li>-1.4E+13</li>
</ul>
<table class="maintable">
  <tr class="odd">
    <th>Situation</th>
    <th>Result</th>
  </tr>
  <tr class="even">
    <td><a href="driving.html" title="Driving">Driving</a></td>
    <td>Busy</td>
  </tr>
</table>
The serialize function takes a sequence of elements and transforms them into a sequence of JsonML strings, one string per element.
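The JsonML structure used in this example (a tag name, an optional attributes object, then the children) can be walked with a short recursive function. The following Python sketch is only an illustration of that structure, not Zorba's parser: jsonml_to_xml and parse are hypothetical names, the result is XML text rather than XDM elements, and numbers are formatted by Python, which may differ from Zorba's formatting.

```python
import json
from xml.sax.saxutils import escape, quoteattr


def jsonml_to_xml(node):
    """Render a decoded JsonML node as XML text."""
    if node is None:
        return ""  # null contributes no content
    if isinstance(node, bool):  # checked before numbers: bool is an int subclass
        return "true" if node else "false"
    if isinstance(node, (int, float)):
        return str(node)  # assumption: Python's number formatting
    if isinstance(node, str):
        return escape(node)
    if not isinstance(node, list) or not node or not isinstance(node[0], str):
        raise ValueError("a JsonML element must be an array starting with a tag name")

    tag, rest = node[0], node[1:]
    attributes = {}
    if rest and isinstance(rest[0], dict):
        # An object directly after the tag name holds the attributes.
        attributes, rest = rest[0], rest[1:]

    attrs = "".join(
        f" {name}={quoteattr(str(value))}" for name, value in attributes.items()
    )
    content = "".join(jsonml_to_xml(child) for child in rest)
    if content:
        return f"<{tag}{attrs}>{content}</{tag}>"
    return f"<{tag}{attrs}/>"


def parse(text):
    """Parse one JsonML string and return the corresponding XML text."""
    return jsonml_to_xml(json.loads(text))


print(parse('["ul", ["li", true], ["li", {"href": "driving.html"}, "Second item"], ["li", null]]'))
```

Note that true, null, and numbers all end up as plain text content: the XML side has no place to record their original JSON type.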
declare function zorba-json-ml:serialize($xml as xs:element*) as xs:string*
If one of the passed elements does not have a valid JsonML structure, the API0064 error is raised. If the passed parameter is not an element, the API0065 error is raised.
import module namespace jsonml = "http://www.zorba-xquery.com/zorba/json-ml-functions";
jsonml:serialize((
  <ul><li>true</li><li href="driving.html" title="Driving">Second item</li><li/><li>-1.4E+13</li></ul>,
  <table class="maintable"><tr class="odd"><th>Situation</th><th>Result</th></tr><tr class="even"><td><a href="driving.html" title="Driving">Driving</a></td><td>Busy</td></tr></table>
))
["ul", ["li", "true"], ["li",{"href":"driving.html"}, {"title":"Driving"}, "Second item"], ["li"], ["li", "-1.4E+13"] ]
["table", {"class":"maintable"}, ["tr", {"class":"odd"}, ["th", "Situation"], ["th", "Result"] ], ["tr", {"class":"even"}, ["td", ["a", {"href":"driving.html"}, {"title":"Driving"}, "Driving"]], ["td", "Busy"] ] ]
Calling zorba-json-ml:serialize(zorba-json-ml:parse($some_string)) will not always produce $some_string.
This is because jsonml:serialize does not know the original type of a JSON value (true, false, null, or a number) and treats all of them as strings.
Here are some possible cases where this can happen:
import module namespace jsonml = "http://www.zorba-xquery.com/zorba/json-ml-functions";
jsonml:serialize(jsonml:parse(('[ "ul", [ "li", true], [ "li", null], [ "li", -14e12] ]')))
will output the following result:
["ul", ["li", "true"], ["li"], ["li", "-1.4E+13"] ]
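The type loss shown in this result can be reproduced with a small sketch of the XML-to-JsonML direction. It is written in Python for illustration and is not Zorba's serializer: xml_to_jsonml and serialize are hypothetical names, the input is XML text, all attributes of an element are grouped into a single object as the JsonML structure describes, and whitespace-only text is skipped.

```python
import json
import xml.etree.ElementTree as ET


def xml_to_jsonml(element):
    """Convert an element into a JsonML array: [tag, {attributes}?, children...]."""
    node = [element.tag]
    if element.attrib:
        node.append(dict(element.attrib))
    if element.text and element.text.strip():
        # Text content is always emitted as a string: the XML carries no
        # information about whether it was once true, null, or a number.
        node.append(element.text)
    for child in element:
        node.append(xml_to_jsonml(child))
        if child.tail and child.tail.strip():
            node.append(child.tail)  # text that follows a child element
    return node


def serialize(xml_text):
    """Serialize one element (given as XML text) into a JsonML string."""
    return json.dumps(xml_to_jsonml(ET.fromstring(xml_text)))


print(serialize("<ul><li>true</li><li/><li>-1.4E+13</li></ul>"))
```

Here the boolean comes back as the string "true", the null as an empty element, and the number as the string "-1.4E+13", which is why the round trip does not reproduce the input.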
The JsonML serialization functionality is also implemented in the serializer component of Zorba.
In the following, we give some examples that demonstrate this with the zorba command line utility and its --serialization-parameter (-z) option.
Please note that the result of the query has to be a sequence with exactly one element.
If you pass a sequence with more than one item, only the first item in the sequence is processed and then the API0066 error is raised (see Appendix A: Error codes and Example 3 below).
Example 1:
zorba -q "<ul>1</ul>" -z method=jsonml
Output of Example 1:
["ul", "1"]
Example 2:
zorba -q "<?pi content?>" -z method=jsonml
Output of Example 2:
[API0064] Could not serialize element with string representation {content}. Error: {This is not a JsonML element.}
Example 3:
zorba -q "(<ul>1</ul>, <?pi content?>)" -z method=jsonml
Output of Example 3:
["ul", "1"]
[API0066] Cannot serialize a sequence if 'json' or 'jsonml' method was selected.
API0060 - API0060_CONV_JSON_PARSE - is raised if the string could not be parsed.
API0061 - API0061_CONV_JSON_SERIALIZE - is raised if the element could not be serialized.
API0062 - API0062_CONV_JSON_PARAM - is raised if the passed parameter is not an element.
API0063 - API0063_CONV_JSON_ML_PARSE - is raised if the string could not be parsed.
API0064 - API0064_CONV_JSON_ML_SERIALIZE - is raised if the element could not be serialized.
API0065 - API0065_CONV_JSON_ML_PARAM - is raised if the passed parameter is not an element.
API0066 - API0066_JSON_SEQUENCE_CANNOT_BE_SERIALIZED - is raised if a sequence with more than one item is serialized while the 'json' or 'jsonml' method is selected.