Atom Processing with Zorba
The Atom Syndication Format (RFC 4287) is one of the most popular XML formats for aggregating XML data. Such an aggregate is called an Atom feed. Atom is heavily used in the industry for web services (e.g. Google), and numerous extensions of the specification are available. This tutorial shows how easily an Atom feed can be processed with XQuery in order to send a message on Twitter for each new Atom entry in the feed. To do this, the queries shown in this tutorial use Zorba's implementation of the EXPath http-client.
Twitter messages often contain a title, a link (in the form of a "tiny url"), and some HashTags. The figure below depicts how each of these components could be extracted from an Atom entry using XPath and what the resulting Twitter post could look like.
For instance, we would like to build and send a message on Twitter from the following Atom entry:
<?xml version="1.0" encoding="utf-8"?>
<entry xmlns="http://www.w3.org/2005/Atom">
  <title>Atom Processing with Zorba</title>
  <link href="http://zorba-xquery.com/blog/20100503" rel="self" />
  <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
  <updated>2003-12-13T18:30:02Z</updated>
  <content type="text">
    Atom feeds are very easy to process with Zorba.
    Zorba is an open source XQuery Processor written in C++.
  </content>
</entry>
In order to publish a Tweet about this entry on Twitter, we have to do the following:
- Create a "tiny url", using tinyurl.com, for the entry URL returned by atom:link[@rel="self"]/@href.
- Generate the HashTags using the Yahoo Terms Extraction REST service. This web service returns terms (i.e. key words) extracted from a larger string (called the context). To retrieve the HashTags for our Tweet, we provide the content of the entry as context: /atom:content/text().
- Send the message to Twitter.com.
Each of these steps is implemented as an XQuery function:
- local:tinyurl-create($url as xs:string): Creates a tiny URL.
- local:yahoo-terms($context as xs:string): Extracts terms from the context.
- local:tweet($message as xs:string): Sends a status update on Twitter.
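The XPath expressions mentioned above can be tried without calling any web service. The following is a minimal sketch, assuming the entry is bound to a variable and is not schema-validated; the atom prefix is declared in the query itself, and normalize-space is used here only to trim the whitespace around the content.

```xquery
declare namespace atom = "http://www.w3.org/2005/Atom";

(: A shortened copy of the example entry, constructed inline for illustration :)
let $entry :=
  <entry xmlns="http://www.w3.org/2005/Atom">
    <title>Atom Processing with Zorba</title>
    <link href="http://zorba-xquery.com/blog/20100503" rel="self" />
    <content type="text"> Atom feeds are very easy to process with Zorba. </content>
  </entry>
return (
  (: title of the future Tweet :)
  string($entry/atom:title),
  (: URL to be shortened :)
  string($entry/atom:link[@rel = "self"]/@href),
  (: context for the term extraction :)
  normalize-space($entry/atom:content)
)
```

The query returns a sequence of three strings: the title, the URL of the entry, and the trimmed content.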
Create the Tiny URL
Creating a tiny URL is simple.
It can be done by sending an HTTP GET request to http://tinyurl.com/api-create.php with the URL to shorten passed in the parameter named "url". The body of the response then contains the tiny URL. Using XQuery and the EXPath http-client library, this can be done as shown in the following code snippet.
import module namespace http-client = "http://expath.org/ns/http-client";
import schema namespace http = "http://expath.org/ns/http-client";

declare sequential function local:tinyurl-create($url as xs:string) as xs:string
{
  (: send-request returns the http:response element first, then the body;
     [2] selects the body, which holds the tiny URL :)
  http-client:send-request(validate{
    <http:request
      href="http://tinyurl.com/api-create.php?url={$url}"
      method="GET" />
  })[2]
};

local:tinyurl-create("http://www.zorba-xquery.com/")
If you run the code snippet above (you can cut/paste/execute it on Try Zorba), you will get the following result:
<?xml version="1.0" encoding="UTF-8"?> http://tinyurl.com/yc76gyh
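Because the URL to shorten travels as a query parameter, it should be percent-encoded first; the main query at the end of this tutorial does so with encode-for-uri before calling the function. As a small sketch, the request URI can be built and inspected on its own, without sending anything. The helper name local:tinyurl-request-uri is hypothetical and not part of the tutorial's code.

```xquery
(: Hypothetical helper: builds the tinyurl.com request URI for a given URL.
   encode-for-uri escapes reserved characters such as ":" and "/". :)
declare function local:tinyurl-request-uri($url as xs:string) as xs:string
{
  concat("http://tinyurl.com/api-create.php?url=", encode-for-uri($url))
};

local:tinyurl-request-uri("http://zorba-xquery.com/blog/20100503")
(: returns http://tinyurl.com/api-create.php?url=http%3A%2F%2Fzorba-xquery.com%2Fblog%2F20100503 :)
```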
Generating Hashtags
Getting the HashTags from Yahoo's Term Extraction Service is almost as easy. This can be achieved by sending an HTTP POST request to http://search.yahooapis.com/ContentAnalysisService/V1/termExtraction and providing the content in an HTTP parameter named "context". Note that requests to this service must be authenticated with the "appid" HTTP parameter, held here in the variable $local:appid. Using XQuery, this can be done with the following lines of code.
import module namespace http-client = "http://expath.org/ns/http-client";
import schema namespace http = "http://expath.org/ns/http-client";

declare variable $local:appid as xs:string := "BbqvBFfV34FE8z5sUHzcpz7Oxydp06Cz4_T5XwnBOhNUqoBdyQWklLZM_Ot8KTq67ZapwsIbpTM-";

declare sequential function local:yahoo-terms($context as xs:string) as xs:string*
{
  (: "&amp;" is the escaped form of "&", which is required inside an XQuery string literal :)
  let $uri := concat("http://search.yahooapis.com/ContentAnalysisService/V1/termExtraction?appid=", $local:appid, "&amp;context=", encode-for-uri($context))
  return
    (: [2] is the response body; each Result element holds one extracted term :)
    http-client:send-request(validate{
      <http:request href="{$uri}"
                    method="post" />
    })[2]//*:Result/text()
};

local:yahoo-terms("Atom processing with Zorba is very easy")
Executing the query above will return a sequence of two strings (you can cut/paste/execute it on Try Zorba):
<?xml version="1.0" encoding="UTF-8"?> zorba atom
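The returned terms still have to be turned into HashTags. A term extraction service may return multi-word terms, and a HashTag cannot contain spaces, so the sketch below removes whitespace inside each term before prefixing it with "#". The helper name local:hashtags and the sample term "open source" are assumptions made for illustration; the main query later in this tutorial uses a simpler inline expression.

```xquery
(: Hypothetical helper: turns a sequence of terms into a HashTag string.
   Whitespace inside a term is removed so that "open source" becomes "#opensource". :)
declare function local:hashtags($terms as xs:string*) as xs:string
{
  string-join(
    for $term in $terms
    return concat("#", replace($term, "\s+", "")),
    " ")
};

local:hashtags(("zorba", "atom", "open source"))
(: returns #zorba #atom #opensource :)
```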
Sending the Message on Twitter
Having the tiny url and the extracted terms, we can now post this information on Twitter. Doing so requires authentication. Twitter's API supports two authentication methods: HTTP basic (deprecated) and OAuth. To keep our example simple, we will use HTTP basic authentication. However, it is also possible to use the XQuery OAuth library provided with Sausalito. Sausalito is an application server that leverages Zorba as its XQuery runtime. If you are interested in using OAuth with XQuery, please check out this Sausalito OAuth screencast.
In order to send a status update on Twitter, we need to send an HTTP POST request to http://api.twitter.com/1/statuses/update.xml. The HTTP parameter status will contain the message to be posted. Again, this can be done using a few lines of XQuery code.
import module namespace http-client = "http://expath.org/ns/http-client";
import schema namespace http = "http://expath.org/ns/http-client";

declare variable $local:username := "xqueryblog";
declare variable $local:password := "1qaz2w";
declare variable $local:statuses-update := "http://api.twitter.com/1/statuses/update.xml?status=";

declare sequential function local:tweet($message)
{
  http-client:send-request(validate{
    <http:request xmlns:http="http://expath.org/ns/http-client"
                  href="{concat($local:statuses-update,
                                encode-for-uri($message))}"
                  method="POST"
                  auth-method="Basic"
                  username="{$local:username}"
                  password="{$local:password}">
      <http:header name="Expect" />
      <http:body media-type="text/plain" method="text" />
    </http:request>
  })
};

local:tweet("Hello World!")[2]//*:text/text()
Running the query above (you can cut/paste/execute it on Try Zorba) will output the new status:
<?xml version="1.0" encoding="UTF-8"?> Hello World!
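A real message combines the title, the tiny URL, and the HashTags, and it has to fit into a status update. The following sketch assumes the 140-character limit that Twitter applies to status updates and shortens only the title, so that the link and the HashTags are always kept. The helper name local:compose-message is hypothetical and is not used elsewhere in this tutorial.

```xquery
(: Hypothetical helper: builds the status text and truncates the title
   so that the whole message stays within 140 characters (assumed limit). :)
declare function local:compose-message(
  $title as xs:string,
  $tinyurl as xs:string,
  $hashtags as xs:string) as xs:string
{
  let $suffix := concat(": ", $tinyurl, " ", $hashtags)
  (: number of characters left for the title :)
  let $room := 140 - string-length($suffix)
  return concat(substring($title, 1, $room), $suffix)
};

local:compose-message("Atom Processing with Zorba",
                      "http://tinyurl.com/yc76gyh",
                      "#zorba #atom")
(: returns Atom Processing with Zorba: http://tinyurl.com/yc76gyh #zorba #atom :)
```

If the suffix alone exceeds the limit, the title is dropped entirely and the message is still too long; handling that case is left out of this sketch.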
Putting It All Together
Now that we have shown how each of the three tasks (i.e. retrieving the tiny url, extracting terms with Yahoo, and posting a message to Twitter) can be implemented in simple functions, we just have to put them together. The prolog of the main query has to import the EXPath HTTP client module and schema as well as an Atom schema.
import module namespace http-client = "http://expath.org/ns/http-client";
import schema namespace http = "http://expath.org/ns/http-client";
import schema namespace atom = "http://www.w3.org/2005/Atom";
Next, we need to set up (declare) the Twitter username/password and the Yahoo appid to authenticate with Twitter and Yahoo, respectively.
(: Twitter parameters :)
declare variable $local:username := "xqueryblog";
declare variable $local:password := "1qaz2w";
declare variable $local:statuses-update := "http://api.twitter.com/1/statuses/update.xml?status=";
(: Yahoo parameters :)
declare variable $local:appid := "BbqvBFfV34FE8z5sUHzcpz7Oxydp06Cz4_T5XwnBOhNUqoBdyQWklLZM_Ot8KTq67ZapwsIbpTM-";
Then we paste the implementation of each of the three functions described previously:
declare sequential function local:tweet($message) { ... };
declare sequential function local:yahoo-terms($context as xs:string) as xs:string* { ... };
declare sequential function local:tinyurl-create($url as xs:string) as xs:string { ... };
The main query is structured as follows: (1) extract the data from the entry; (2) create the tiny URL; (3) get the terms from the Yahoo web service; and (4) send the message on Twitter.
(: (1) Extract information out of the Atom entry :)
let $entry := validate {
  <entry xmlns="http://www.w3.org/2005/Atom">
    <title>Atom Processing with Zorba</title>
    <link href="http://zorba-xquery.com/blog/20100503" rel="self" />
    <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
    <updated>2003-12-13T18:30:02Z</updated>
    <content type="text">Atom processing with Zorba is very easy</content>
  </entry> }
let $title := $entry/atom:title/text()
let $url := encode-for-uri($entry/atom:link[@rel = "self"]/@href)
let $content := data($entry/atom:content/text())
(: (2) Create the tiny URL :)
let $tinyurl := local:tinyurl-create($url)
(: (3) Get the terms from the Yahoo web service :)
let $hashtags := string-join(for $term in local:yahoo-terms($content) return concat("#", $term), " ")
(: (4) Send the message on Twitter :)
return local:tweet(
  concat($title, ": ", $tinyurl, " ", $hashtags)
)
You can run the complete query directly within Zorba's sandbox.
Et voilà! You are now able to process Atom elements and interact with web services. In such use cases, Zorba makes your life easy: it provides an Atom XML schema and many libraries to communicate with the outside world (file system, e-mail, Excel, JSON, and many more).
Have Fun!