RDF Query-o-Matic Light
I slaved away this afternoon, persevering in my work in spite of numerous obstacles (sunshine, cat on lap, languor) to bring you RDF Query-o-Matic Light -- the PHP-based RDFQL machine. A grueling six or so lines of code. I sit in exhaustion on my stool, fanning myself with old green bar computer paper.
Speaking of stools, that reminds me of another nursery rhyme associated with RDF.
Little Miss Muffet, sat on a tuffet,
Eating her curds and whey;
Along came a spider,
Who sat down beside her
And frightened Miss Muffet away.
Chances are, the stool referenced in this rhyme was a three legged one, similar to the milk stools still used today. Three is the perfect number of legs for a stool: just enough legs to provide stability, but without the need for the additional material for an extraneous fourth leg.
Returning to the subject of RDF, it, like the milk stool, is based on the principle that 'three' is the magic number -- in this case three pieces of information are all that's needed in order to fully define a single bit of knowledge. With fewer than three, all you have is fact without context; with more, you're being redundant.
Of the three pieces of information, the first is the RDF subject. After all, when discussing a property such as name, it can belong to a dog, cat, book, plant, person, car, nation, or insect. To make finite an infinite universe, you must set boundaries, and that's what subject does for RDF.
The second piece of information is the predicate, more commonly thought of as the RDF 'property'. There are many facts about any individual subject; for instance, I have a sex, a height, a hair color, eye color, degree, relationships, and so on. To focus on that aspect of me that we're interested in at any one point in time, we need to specifically focus on one of my 'properties'.
If you look at the intersection of 'subject' and 'property', you'll find the final bit of information quietly waiting to be discovered -- the value of the property. X marks the spot.
I am me. I have a name (Shelley Powers). I have a height (close to six feet). I have an attitude (sweet tempered and quite easy going). Each of these bits of knowledge form a picture, and that picture is me.
All from RDF triples strung together in precise ways.
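To make the three-legged idea concrete, here is a minimal Python sketch. It is not part of either Query-o-Matic, and the person:shelley identifier and the bare property names are made up for illustration. Each bit of knowledge is a (subject, predicate, value) triple, and a value is found at the intersection of a subject and a property.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # what we are talking about
    predicate: str  # which property of the subject
    value: str      # the value of that property


# Hypothetical identifier, for illustration only
me = "person:shelley"

triples = [
    Triple(me, "name", "Shelley Powers"),
    Triple(me, "height", "close to six feet"),
    Triple(me, "attitude", "sweet tempered and quite easy going"),
]


def lookup(triples, subject, predicate):
    """Return the values found where subject and predicate intersect."""
    return [t.value for t in triples
            if t.subject == subject and t.predicate == predicate]


print(lookup(triples, me, "name"))
```

Drop any one of the three fields and the lookup stops working, which is the point of the milk stool.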
On to the new version of the RDF Query-o-Matic, the PHP-based Query-o-Matic Light. This version, like the JSP version, can apply a valid RDFQL query against a valid RDF file, printing out a target value. However, there are some minor syntactic differences between the two.
The PHP classes that provide the functionality for Light (PHP XML rdql) expect the file name and the namespace declarations inside the query itself rather than as separate elements. For instance, the following query will access titles from all elements contained within my resume.rdf file -- a file with an experimental resume RDF vocabulary:
SELECT ?b
FROM <http://weblog.burningbird.net/resume.rdf>
WHERE (?a, <bbd:title>, ?b)
USING bbd for <http://www.burningbird.net/resume_schema#>
The first line is the same SELECT clause discussed in the last RDFQL posting, but it is followed by a FROM clause, which lists the RDF file's URL within angle brackets. Next is the WHERE clause containing the query, and again, this is no different than the JSP version, except that an alias is used instead of the full namespace. The namespace itself is given in the last clause, introduced by the USING keyword.
Regardless of some syntactic differences, the query still returns the same result.
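The alias in the WHERE clause is only shorthand for the full namespace named in the USING clause. A rough Python sketch of that substitution follows. The expand function is a hypothetical helper written for this illustration, not something taken from the PHP XML classes, which may well handle aliases differently.

```python
def expand(term, prefixes):
    """Expand '<alias:local>' to a full URI using USING-style mappings.

    Terms whose alias is not declared are returned unchanged, minus the
    angle brackets, so full URIs pass straight through.
    """
    inner = term.strip("<>")
    alias, sep, local = inner.partition(":")
    if sep and alias in prefixes:
        return prefixes[alias] + local
    return inner


# Mapping taken from the USING clause of the resume query
prefixes = {"bbd": "http://www.burningbird.net/resume_schema#"}

print(expand("<bbd:title>", prefixes))
```

After expansion, the predicate in the Light query is the same full URI the JSP version would take directly, which is why both return the same result.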
Taking the Light version of Query-o-Matic out for a spin, I went looking for more complex queries, and found one in Phil's Comments RDF. Though deceptively simple looking, Phil's RDF file, and in fact any RSS 1.0 RDF file, has one nasty little complication: containers.
An RDF container is an RDF object that groups related items together, usually with some implied processing as to order. An RDF container can group ordered items (SEQ), alternative items (ALT), or just a collection of unordered items (BAG). An RDF container is also a bit of a bugger when it comes to processing or generating RDF, which is one reason containers lack popularity.
However, the key to overcoming the difficulties associated with containers is the same as the one used with RDFQL queries -- work with it one step at a time.
Container elements can be accessed individually by knowing that each item appears as the subject of a (subject, predicate, object) triple with a predicate of TYPE (http://www.w3.org/1999/02/22-rdf-syntax-ns#type, with the namespace written out in full). To access all container elements using RDFQL, you would need to have a WHERE clause similar to:
(?subject, <rdf:type>, "http://purl.org/rss/1.0/item")
This will return all container elements within the RDF document for the JSP version of Query-o-Matic, but not the Light version. The PHP version doesn't allow for literals (the "http://purl.org/rss/1.0/item" value) directly within the query triple. Instead, you use a filter, designated by the keyword AND:
WHERE (?subject, <rdf:type>, ?object)
AND ?object=="http://purl.org/rss/1.0/item"
This triple query filters the elements returned, giving us a target set of subjects that are equal to all of the container elements in the document. With Phil's comments RDF/RSS file, this is all the comments.
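In other words, the triple pattern binds every type it finds and the AND clause then throws away the rows we don't want. A small Python sketch of those two steps follows. The example.org triples are invented stand-ins for a parsed RSS 1.0 file, and this is only an illustration of the idea, not how the PHP classes are implemented.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RSS_ITEM = "http://purl.org/rss/1.0/item"

# Made-up triples standing in for a parsed RSS 1.0 file
triples = [
    ("http://example.org/feed", RDF_TYPE, "http://purl.org/rss/1.0/channel"),
    ("http://example.org/c1", RDF_TYPE, RSS_ITEM),
    ("http://example.org/c2", RDF_TYPE, RSS_ITEM),
]

# WHERE (?subject, <rdf:type>, ?object): bind both variables for every match
bindings = [{"subject": s, "object": o}
            for s, p, o in triples if p == RDF_TYPE]

# AND ?object=="http://purl.org/rss/1.0/item": drop rows that fail the filter
items = [b["subject"] for b in bindings if b["object"] == RSS_ITEM]

print(items)
```

The channel is matched by the pattern but removed by the filter, leaving only the item subjects.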
Once we have the container elements, the subject values are then passed into the next triple query, to access the DESCRIPTION property for each (the description holds the actual comment in RDF/RSS Comments). The value of the DESCRIPTION predicate is our target value, which gets printed out.
Pulling this all together, the query to access all of the actual comment text in the RDF document is:
SELECT ?desc
FROM <http://philringnalda.com/comments.rdf>
WHERE (?subject, <rdf:type>, ?object),
(?subject, <rss:description>, ?desc)
AND ?object=="http://purl.org/rss/1.0/item"
USING rdf for <http://www.w3.org/1999/02/22-rdf-syntax-ns#>,
rss for <http://purl.org/rss/1.0/>
The mapped values -- the subjects -- are the ?subject variables that appear in both triples. The subjects found in the first triple query are passed as subjects to the next.
Check out the results.
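For anyone curious how the chaining might work under the covers, here is a toy Python evaluator. It is a sketch under my own assumptions, with hypothetical match and query functions and invented example.org data, not the algorithm used by Jena or the PHP XML classes. Each pattern extends the bindings produced by the one before it, so a ?subject bound in the first triple constrains the second.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RSS = "http://purl.org/rss/1.0/"

# Made-up triples standing in for a parsed comments feed
triples = [
    ("http://example.org/feed", RDF_TYPE, RSS + "channel"),
    ("http://example.org/feed", RSS + "description", "Comments feed"),
    ("http://example.org/c1", RDF_TYPE, RSS + "item"),
    ("http://example.org/c1", RSS + "description", "First comment"),
    ("http://example.org/c2", RDF_TYPE, RSS + "item"),
    ("http://example.org/c2", RSS + "description", "Second comment"),
]


def match(pattern, triple, binding):
    """Extend binding so pattern matches triple, or return None on conflict."""
    new = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep any value it was given by an earlier pattern
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new


def query(patterns, triples):
    """Evaluate patterns in order, feeding each one's bindings to the next."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for binding in bindings:
            for triple in triples:
                result = match(pattern, triple, binding)
                if result is not None:
                    extended.append(result)
        bindings = extended
    return bindings


patterns = [
    ("?subject", RDF_TYPE, "?object"),
    ("?subject", RSS + "description", "?desc"),
]
rows = query(patterns, triples)

# AND ?object=="http://purl.org/rss/1.0/item"
comments = [r["?desc"] for r in rows if r["?object"] == RSS + "item"]
print(comments)
```

The channel has a description too, so it survives both patterns and is only removed by the final filter on ?object.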
I'm actually not fond of container elements myself, precisely because there are processing semantics integrated into the element -- a sequence is assumed to be an ordered list of items, while a bag is not. I would rather provide the information necessary to order elements -- such as date or some other characteristic -- and then let the tool creators decide how they want the elements ordered.
Regardless, the trick to working with container elements is to use the TYPE predicate to discover the container elements, pull the subject associated with each, and then use these with relatively standard RDFQL for the rest of the query.
You can use both the JSP-based Query-o-Matic and the PHP-based Query-o-Matic Light to try out different queries on whatever valid RDF documents you know of. Documentation for the RDFQL syntax used with the JSP-based version can be found here, and the RDFQL syntax for the Light version can be found here. Remember that though there are syntactic differences between the two, the actual RDFQL used in the WHERE clause is logically the same -- one or more chained triples, with the results of the first triple being passed to the second and so on.
Now that I have my query engines and can test my RDFQL, the next step is to pull these queries into an actual application, covered in the next of these essays into RDF and RDFQL.
To try the JSP Query-o-matic yourself, download and install Jena into your own environment. The actual o-matic JSP page can be downloaded here.
To try out Query-o-Matic Light, download and install the PHP XML classes. The PHP I used can be downloaded here.
Remember, these are for fun. So, have fun.
Posted by Bb at October 02, 2002 09:00 PM