Danger, Will Robinson: That Social Thing

In the comments to the last post, Julian wrote:

Like you and others, I’m extremely uncomfortable with the tendency of the FOAF insiders to publish details in their FOAF files about people they “know”. At the most trivial level this is full, plain text email addresses. Unfortunately there’s very little support either in the standard or in the real life data for the author of a particular FOAF file to say which elements they are responsible for and which elements are hearsay that they believe.

To address concerns, Libby, a FOAF co-creator, wrote:

- all (or most) RDF databases store where they got the data from – this was something immediately recognised as essential, certainly within FOAF; it’s vital from a practical point of view to trace problems, update data etc.
- RDF is designed to say anything about anything: that’s what it’s there for, and that’s what FOAF uses it for. Issues about privacy and so on are very similar to privacy issues with webpages: don’t give out information that’s private….it’s a social thing, not really a technology thing.

Dan Brickley, another FOAF co-creator, wrote:

One of the reasons it has taken us 3+ years to get FOAF to where it is now is that we tried to build FOAF apps on earlier, less mature RDF tools, and in doing so learned about some of the ways the social and technical interact around RDF. On the Web, and on the Semantic Web, keeping track of who-said-what is critical. You can’t sensibly deal with aggregates of RDF/XML without doing so.
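To make the who-said-what point concrete, here is a minimal sketch of a store that keeps the source document alongside every statement. It is not how any particular RDF toolkit does it; the `QuadStore` class, its method names, and the example.org URLs are all made up for illustration.

```python
from collections import namedtuple

# A statement plus the document it was read from (a "quad").
Quad = namedtuple("Quad", "subject predicate obj source")


class QuadStore:
    """Tiny in-memory store that remembers who said what (hypothetical)."""

    def __init__(self):
        self.quads = set()

    def load(self, source, triples):
        # Tag every incoming triple with the file it came from.
        for s, p, o in triples:
            self.quads.add(Quad(s, p, o, source))

    def said_by(self, source):
        # Everything a single source asserted.
        return {q for q in self.quads if q.source == source}

    def forget(self, source):
        # Dropping a source removes only the statements that source made.
        self.quads -= self.said_by(source)


store = QuadStore()
store.load("http://example.org/alice/foaf.rdf", [
    ("alice", "foaf:name", "Alice"),
    ("alice", "foaf:knows", "bob"),
])
store.load("http://example.org/carol/foaf.rdf", [
    ("carol", "foaf:knows", "bob"),
])

# Alice withdraws her file; Carol's statement about bob is untouched.
store.forget("http://example.org/alice/foaf.rdf")
remaining = sorted(store.quads)
```

Note what this buys you and what it doesn't: the aggregator can trace or drop everything one file said, but whatever a second file says about the same person stays in the store.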

FOAF isn’t the first RDF vocabulary, nor is it the one most used — I would say RSS 1.0 has that honor. However, FOAF does drive out the social issues associated with RDF vocabularies, and their open extensibility and ease of access, because of the type of data organized — personal information and information about associations with other people.

At issue are three things:

1. Person is the resource being described, but Person is also the object of the property ‘knows’. The use of Person in both regards means that information about a person can enter from two directions — directly from one’s FOAF file, and indirectly from additional information entered by another individual.

2. The source of the sub-graph, which is what each individual FOAF file is, should be maintained, but isn’t always. Even if it is, though, there’s no mechanism in place to remove data from a FOAF database if it’s true.

3. Even though the source of a sub-graph is maintained, general queries most likely won’t directly access it, being more interested in properties and their values than in the source of the data.
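The first and third points can be sketched in a few lines of Python. Everything here is hypothetical: the `person:` identifiers, the example.org files, and the two query functions are stand-ins I invented, not anything from a real FOAF aggregator.

```python
# Each statement is (subject, predicate, object, source_file).
quads = [
    # What the person says about themselves, in their own file.
    ("person:pat", "foaf:name", "Pat", "http://example.org/pat/foaf.rdf"),
    # What someone else says about the same person, in a different file.
    ("person:sam", "foaf:knows", "person:pat", "http://example.org/sam/foaf.rdf"),
    ("person:pat", "foaf:interest", "kimonos", "http://example.org/sam/foaf.rdf"),
]


def describe(subject):
    """A typical query: properties and values only, source dropped."""
    return sorted((p, o) for s, p, o, _source in quads if s == subject)


def describe_with_source(subject):
    """The same query, keeping track of who made each statement."""
    return sorted((p, o, src) for s, p, o, src in quads if s == subject)


merged = describe("person:pat")
attributed = describe_with_source("person:pat")

# Only the rows that came from Pat's own file.
self_described = [
    row for row in attributed
    if row[2] == "http://example.org/pat/foaf.rdf"
]
```

In `merged`, the name Pat published and the interest Sam attributed to Pat sit side by side with nothing to tell them apart. The source is still in the data, but only a query that asks for it, like `describe_with_source`, will ever show it.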

Let’s explore these with a scenario:

As a demonstration, I picked three people who I felt would most likely not be interested in having a FOAF profile, and would most likely, at a guess, dislike the concepts behind one: Jonathon Delacour, Loren Webster, and Joseph Duemer.

(I apologize most sincerely to all three gentlemen if I was incorrect in my assessment, and for using them as demonstration subjects.)

Since none of the three has a FOAF file, and I want to add them as associations to my FOAF file, I thought I would extend the information I record about them in my file. In addition to their encrypted email address, name, and weblog URL, I’m also going to record their Myers-Briggs test results and their interests.
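Here is roughly what that amounts to, as a sketch in Python rather than RDF/XML. The property names follow the FOAF vocabulary as I understand it (`foaf:mbox_sha1sum` is the SHA-1 hash of the full mailto: URI), but the helper functions, the blank-node labels, and the example person are all invented, and the interests are plain strings only to keep the example short.

```python
import hashlib


def mbox_sha1sum(email):
    # Hash the full mailto: URI so the address is not published in plain text.
    return hashlib.sha1(("mailto:" + email).encode("utf-8")).hexdigest()


def describe_acquaintance(me, name, email, weblog, myers_briggs, interests):
    """Return the triples my file would assert about someone else."""
    # Hypothetical blank-node label derived from the name.
    person = "_:" + name.lower().replace(" ", "_")
    triples = [
        (me, "foaf:knows", person),
        (person, "foaf:name", name),
        (person, "foaf:mbox_sha1sum", mbox_sha1sum(email)),
        (person, "foaf:weblog", weblog),
        (person, "foaf:myersBriggs", myers_briggs),
    ]
    # One statement per interest, each one my interpretation, not theirs.
    triples += [(person, "foaf:interest", topic) for topic in interests]
    return triples


# Placeholder data only; no real person or address is described here.
triples = describe_acquaintance(
    me="_:me",
    name="Example Person",
    email="someone@example.org",
    weblog="http://example.org/weblog",
    myers_briggs="INFJ",
    interests=["Japan", "kimonos"],
)
```

Only the first triple has me as its subject. Every other statement is about the other person, and nothing in the statements themselves says that I was the one who wrote them.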

Let’s start with Jonathon. First of all, he once wrote about Myers-Briggs at his weblog, so I know he’s an INFJ. His weblog address is http://weblog.delacour.net, which is easy, but his interests are going to require a little interpretation.

Jonathon writes about Japan and Japanese culture extensively, so Japan is one interest. He also recently wrote a lovely note about the beauty of Japanese kimonos, so we’ll put kimonos as another interest.

He talks about World War II and has written about firebombing. Then there’s the essay destined to go down into weblogging legend, where Jonathon writes about an event apres lovemaking that may or may not be factual, but is true — so we’ll add sex as an interest, and for good measure, we’ll throw in fiction, too.

He’s talked a lot about Bush, so we’ll add Bush as an interest. And we wouldn’t be complete without also throwing in

So far, Jonathon’s interests are:

George Bush
Tim Tams

And since all of these are pulled from publicly accessible pages, I’m not exposing anything that hasn’t been exposed publicly before — in a different context.

Now, Loren’s turn.

Without context, these statements are meaningless. At worst, they’re misleading; at best, they’re meaningless. Context. Context. Context. I know this is a word that Tim Berners-Lee dislikes and thinks is overused and misunderstood, but I think I have a pretty good idea of what it is and I bet other people do, too.

Context. Context. Context. Without context, statements modeled in RDF are nothing more than bits of data. Nice, but not the Semantic Web. Without context, there is no meaning, there is no trust, and there is no semantics.

Context. Get used to this word because by throwing out context as a consideration in RDF and RDF/XML, you’ve thrown out the best part of the model.
