Monthly Archives: October 2003

WhoWhatWhenWhere

I’ve been contemplating a commandline tool to access my new speedier database of harvested foaf info via the ‘RESTful webservices’ I provide to it. Sean Palmer made a neat, human-orientated query interface to it (FOAFQ) via a faux-Joseki interface (i.e. you append an RDF query, squish in this case, to a url and parse the result as RDF/XML).
I’ve also done a few others for my image annotation demonstrator, for example a service that provides data about an airport from a string (rather than from the IATA code, which you may not know; Jim Ley and daml.org/data provide lookups by code). Also I have one that provides images, names and mbox_sha1sums from a substring match. And so on….
I keep finding myself thinking things like, where’s a picture of so and so, what’s the wordnet definition of ‘fig’, what’s the IATA code for Paris airport, and so on. The codepiction image annotation tool web-services can be used to get this data, but the UI is all wrong. Matt Biddulph uses IRC for photo annotation, searching remote databases, because it’s fast for people who are on IRC all the time. Jo Walsh has also been investigating text-based interfaces to RDF and OWL data with mudlondon. And of course Edd Dumbill’s famous foafbot provides an IRC interface to complex information from his harvested database of foaf data.
So, an obvious next step is to provide an interface to all the RDF data I can find around the place, and IRC seems like a good candidate (but simple commandline might be better when your personal calendar is available, or access to your addressbook).
The result was WhoWhatWhenWhere. It’s a simple bot based on the excellent PircBot api, which was really nice to use. It’s essentially a text UI to a number of ‘web services’ which use urls, not SOAP in this case (but it could use SOAP; it is just easier this way).
You can ask it about wordnet (.wn fig), airport data (.airport bristol, .iata BRS), foaf knows (.knows libby miller), pictures (.pic Damian Steer), codepiction paths (.paths Dan Brickley, Frank Sinatra). More info is in the writeup.
I’m getting more and more interested in how it might be used, for example if there are very many of these services, how you pick between them, how you handle the UI. And there are already very many services to choose from, e.g. on the daml data page.
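To make the ‘text UI to web services’ idea a bit more concrete, here is a minimal sketch in Java of how a line like .wn fig could be turned into a url to fetch. It is not the bot’s actual code: the class name, the example.org urls and the parameter names are all made up for illustration, and it leaves out PircBot and the fetching and parsing of the response entirely.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: maps a bot command such as ".wn fig" to a GET url.
// The base urls below are hypothetical placeholders, not the real services.
class CommandToUrl {
    private static final Map<String, String> SERVICES = new LinkedHashMap<>();
    static {
        SERVICES.put(".wn", "http://example.org/wordnet?word=");
        SERVICES.put(".airport", "http://example.org/airport?name=");
        SERVICES.put(".iata", "http://example.org/airport?iata=");
        SERVICES.put(".knows", "http://example.org/foaf/knows?name=");
    }

    // Returns the url to fetch, or null if the line is not a known command
    // or has no argument after the command word.
    static String toUrl(String line) {
        String[] parts = line.trim().split("\\s+", 2);
        String base = SERVICES.get(parts[0]);
        if (base == null || parts.length < 2) {
            return null;
        }
        // URLEncoder.encode(String, Charset) needs Java 10 or later.
        return base + URLEncoder.encode(parts[1], StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(toUrl(".wn fig"));
        System.out.println(toUrl(".knows libby miller"));
        System.out.println(toUrl("hello there")); // not a command
    }
}
```

With this shape, adding another service would be one more entry in the map, which is roughly why plain urls feel easier than SOAP for this kind of thing.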
The bot is hanging out on irc.freenode.net in #foaf or maybe #whwhwhwh at the moment, though I’m off on holiday now and it’s not good at rejoining as yet.

RSS+events module

From Chris Heathcoate’s art rss aggregator I found an updated version of the RSS+events module. I’m really pleased to see that it’s been updated so that the event is a ‘thing’ in itself and isn’t confused with the webpage describing it, and that geo:Point has been added for geographical data. A couple of problems: I don’t think it’s legal RDF any more (the geo:Point part should use a property); and the start and enddate semantics and modelling mean that it’s not possible to roundtrip from iCalendar, which I think is a shame.
So the geo:Point part could be fixed by doing this:
<item rdf:about="http://www.oreilly.com/catalog/progxmlrpc/">
  <title>Programming Web Services with XML-RPC</title>
  <link>http://www.oreilly.com/catalog/progxmlrpc/</link>
  <ev:item>
    <rdf:Description>
      <ev:startdate>2001-06-20</ev:startdate>
      <ev:type>book release</ev:type>
      <ev:location rdf:parseType="Resource">
        <geo:lat>39.04</geo:lat>
        <geo:long>-95.69</geo:long>
      </ev:location>
    </rdf:Description>
  </ev:item>
  <dc:subject>XML-RPC</dc:subject>
  <dc:subject>Programming</dc:subject>
</item>
OR
<item rdf:about="http://www.oreilly.com/catalog/progxmlrpc/">
  <title>Programming Web Services with XML-RPC</title>
  <link>http://www.oreilly.com/catalog/progxmlrpc/</link>
  <ev:item>
    <rdf:Description>
      <ev:startdate>2001-06-20</ev:startdate>
      <ev:type>book release</ev:type>
      <ev:location>
        <geo:Point>
          <geo:lat>39.04</geo:lat>
          <geo:long>-95.69</geo:long>
        </geo:Point>
      </ev:location>
    </rdf:Description>
  </ev:item>
  <dc:subject>XML-RPC</dc:subject>
  <dc:subject>Programming</dc:subject>
</item>
Start and Enddate.
In iCalendar (RFC 2445), a time with no timezone is ‘floating’: the event starts (or ends) at the same local clock time in every timezone. So if we had <ev:startdate>2001-06-20T10:00:00</ev:startdate> that would be 10am in every timezone. With a timezone (according to W3CDTF, which writes offsets as +hh:mm or -hh:mm), we have
<ev:startdate>2001-06-20T10:00:00Z</ev:startdate> or
<ev:startdate>2001-06-20T10:00:00-05:00</ev:startdate>
i.e. at 10am in a particular timezone.
This is a less significant issue when only dates are considered and not times.
As the authors of the new version of the events module indicate, the W3CDTF version of timezones, which is just a fixed offset from UTC, is not sufficient to calculate the current time because it cannot handle the change to and from daylight saving time. This is why the iCalendar specification says that you must include a timezone identifier and a description of the timezone in the same file.
This is why in the competing RDFical we have made the dates objects, so that we can attach timezone information to them. This brings its own significant issues, but works with iCalendar so that roundtripping is possible.
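The difference between a floating time, a fixed offset and a named timezone can be seen in a few lines of Java. This is only an illustrative sketch using the standard java.time classes (Java 8 or later); it has nothing to do with how RDFical or the events module are implemented, and the class and method names are invented for the example.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.OffsetDateTime;
import java.time.ZoneId;

// Sketch only: floating times, fixed offsets and named zones side by side.
class FloatingVersusFixed {
    // The instant a floating (no timezone) time denotes for someone in the given zone.
    static Instant floatingIn(String local, String zone) {
        return LocalDateTime.parse(local).atZone(ZoneId.of(zone)).toInstant();
    }

    // The single instant denoted by a W3CDTF-style value with a fixed offset.
    static Instant fixed(String w3cdtf) {
        return OffsetDateTime.parse(w3cdtf).toInstant();
    }

    // The UTC offset in force in a named zone at a given local time.
    static String offsetIn(String local, String zone) {
        return LocalDateTime.parse(local).atZone(ZoneId.of(zone)).getOffset().getId();
    }

    public static void main(String[] args) {
        // Floating: 10am on the wall clock is a different instant in each place.
        System.out.println(floatingIn("2001-06-20T10:00:00", "Europe/London"));
        System.out.println(floatingIn("2001-06-20T10:00:00", "America/New_York"));

        // Fixed offset: one instant, the same for everybody.
        System.out.println(fixed("2001-06-20T10:00:00-05:00"));

        // Named zone: the offset itself changes with daylight saving,
        // -04:00 in June and -05:00 in January for New York.
        System.out.println(offsetIn("2001-06-20T10:00:00", "America/New_York"));
        System.out.println(offsetIn("2001-01-20T10:00:00", "America/New_York"));
    }
}
```

The last two lines are the daylight saving problem in miniature: a bare -05:00 cannot tell you that the same place is at -04:00 in summer, whereas a timezone identifier plus its rules can.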
Is roundtripping important? I think so…there are so many tools using iCalendar, including Outlook, Evolution, Mozilla Calendar, Apple iCal, and many mobile phones, that it makes sense to be able to convert back into a form that all our other devices and applications understand.

too…much…fun…

goldfish in mercury in tom quad
I’ve had a great time this week.
Thursday Craig and Liz had their leaving do before they went to Australia. Martin was there, matt, big james and kath, and Rhona.
Friday was my college’s reunion where I saw my lovely friends Anna and Rachel and Jade, Steve, John, Martin, my excellent brother in law Al, and the very fun Gaz and Raz. And some goldfish
Gaz has a very true writeup of the reunion. Like Gaz, I met (and photographed) the Lib Dem MP for Oxford West.
Without stopping for breath, on Saturday we had sean, Morten and Jim staying, and on Sunday we met with nick and dave, jeni and norm, paul, kal, for some lovely food and beer, at a meeting to plot a coup-de-main and taking the TAG hostage. Not really. Well maybe a small demo ;)
Monday brought a trip to the Prince of Wales, a man and his ferret and conkers.
Finally, Tuesday was ILRT awayday, with thinkin’, ‘fun’, and a trip to the Hope and Anchor with paul and others.
Little rest now I think.

internationalization, rdf parsers, postgres and mysql

I have spent several days in the past few weeks trying to get internationalization working with my RDF toolkit-to-be, and it should have been easy, as I do not have my own parser and was using ARP and Rio, which both have internationalization support. They did the hard bit so we don’t have to :)
Anyhow, I’m extremely chuffed that I’ve just managed to get internationalization working with the inmemory version and with postgres, which seems to just work, provided you create the database cluster with a Unicode encoding:
./initdb -E UNICODE
or to check that your database is compatible,
./pg_encoding UNICODE
./pg_encoding UTF8
Mysql seems to require 4.1, so I’ll do that another day. So, a few notes on what I did.
I was getting ??????? printed instead of any non-English characters in foaf files; Japanese, Arabic and French. I figured it was a problem with my java code because I knew that both parsers I was using were good. So I spent a lot of time doing things like this:
String lit1 = new String(((Literal) val).getLabel().getBytes("UTF8"));
before I started processing from the parser, to try and track the problem down. Java is supposed to use unicode by default so it should have been ok, but I found a bunch of examples like this, and tried it, but no dice.
Anyway, turns out it was a combination of my terminal not supporting UTF-8, my locale on my debian box not having been set up, and (I think this was the most important bit) my jsp pages not being set up to display UTF-8.
sigh.
This is a useful page about setting your locale in Debian. Part of the java tutorial on internationalization also helped me realize my terminal wasn’t displaying UTF-8 properly. Using xterm like this:
LC_CTYPE=en_GB.UTF-8 xterm
made me realize that encoding was coming through the parsers (although that command won’t display Japanese or Arabic), and focus on getting webpages to display correctly, rather than command-line tools.
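The question marks can be reproduced without any parser at all, which is a handy way to check whether it is the data or the display that is broken. The following is a small illustrative sketch, not code from the toolkit; the class and method names are invented. It shows that a Java string survives a trip through UTF-8 untouched, but turns into ? characters as soon as it is written out in an encoding that cannot represent it, which is what a misconfigured terminal or locale effectively does.

```java
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;

// Sketch only: where the question marks come from.
class QuestionMarks {
    // Encode the text to bytes and decode it again with the named charset,
    // which is roughly what writing to an output stream and reading it back does.
    static String roundTrip(String text, String charsetName) {
        Charset cs = Charset.forName(charsetName);
        return new String(text.getBytes(cs), cs);
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        String french = "caf\u00e9";                // "café"
        String japanese = "\u65e5\u672c\u8a9e";     // three Japanese characters

        // Write explicitly as UTF-8 instead of trusting the platform default.
        PrintStream out = new PrintStream(System.out, true, "UTF-8");

        // Characters US-ASCII cannot encode are replaced by '?'.
        out.println(roundTrip(french, "US-ASCII"));
        out.println(roundTrip(japanese, "US-ASCII"));

        // UTF-8 can encode everything, so nothing is lost.
        out.println(roundTrip(french, "UTF-8"));
        out.println(roundTrip(japanese, "UTF-8"));
    }
}
```

If the UTF-8 lines still show up wrong, the string itself is probably fine and the terminal (or the web page) is the thing to look at.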
For jsps you seem to need two bits of information:

at the very top of the page, and
<head><meta http-equiv="Content-Type" content="text/html; charset=utf-8">
doesn’t seem to go amiss either.
So I just tried outputting html from my tests and then tried it on jsps and then – hurrah! – it worked :)
[later, 2003-10-15]
I just found The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!) which clarifies a lot for me. Very nice.