Displaying Guardian book reviews for quick buying on Amazon

I read the Saturday Guardian every week, and quite often buy a bunch of books reviewed in it. But equally, I don’t buy quite a lot of them as they’re only available in expensive and bulky hardback (plus I resent being market segmented like that, sorry). The Guardian’s reviews are very good but they only really review hardbacks in any depth or breadth, so it’s hit and miss whether I actually get to read any of them by the time they get to paperback. I just forget. I bet a lot of people do this.

Anyway, a couple of months ago I realised there was a Guardian content API as well as a data API. I applied for a developer key and, to my surprise, got one (the docs said they were giving out very few). This weekend I finally got around to having a play with it. It’s pretty neat. I’ve not explored it very thoroughly – I’m sure people can think of much more profound applications to make – but for book reviews there is lots of interesting data, and it’s available in JSON and XML.

My initial plan was to programmatically create an Amazon list – but this isn’t possible using the Amazon ECS API. However, it is possible to search (on books, title, and authors) and get XML back, including a link to the Amazon page that describes each result. I made a very simple page that requests the book reviews for the appropriate date and then, for each result returned, identifies the author and title and does an Amazon lookup to get the URL (I just pick the first one returned – I’m feeling lucky). It’s not as convenient as I’d hoped, but it does make it that tiny bit easier to:

  • Buy things from the list straight away
  • Put things that are only available in hardback into my wishlist so I don’t forget about them

There are a couple of issues:

  • The title and author aren’t available as separate fields in the Guardian API. Usually the linktext is very formulaic and the information can be parsed out of that, but sometimes there are non-standard items and these fail
  • Characters with accents are returned as HTML entities so those need to be swapped back to characters in order to do the Amazon search
  • There’s no data about whether the book is in paperback or not, annoyingly. Amazon seems to mostly return the paperback version first if available, but maybe this is just good luck, and it probably needs more thought
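The first two issues can be sketched roughly as below. This is not the code linked from this post, just a minimal illustration: it assumes the linktext follows a "Title by Author" pattern (my guess at the formula, not something the Guardian API documents), and `decode_entities`, `parse_linktext` and the small `NAMED_ENTITIES` table are hypothetical names invented for the example.

```ruby
# Small, deliberately incomplete table of named entities (an assumption for
# illustration; a real list would be much longer).
NAMED_ENTITIES = {
  'amp' => '&', 'quot' => '"', 'apos' => "'", 'lt' => '<', 'gt' => '>',
  'eacute' => "\u00e9", 'egrave' => "\u00e8", 'ouml' => "\u00f6",
  'uuml' => "\u00fc", 'ntilde' => "\u00f1"
}.freeze

# Swap HTML entities back to characters so the text can be used in a search.
# Unknown named entities are left untouched.
def decode_entities(text)
  text.gsub(/&(#x[0-9a-fA-F]+|#\d+|\w+);/) do
    ref = Regexp.last_match(1)
    if ref.start_with?('#x')
      [ref[2..-1].to_i(16)].pack('U')   # hexadecimal reference, e.g. &#xe9;
    elsif ref.start_with?('#')
      [ref[1..-1].to_i].pack('U')       # decimal reference, e.g. &#233;
    else
      NAMED_ENTITIES.fetch(ref) { "&#{ref};" }
    end
  end
end

# Split "Title by Author" on the LAST " by ", so titles that themselves
# contain "by" still parse. Returns [title, author], or nil for
# non-standard items that do not fit the assumed pattern.
def parse_linktext(linktext)
  match = /\A(.+)\s+by\s+(.+)\z/.match(decode_entities(linktext).strip)
  match && [match[1].strip, match[2].strip]
end
```

Returning nil for anything that does not match makes the non-standard items easy to skip (or log) instead of sending a nonsense query to Amazon.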

The result isn’t too bad though, and maybe I’ll buy a few more books. The Ruby code is here – you’ll need your own API keys for the Guardian and for Amazon though (both are free, and you can get an Amazon one just by having an account with them).

Generating specs from RDFS / OWL docs

I’ve been hacking away at danbri’s version of specgen so we can rev the foaf spec. The idea is that you take an RDFS / OWL schema and generate some human-readable HTML from it, by taking the classes and properties and writing out their basic constituents. Optionally you can add some introductory text in a template, plus some individual bits of text for each property and class, eventually in different languages too.
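To make the idea concrete, here is a toy sketch of the "write out the basic constituents" step. It is in Ruby rather than Python, it is not how specgen itself works, and it skips the RDF parsing entirely: `SAMPLE_TERMS` is a hypothetical, hand-written stand-in for classes and properties already read out of a schema, and the HTML layout is made up for the example.

```ruby
require 'cgi'

# Hypothetical terms, as if already extracted from an RDFS/OWL schema.
# The names and comments are invented, not taken from any real vocabulary.
SAMPLE_TERMS = [
  { kind: 'Property', name: 'partOf', comment: 'Relates a widget to its assembly.' },
  { kind: 'Class', name: 'Widget', comment: 'A thing that can be <assembled>.' }
].freeze

# Render one class or property as a small block of HTML, escaping the text.
def render_term(term)
  name = CGI.escapeHTML(term[:name])
  kind = CGI.escapeHTML(term[:kind])
  comment = CGI.escapeHTML(term[:comment])
  <<~HTML
    <div class="specterm" id="term_#{name}">
      <h3>#{kind}: #{name}</h3>
      <p>#{comment}</p>
    </div>
  HTML
end

# Classes first, then properties, each group sorted by name.
def render_spec(terms)
  terms.sort_by { |t| [t[:kind], t[:name]] }.map { |t| render_term(t) }.join
end
```

Calling `puts render_spec(SAMPLE_TERMS)` prints one block per term; the introductory template text and per-term prose would be slotted in around these blocks.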

I slapped in some RDFa yesterday because we needed a replacement for the ugly addition of RDF directly into the HTML, which makes it invalid. I realise some people may think this is back to front, but the foaf spec’s ‘original’ format has always been RDFS/OWL so it makes sense for us. I’m not actually sure we need two RDF versions (as there is an alternate link pointing to the RDFS/OWL version from the HTML) but heck, why not, and danbri’s consulting the community so there’s probably an argument I’ve missed.

There are several specgens available and at some point it might be nice to rationalise, or maybe go for functional equivalence. These are probably better in some senses than the one I’ve been working on, especially as I’m new to Python.

The ones I’ve found:

I think the two things that unite the first three are that they are (a) self-described hacks and (b) in Python. The foaf one uses RDFLib rather than Redland because, I believe, danbri was having trouble installing Redland on the Mac.

Next things I’d like to look at are

  • Generating specs from sample data (maybe someone’s done this already? It wouldn’t be complete but could be a start)
  • Defining application profiles or Argots and using them to generate, say, useful Sparql queries
  • Pictures!

CharBotGreen for Identica

CharBotGreen is still suspended on Twitter but fortunately she’s still announcing away on Identi.ca.

It’s trivial to move a bot from one to the other. In the source for CharBotGreen there’s a line

u = "http://twitter.com/statuses/update.json"

Using the Twitter-compatible Identica API, I can just replace that line with:

u = "http://identi.ca/api/statuses/update.json"

The only thing to watch for is that Identica stores usernames in lowercase, and authorisation fails if you don’t send the username in lowercase.

Doesn’t work in Identi.ca:

req.basic_auth 'CharBotGreen', 'sekret'

Works in Identi.ca:

req.basic_auth 'charbotgreen', 'sekret'
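Putting the two changes together, a minimal sketch of building the update request with Ruby’s standard Net::HTTP might look like this. It is not the actual CharBotGreen source: `UPDATE_URL`, `build_update_request` and `post_update` are names made up for the example, and it assumes the message is sent as a `status` form field, as in the Twitter-style update call.

```ruby
require 'net/http'
require 'uri'

# The Identica endpoint from the line above.
UPDATE_URL = 'http://identi.ca/api/statuses/update.json'

# Build (but do not send) the POST request. The username is downcased
# here so the lowercase rule cannot be forgotten at the call site.
def build_update_request(username, password, status)
  uri = URI.parse(UPDATE_URL)
  req = Net::HTTP::Post.new(uri.path)
  req.basic_auth(username.downcase, password)
  req.set_form_data('status' => status)
  req
end

# Send a previously built request and return the HTTP response.
def post_update(req)
  uri = URI.parse(UPDATE_URL)
  Net::HTTP.start(uri.host, uri.port) { |http| http.request(req) }
end
```

With this, `post_update(build_update_request('CharBotGreen', 'sekret', 'hello'))` would authenticate as `charbotgreen` whatever capitalisation is passed in.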

That’s it though – easy!