PlanB

Some FOAF stats

September 8, 2009 · Leave a Comment

Some FOAF stats from Sindice for something I had to write last week.

All classes

“Agent”, 3.84 million
“Document”, 6.15 million
“Group”, 5.78 thousand
“Image”, 711.23 thousand
“OnlineAccount”, 15.47 thousand
“OnlineChatAccount”, found 324
“OnlineEcommerceAccount”, found 242
“OnlineGamingAccount”, found 240
“Organization”, 10.05 thousand
“Person”, 2.64 million
“PersonalProfileDocument”, 11.7 thousand
“Project”, found 726

All properties

“accountName”, 8.02 thousand
“accountServiceHomepage”, 7.24 thousand
“aimChatID”, 9.54 thousand
“based_near”, 7.35 thousand
“birthday”, 2.48 thousand
“currentProject”, found 648
“depiction”, 696.31 thousand
“depicts”, 617.16 thousand
“dnaChecksum”, found 65
“family_name”, 2.46 thousand
“firstName”, 4.2 thousand
“fundedBy”, found 237
“geekcode”, found 107
“gender”, 15.8 thousand
“givenname”, 24.17 thousand
“holdsAccount”, 9.88 thousand
“homepage”, 1.22 million
“icqChatID”, 22.8 thousand
“img”, 684.38 thousand
“interest”, 64.77 thousand
“isPrimaryTopicOf”, 1.54 million
“jabberID”, 2.98 thousand
“knows”, 1.08 million
“logo”, found 374
“made”, 1.97 million
“maker”, 1.97 million
“mbox”, 3.7 thousand
“mbox_sha1sum”, 43.9 thousand
“member”, 5.53 thousand
“membershipClass”, found 58
“msnChatID”, 7.68 thousand
“myersBriggs”, found 154
“name”, 1.77 million
“nick”, 96.7 thousand
“openid”, 80.24 thousand
“page”, 5.84 million
“pastProject”, found 179
“phone”, found 999
“plan”, found 139
“primaryTopic”, 278.11 thousand
“publications”, found 202
“schoolHomepage”, found 644
“sha1”, found 60
“surname”, 25.32 thousand
“theme”, found 282
“thumbnail”, 2.51 thousand
“tipjar”, found 73
“title”, 2.02 thousand
“topic”, 3.13 million
“topic_interest”, found 90
“weblog”, 300.06 thousand
“workInfoHomepage”, found 505
“workplaceHomepage”, 1.68 thousand
“yahooChatID”, 6.72 thousand

Categories: Uncategorized

Displaying Guardian book reviews for quick buying on Amazon

June 28, 2009 · 2 Comments

I read the Saturday Guardian every week, and quite often buy a bunch of books reviewed in it. But equally, I don’t buy quite a lot of them as they’re only available in expensive and bulky hardback (plus I resent being market segmented like that, sorry). The Guardian’s reviews are very good but they only really review hardbacks in any depth or breadth, so it’s hit and miss whether I actually get to read any of them by the time they get to paperback. I just forget. I bet a lot of people do this.

Anyway, a couple of months ago I realised there was a Guardian content API as well as a data API. I applied for a developer key and, to my surprise, got one (the docs said they were giving out very few). This weekend I finally got around to having a play with it. It’s pretty neat. I’ve not explored it very thoroughly – I’m sure people can think of much more profound applications to make – but for book reviews there is lots of interesting data, and it’s available in JSON and XML.

My initial plan was to programmatically create an Amazon list – but this isn’t possible using the Amazon ECS API. However, it is possible to search (on books, title, and authors) and get XML back, including a link to the Amazon page that describes the book. I made a very simple page that requests the book reviews for the appropriate date and then, for each result returned, identifies the author and title and does an Amazon lookup to get the URL (I just pick the first one returned – I’m feeling lucky). It’s not as convenient as I’d hoped, but it does make it that tiny bit easier to:

  • Buy things from the list straight away
  • Put things that are only available in hardback into my wishlist so I don’t forget about them

There are a couple of issues:

  • The title and author aren’t available as separate fields in the Guardian API. Usually the linktext is very formulaic and the information can be parsed out of that, but sometimes there are non-standard items and these fail
  • Characters with accents are returned as HTML entities so those need to be swapped back to characters in order to do the Amazon search
  • There’s no data about whether the book is in paperback or not, annoyingly. Amazon seems to mostly return the paperback version first if available, but maybe this is just good luck, and it probably needs more thought
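The first two issues can be sketched roughly as below. This is only an illustration, not the code linked from this post: it assumes the linktext looks like "Title by Author" (the real Guardian format may differ), and the method names and the small entity table are made up for the example.

```ruby
# Hypothetical table of named entities; only a few accented characters
# are listed here, extend it as needed.
NAMED_ENTITIES = {
  'eacute' => 'é', 'egrave' => 'è', 'aacute' => 'á',
  'ouml' => 'ö', 'uuml' => 'ü', 'ccedil' => 'ç', 'amp' => '&'
}.freeze

# Swap HTML entities (decimal, hex or named) back to plain characters.
# Entities that are not in the table are left untouched.
def decode_entities(text)
  text.gsub(/&(#\d+|#[xX][0-9a-fA-F]+|[a-zA-Z]\w*);/) do
    ref = Regexp.last_match(1)
    if ref =~ /\A#[xX]/
      [ref[2..-1].hex].pack('U')
    elsif ref.start_with?('#')
      [ref[1..-1].to_i].pack('U')
    else
      NAMED_ENTITIES.fetch(ref) { "&#{ref};" }
    end
  end
end

# Split link text of the assumed form "Title by Author" into its parts.
# Returns nil for non-standard items so the caller can skip them.
def parse_link_text(link_text)
  match = decode_entities(link_text).match(/\A(.+) by (.+)\z/)
  return nil unless match
  { title: match[1].strip, author: match[2].strip }
end
```

The greedy match splits on the last " by ", so a title that itself contains "by" still comes out in one piece. Anything that doesn’t fit the pattern returns nil rather than producing a bad Amazon search.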

The result isn’t too bad though and maybe I’ll buy a few more books. The Ruby code is here – you’ll need your own API keys for the Guardian and for Amazon though (they are both free and you can just get an Amazon one if you have an account with them).

Categories: books

Generating specs from RDFS / OWL docs

June 6, 2009 · 3 Comments

I’ve been hacking away at danbri’s version of specgen so we can rev the foaf spec. The idea is that you take an RDFS / OWL schema and generate some human-readable HTML from it, by taking the classes and properties and writing out their basic constituents. Optionally you can add some introductory text in a template, plus some individual bits of text for each property and class, eventually in different languages too.

I slapped in some RDFa yesterday because we needed a replacement for the ugly addition of RDF directly into the HTML, which makes it invalid. I realise some people may think this is back to front, but the foaf spec’s ‘original’ format has always been RDFS/OWL so it makes sense for us. I’m not actually sure we need two RDF versions (as there is an alternate link pointing to the RDFS/OWL version from the HTML) but heck why not, and danbri’s consulting the community so there’s probably an argument I’ve missed.

There are several specgens available and at some point it might be nice to rationalise, or maybe go for functional equivalence. These are probably better in some senses than the one I’ve been working on, especially as I’m new to Python.

The ones I’ve found:

I think the two things that unite the first three are that they are (a) self-described hacks and (b) in Python. The Foaf one uses RDFlib rather than Redland because danbri was having trouble with Redland installation on the Mac, I believe.

Next things I’d like to look at are

  • Generating specs from sample data (maybe someone’s done this already? It wouldn’t be complete but could be a start)
  • Defining application profiles or Argots and using them to generate, say, useful Sparql queries
  • Pictures!

Categories: foaf · rdf

CharBotGreen for Identica

June 3, 2009 · Leave a Comment

CharBotGreen is still suspended on Twitter but fortunately she’s still announcing away on Identi.ca.

It’s trivial to move a bot from one to the other. In the source for CharBotGreen there’s a line

u = "http://twitter.com/statuses/update.json"

Using the Twitter-compatible Identica API I can just replace that line with:

u = "http://identi.ca/api/statuses/update.json"

The only thing to watch for is that Identica stores names as lowercase and the authorisation fails if you don’t send it in lowercase.

Doesn’t work in Identi.ca:

req.basic_auth 'CharBotGreen', 'sekret'

Works in Identi.ca:

req.basic_auth 'charbotgreen', 'sekret'
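One way to avoid the problem altogether is to lowercase the name in code before it goes into the request. The sketch below is not from the CharBotGreen source; the method name and the example status text are made up, and it only builds the request without sending it.

```ruby
require 'net/http'
require 'uri'

# Build (but do not send) a status update request, lowercasing the
# account name so that Identi.ca's authorisation accepts it.
def build_status_request(endpoint, user, password, status)
  uri = URI.parse(endpoint)
  req = Net::HTTP::Post.new(uri.path)
  req.basic_auth(user.downcase, password)
  req.set_form_data('status' => status)
  [uri, req]
end

uri, req = build_status_request(
  'http://identi.ca/api/statuses/update.json',
  'CharBotGreen', 'sekret', 'Now on Radio 4'
)
# To actually post it:
#   Net::HTTP.new(uri.host, uri.port).start { |http| http.request(req) }
```

Lowercasing should be harmless on Twitter too, assuming names there are matched case-insensitively.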

That’s it though – easy!

Categories: Uncategorized

Web Unperson

May 30, 2009 · 7 Comments

A couple of times this week people pinged me to say their browser was reporting my site as a phisher like this. I thought little of it since we’d been hacked before on Dreamhost and WordPress and assumed we had got on a blacklist somewhere. I rechecked the site, couldn’t find anything, and reported it as an error.

Last night though I found that my twitter bot, CharBotGreen had been suspended as a phisher, and tonight I find I’ve been suspended from twitter too. This is a bit of a blow, and the cause in both cases seems to be that I linked to my blog.

Using Google webmaster tools I discovered that several pages had links to viagra etc pages on them, invisible except in the source, with html inserted through the header php. Firefox and Safari made it difficult to find this out by inserting buggy ‘this is a phisher’ text (with broken links) over the source as well as the page itself.

I’ve now moved off Dreamhost completely – though it might have been simply that I had not updated WordPress enough. I’m on wordpress.com now, so I hope that’ll remove this riskiness.

The whole episode has made me rather depressed. Google has basically killed my online identity. I’m on various lists asking to be taken off, but there’s been no movement since last night, and I had no warning. It seems that there’s a blacklist being used in both cases; I’m not completely sure what it is yet.

Anyway, if it happens to you, take it seriously and deal with it as soon as you can.

Update: I’m actually not on google’s suspended list any more. Hurrah! But still no Twitter. Guess it’s time to move to Identica with that and the madness of #fixreplies. Meh!

2nd Update: I got my Twitter account back this morning (2nd June, 3 days later). CharBotGreen is still suspended.

Useful links:

Google – My Site’s been hacked
Google webmaster tools
Google apps admin page: Google MX Records

Categories: Uncategorized

iPhone working with PoGo

May 26, 2009 · 3 Comments

I’m so chuffed about this -

I bought a Polaroid PoGo inkless bluetooth mini sticker printer having been entranced by psd’s one, but knowing it didn’t work with the iPhone and that I’d have to get my laptop out to print anything. The PoGo is a lovely toy but I was getting a bit irritated by this limitation. The problem was twofold:

  • iPhone bluetooth is crippled – you can only use bluetooth headphones, and not use it for file transfer. Annoying.
  • iPhone stores pictures as (peculiar) pngs and PoGo only accepts jpegs (which I found by trial and error – I can’t find any PoGo docs on that at all)

The first issue was easily solved – I have a jailbroken 1st gen phone and I just installed iBluetooth with Cydia, which is an app installer based on .deb packages.

The second was more tricky. I looked at ImageMagick for iphone (it’s on Cydia) but didn’t get anywhere. I think I needed to install gcc which was a step too far. Instead I put ssh on it (pretty cool in itself), found some hints on the web, and found that iPhone actually creates jpgs as well as pngs (in /private/var/mobile/media/DCIM/100APPLE – the pngs are in /private/var/mobile/media/DCIM/999APPLE). Weird! Anyway, iBluetooth allows you to browse the filesystem, and send files you find there, and that worked.

So all you really need is iBluetooth as it turns out. Hope this is useful to someone.

Categories: Uncategorized

Companies House XML and Rewired State

March 12, 2009 · Leave a Comment

I was at Rewired State last weekend, and a week or so ahead I got around to applying for an XML Gateway account in order to get some interesting data out of Companies House. This post was supposed to be about a few technical aspects of using the gateway, but first I hope you’ll forgive a shortish rant about the difficulties of getting data from Companies House and the highly annoying economy around that data. If you like, skip to the technical bit.

Companies House Data

First a little background. Companies House contains all details about all the companies in the UK, including names, company number (their primary identifier), status (if suspended, function, in liquidation etc), the official filings of the companies such as annual reports, and information about company directors and other appointments, including, usually, the home addresses of the directors (with some exceptions for security concerns, MPs and the like). You can get some of this information for free, and some you have to pay a bit for, either as XML or RTF.

Companies House has a SOAP gateway, called the ‘XML gateway’. It’s a pretty simple SOAP interface with good documentation (pdf). The costs are the same as for the RTF format – a pound a piece for the more interesting bits, free for the basic information (still interesting) – but you pay 6 quid a month for access (prices), which seems pretty reasonable. It does however take a few days to get the account, as it’s a credit account designed for businesses who want to resell the information, so you need to get a temporary account, do a test, then apply on paper to create a direct debit; they aim for 5 working days maximum from when they get the forms. I sent mine last Friday and it got here today, Thursday, so by my calculation they made it with a day to spare.

I had misunderstood what the timings meant (not 5 days from first contact), so by last Friday it was clear that my XML account wouldn’t be ready in time, but I still wanted to show what could be done if we had that information.

Now, in theory at least, anyone can buy this information from Companies House directly, on the day needed and available immediately, using their WebCheck service (which bizarrely claims only to be open 7am to midnight). Reports and lists of directors cost a pound apiece and basic information about a company is free on the site (name, company number, main contact person and address, and status). In practice it’s a laborious task to actually get the information about the company, partly because the site’s fairly unusable, partly because sometimes it’s hard to know which specific company you are interested in because there are so many with similar names. Companies Open House is trying to remedy some of these issues.

My interest was in getting a few lists of directors in order to demonstrate foafcorp UK, a kind of data-focused They Rule, at Rewired State. I bought a few (which went fine; it uses a credit card and WorldPay) and then tried to download them. You search for the company in the web interface, which is hard to use because it’s very stateful and you can’t link to a particular company; you create a login; you buy the report (‘Appointments report’); you get an email about it straight away; and the report gets put in your ‘download area’. Clicking on this makes a window pop up instructing you to right-click and download. This is because it’s an ftp file! Anyway, the problem is that you can’t download it:

curl -O ftp://wck2.companieshouse.gov.uk/image/5b/29/c7/1b/d1/75/e7/c3/42/77/da/1f/b0/bd/c0/60/repA_01631639_506-143015-03619061_12.rtf


  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:--  0:03:13 --:--:--     0
curl: (56) FTP response reading failed

I started to think that maybe the FTP ports were blocked – not impossible since the Rewired State team had had to ask for specific ports to be opened, and some of the guys at lunch explained to me that ftp was quite complex in terms of ports, choosing random ones – so I tried from a remote machine, but still nothing.

I did, finally, manage to get 4 of the 17 I’d paid for down, just by repeatedly trying with curl. An answer to my email enquiry (phones are only open Monday – Friday) came on Monday. It said you needed to wait for an hour before downloading as the documents could appear to be there when they weren’t. I’m not sure that this was the problem because I’m still having the issue (the reports are available for 10 days). But it seems fairly clear that few people are using this technique to get this information.

Instead they use the various resellers of that information. Try doing a search for “uk company directors” and see what you find for sponsored links. You’re tempted in by free searches (the same information available for free on the Companies House site) and then you can buy what looks like the same information as costs a quid on the CH site for between 6 – 10 pounds. I’m wondering if these companies are simply using the XML gateway to create these reports? hm.

Ruby and SOAP and the XML Gateway

Anyway, down to the technical stuff. You can apply right away for a temporary username and password to access the XML gateway. The CH people are very efficient and you get this pretty much right away by email. This is the same as the real system except you only get information about one company back, so you can test the SOAP interface, and you can show that you could actually use it.

I looked at some SOAP libraries for Ruby but then just decided to try with net/http, uri and post. Probably not ideal, but time was short, the libraries were undocumented (even by the standards of Ruby) and this was a very quick way of seeing if their system worked and what information it would return.

The basic idea was posting some data to a url:


require 'net/http'
require 'uri'

# Post an XML payload to a url and print out the response body.
# (Written here as a plain top-level method so the snippet runs on its own.)
def post(u, data)
  puts "Checking url #{u}"
  url = URI.parse u
  http = Net::HTTP.new(url.host, url.port)
  # http.post returns a single response object; the body is res.body
  res = http.post(url.path, data,
                  {'Content-Type' => 'text/xml;charset=utf-8'})
  case res
  when Net::HTTPSuccess, Net::HTTPRedirection
    puts "response #{res.body}"
  else
    puts "problem"
  end
rescue URI::InvalidURIError
  puts "URI is no good"
end

The url (the SOAP endpoint) for CH is


http://xmlgw.companieshouse.gov.uk/v1-0/xmlgw/Gateway

This method just prints out the XML response you get back (res.body) – but you could then use an XML parser like Hpricot to get the data out after that (Hpricot’s really an HTML parser so isn’t great at namespaced elements, but it can do them – and CH XML doesn’t have namespaces anyway).

The other part you need to do is authentication, again very simple. CH uses the name and password they give you, plus a random transaction identifier you provide:


require 'digest/md5'


user = "XMLGatewayTestUser" #you need to request these from CH; or ask me nicely
pass = "XMLGatewayTestPass"
transactionId = rand(7) # note: rand(7) returns an integer from 0 to 6
digest = Digest::MD5.hexdigest("#{user}#{pass}#{transactionId}")

Then you just slot the digest into the XML that you need to send. There are lots of examples of the XML, and the general documentation is in this Data Usage guide PDF. The FAQ is also useful.

You can see the code I made here. It’s not nice, but does get the results back. Here are some samples: search, directors, details.

And that’s it!

Categories: foaf

CharBotGreen – a Twitter Radio 4 announcement bot

February 3, 2009 · 2 Comments

I wanted to try out the Twitter API and since I was finding myself repeatedly going through the tedium of flipping browser tabs to see what was on Radio 4, I figured I’d make a bot that tweeted what was on Radio 4 instead. This had the added advantage that I could use some half-written code I’d started for a more complex event bot that was turning out to be too hard. I neglected to do a twitter search, however, which would have shown me that there were at least two similar services already working. Ah well. Here’s CharBotGreen

Thanks to: Damian for the name and technology suggestions, @psd for the picture, and Charlotte Green for being a great Radio 4 announcer (as are they all!)

Be warned – do not use my Ruby code as an example of good practice, as it most certainly is not.

What it does

Once a day – pulls down the Radio 4 programmes JSON (details – what an excellent service that is – beeb++) – and stores it in an H2 database like this, having wiped the database overnight (sometime between 1am and 5.20am, when it’s on the World Service and no detailed schedule is available anyway):


CREATE TABLE if not exists beeb(DT TIMESTAMP, PID VARCHAR(8), D DATE, T TIME, NAME VARCHAR(255));

So basically I start the Radio 4 day with an SQL representation of today’s schedule page. I started with PID as UNIQUE but then realised that the same PID could be broadcast twice a day.

Every 5 minutes – checks in the database for anything starting in the next 5 minutes and sends a tweet, either ‘starting now’ or ‘starting in a few minutes’ depending on the exactness of the match:

SELECT * FROM beeb WHERE D = '#{d}' AND T >= '#{t}' AND T < '#{t1}';

where t is the current time and t1 is the time in 5 minutes (d is today’s date).
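As a rough illustration of how d, t and t1 could be worked out and slotted into that query, here is a small sketch. It is not the script linked from this post: the method names are made up, the date and time formats are assumed to be ones H2 will accept, and the tweet wording is only a guess at ‘the exactness of the match’.

```ruby
# Work out today's date and the start and end of the five-minute window.
def schedule_window(now, minutes = 5)
  later = now + minutes * 60
  { d: now.strftime('%Y-%m-%d'),
    t: now.strftime('%H:%M:%S'),
    t1: later.strftime('%H:%M:%S') }
end

# Build the query shown above for a given moment.
def window_query(now)
  w = schedule_window(now)
  "SELECT * FROM beeb WHERE D = '#{w[:d]}' AND T >= '#{w[:t]}' AND T < '#{w[:t1]}';"
end

# 'starting now' when the programme starts in the current minute,
# otherwise 'starting in a few minutes'.
def announcement(name, start_time, now)
  if start_time.strftime('%H:%M') == now.strftime('%H:%M')
    "#{name} starting now"
  else
    "#{name} starting in a few minutes"
  end
end
```

Note that the window wraps around midnight (t1 ends up earlier than t), which this sketch doesn’t handle.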

Technology

I use Ruby and H2 over JDBC. You can see the every 5 minutes and daily scripts and the readme.txt. Why these technologies? Well, I wanted to learn Ruby, and using JRuby means that you can use many Ruby libraries but can also access Java classes, which is handy for using the H2 database. Why H2? Well, it’s a self-contained, in-memory, SQL-compatible database written in pure Java, so I could keep everything in one directory. For something this lightweight there’s almost no point in using SQL, but I wanted it for something a little more complex as well so it made sense (and makes it nice and easy). I use JSON pure for the JSON parsing (it has to be pure to use it with JRuby). If you want to use Ruby rather than JRuby the SQL bit will take some fiddling with; the rest should be ok as is.

Hashtags

I jumped into a little chat on twitter about what hashtags to use and settled on #pid: and then the PID (such as b00h4r7x). I’m still not sure about this; I put the URL in as well.

It’s all super-simple

But good fun to do. Psd suggested that some Charlotte Green-style amusing incidents would be fun to put in there, though I’ve not worked out how to do that. Another improvement would be if it gave you a little more notice about what’s coming up as @bbcradio4live does.

Categories: Uncategorized

Expand tinyurls using ruby

February 2, 2009 · 2 Comments

Just a tiny thing but handy and I couldn’t find it anywhere else (I’m new to Ruby and I’m coding by google so don’t expect great style here, but this seems to work):

require 'net/http'
require 'uri'

url = URI.parse "http://bit.ly/1Zw502"
if url.path.size > 0
  # an url like http://planetrdf.com (no trailing slash) has an empty path;
  # this check catches that case but doesn't look it up
  req = Net::HTTP::Get.new(url.path)
  res = Net::HTTP.new(url.host, url.port).start {|http| http.request(req) }
  case res
  when Net::HTTPRedirection
    uu = res['Location']
    puts uu
  end
end

uu is the expanded url. Is there a better way than this? Is it me or is Ruby documentation a bit thin on the ground? (Thanks to Damian for pointing out to me how tinyurls work – I’d never bothered to look before!)
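One possible tidy-up, offered as a sketch rather than a definitive answer: wrap the lookup in a method (expand_url is a made-up name) and use the URI’s request_uri, which is "/" when the url has no path, so the no-slash case gets looked up too. It uses a HEAD request on the assumption that the shortener answers HEAD with the same redirect as GET.

```ruby
require 'net/http'
require 'uri'

# Return the redirect target for a short url, or nil if the server
# does not answer with a redirect.
def expand_url(short_url)
  url = URI.parse(short_url)
  # request_uri is the path plus any query string, or "/" for an empty path
  res = Net::HTTP.start(url.host, url.port) { |http| http.head(url.request_uri) }
  res.is_a?(Net::HTTPRedirection) ? res['Location'] : nil
end

# Usage (makes a network request):
#   puts expand_url("http://bit.ly/1Zw502")
```

This only follows one hop; if a shortener redirects to another shortener you would need to call it again on the result.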

Categories: Uncategorized

W3C Workshop on the Future of Social Networking (2)

January 14, 2009 · 3 Comments

As promised below is part two of my mini-reviews of papers submitted to the W3C Workshop on the Future of Social Networking, including the three late papers, an interesting related paper by google (pdf), and Danbri’s take on Foaf in 2009. The workshop starts tomorrow. Part one of my reviews (papers 1-42) is here.

Most interesting to me: 43, 45, 50, 56, 57, 60, 61, 63, 64, 67, 68, and especially 52 which raises a lot of important points about what can or cannot be done with your harvested data (if anything).

Themes: there are an awful lot, and the program committee have done a good job in turning such a bunch of disparate material into an agenda and set of discussion points.

My take on the main themes from the papers:

  • data silos problem and solutions; portability of data, policies and permissions
  • trust, authentication and permissions
  • semantic activity streams
  • ownership of data created by networks; what can be done with it; data mining; creative commons for personal data
  • identity across sites; mobile operators as brokers
  • location awareness, apis or markup
  • context awareness, sensors and apis or markup for these
  • accessibility and web 2.0
  • business models
  • best practices documentation

Technologies:

foaf, oauth, openId, sioc, Dataportability, hcard, vcard, atompub, xdi, NFC, Doap, opo and similar, openDD, OMB … and many more

Reviews 43 – 72

43. FOAF & SSL: creating a global decentralised authentication protocol Henry Story, SUN

Protecting rdf resources using foaf and ssl. Idea is that the user can identify themselves using an ssl certificate in their browser which refers to their dereferenceable id (the #me in their foaf file), which means that the public key in the foaf file can be checked against the one in the certificate, and then access granted or not depending on some friend-related or other algorithm. Interesting, and has several implementations (what could be the relationship with openID and oauth, if any? are they all complementary?).

44. Managing Social Communications Identities (pdf) Óscar M. Solá, Telefónica I+D

Interesting idea about linking users’ social and communication identities in a secure, private and configurable way via a ‘social broker’. The idea being that you don’t have to know the phone number of a person, or their email, in order to be able to contact them (provided that they have specified that you can contact them).

45. Current issues with Social Network Representations (pdf) Peter Mika, Yahoo! Research

Describes a view of a company getting to grips with using semantic markup: the phases of microformats and rdfa; the need for mapping between Foaf and Vcard; lack of best practices for some types of rdf vocab mixing. Argues that vocabs should be produced using existing data about what people are willing to expose. Argues also that aspects of rdf are too hard to grasp or communicate. Emphasis on agreements on how to use existing things rather than creating more formal standards. Interesting to see a commercial point of view in this area.

46. Social Networking Segmentation: Celebrating Community Diversity in a Framework (pdf) Christine Perey, PEREY Research & Consulting

Characterises different kinds of communities offering different sorts of experiences or services for mobile and static devices (and things in between). There are two classification systems: why the user is there (professional reasons, entertainment) and complexity of features (which are often related to whether the network is aimed at mobile or static devices). Argues that a widely used classification system would allow networks to “communicate with their target market segments” and differentiate themselves more quickly.

47. It’s all around the domain ontologies – Ten benefits of a Subject-centric Information Architecture for the future of Social Networking (pdf) Lutz Maicher, Benjamin Bock, Topic Maps Lab at University of Leipzig, Germany

Argues that developing social sites starting with domain ontologies with object identity in topicmaps or RDF makes development easier in multiple ways (e.g. ontological flexibility in development, easier to localise, identity awareness).

48. Social Networking: Power to the People (pdf) Stefano Bortoli, Paolo Bouquet, Themis Palpanas, University of Trento, Italy

Argues that users should own their data and be able to move it around rather than being locked in to a particular social network. Argues that foaf is not sufficient for the needs of a decentralised network because it doesn’t make enough distinction between types of relationships; is public to all; and doesn’t provide a solution to identifying people and other things uniquely. They are building tools under the OKKAM EU project, e.g. http://www.foaf-o-matic.org and a distributed system for generating and storing unique identifiers on the web (hm, what’s wrong with URLs?).

49. A Telecom Italia view on the future of Social networking (pdf) Claudio Venezia, Telecom Italia

Clear statement of the company’s interest in areas of standardisation or endorsement that W3C could undertake, in the areas of identity, portability, privacy, user experience; e.g. endorsing openID, foaf and sioc or similar standardisation and specialised URI schemes; endorse or create something like IdM or oauth; best practices for mobile user interfaces, and several more. Plus a plea to bear in mind that these networks need to be monetized. Worth a look.

50. Beyond Eyeballs: Improving Social Networking Metrics (pdf) Christine Perey, PEREY Research & Consulting

Argues that current metrics for evaluating social networks (page impressions per month and new accounts) do not make for very interesting or useful analysis. Suggests that a common framework would allow better allocation of resources and, if shared, would enable better comparison of sites. They suggest types of user (joiners, collectors, critics, creators); user profile metrics (e.g. ‘gardening’ and ‘policing’, or ‘giving’ and ‘receiving’ actions, for which examples could be rating and viewing others’ contributions respectively); and various others (number of friends, various frequencies). Also 17 community metrics (e.g. user funnel, total number of pieces of content added per month, percentage of different user types). Interesting because these types of stats drive allocation of resources in many companies.

51. NewBay Position Paper on Mobile Social Networking Stephen Farrell, Bill de hOra, NewBay

A list of recommendations for w3c action is at the end, as “a software and services provider to mobile network operators”. Their view is that the issues with social networks on mobiles have to do with user interface issues (e.g. web 2.0 self-updating pages etc) and ‘irritation issues’ – a constant stream of events may be irritating in a mobile context when it is not in a static context. They suggest that brokerage by mobile operator may be the way forward to transmit preferences of the user. Not sure I completely understand why this is the best option though.

52. Social Networks as a Future Geographical Data Source (pdf) Ian Holt, Jennifer Green, Ordnance Survey of Great Britain

As a data vendor, the OS has been researching data mining in social networks (in this case to extract vernacular placenames as areas on a map). They are interested in the legal and intellectual property questions raised by this, the possibility of standardising something like dataportability.org for data sharing standards; and whether such data is a marketable commodity, and what the need for anonymity is in these cases. Very very interesting questions indeed.

53. Open Platform for Multichannel Media Distribution Management Roberto García, Juan Manuel Gimeno, Universitat de Lleida, Spain

Describes an EU project to create an open platform based on semantic web technologies for the distribution of content from small and medium content providers. It will have digital rights management features based on a copyright ontology and will use user tracking rather than DRM. Not immediately clear to me how this is relevant to the workshop.

54. Mobile Social Networking: Two Great Tastes John Kemp, Franklin Reynolds, Nokia

Describes various aspects of mobile phones that make social networks on mobile phones different to social networks on static devices on the web. Interest in radio capabilities like bluetooth, GPS; capacity to interact with the real world using 2D barcodes; and they’re always with us. Privacy implications of phone number as a unique identifier. Suggests a distributed architecture for social networking using the processing power of these devices and not dependent on an always-on connection. Argues this would need more interop between sites; doesn’t really explain why.

55. Social Networks in Life Sciences: Defining and Enabling Appropriate Roles to Create an Atmosphere of Trust and Security (pdf) Hans Constandt, Adrian Seccombe, Robert Sweet, Yijing Zhou, Susie Stephens, Eli Lilly

Interesting idea somewhat related to paper 52, about the possibility of using semantically enhanced data from social networks of individuals with a particular disease, and similar questions of anonymity and traceability in the use of this sort of data.

56. Towards an OpenID-based solution to the Social Network Interoperability problem (pdf) Michele Mostarda, Davide Palmisano, Federico Zani, Simone Tripodi, Asemantics

The paper describes a piece of software: an implementation of OpenId that can have connectors that connect the user to various social networks, including via the open social API, and aggregate their data from their networks, filtered if they like to different personas, on to one or more personal pages. The paper also talks about a generalisation of this approach, termed the “Global Social Platform”. Interesting; a bit unclear to me whether the current system requires you to give away your passwords or not, though.

57. Collaborative Filtering and Social Capital Peter Ferne, Jiva Technology

Interesting summary of some aspects of social capital (‘whuffie’), including recommending people. Discussion of the complexity of measuring social capital. Idea that trustworthy systems require openness. Good set of links to follow.

58. Applying an XML Warehouse to Social Network Analysis (pdf) Benjamin Nguyen (University of Versailles), Antoine Vion (University of Aix-Marseille II), François-Xavier Dudouet (Université Paris-Dauphine), Loïc Saint-Ghislain (Ecole des Mines de Nancy)

Describes a project to analyse data from the W3C mailing lists, using XML databases and XQuery. They have used this to create networks via co-authors as well as other types of analysis. Interesting work, but not perhaps relevant as it stands to the workshop?

59. Mobile Eco-System: The Need for a Mobile Markup Language Nicolas Belloni, Mattias Rost, Future Applications Lab/Mobile Life Centre, Stockholm, Sweden

Argues for the need for a markup language for mobile services, for “absolute location, sensors, near-field communication, proximity of other users or services” to improve access for creative to this information. They are developing prototypes. Formatting the text would have been nice ;-)

60. Ten Theses on the Future of Social Networking Harry Halpin, University of Edinburgh

Paper describes the elements required for opening up data silos, arguing that the technologies are there already and that what’s needed is OpenID, OAuth and FOAF. He argues that it’s in the interest of producers and consumers of data to have consistently structured data. He emphasises that APIs should not preclude what is being used now, and that we should use data about what is being used as a basis for standardisation. He wants to use RDF, RIF and the W3C, cooperating with dataportability.org to avoid duplication, and to include provenance. Suggests a best practice recommendation. I’ve a lot of sympathy with his arguments – I’d like to see a sample implementation.

61. The Relationship Layer and the Secretary (pdf) Dewey Gaedcke, Minggl.com

Interesting and clearly argued short paper about how a secretary-like application could prioritise, deprioritise and reroute information in a similar way to humans, by using statistics about how we interact with our peers. Argues that the minimum set of things needed is: a global identity for a user, with mapping to app-specific identities; an open API; and semantic event-type data (‘actionstory’).

62. Mobile Video Improvements to Enhance Mobile Social Networks (pdf) Tim Hyland, Dwipal Desai, YouTube

Argues that it’s important for social networking on mobiles that a consistent way of doing inline video playback that works on all handsets is decided on – doesn’t matter if it’s the HTML5 video tag, Flash or something else – but it is important, and users will expect it, as it’s so often used in static social networking.

63. Social Media in eGovernment John Sheridan (The [UK] National Archives), Kevin Novak (The American Institute of Architects), José M. Alonso (W3C/CTIC)

Explores some of the implications of government interaction in social networks. The ‘OS’ question again pops up – who owns the data created by these networks, can it be used for anything else, and how can it be anonymised if so – and what are the privacy implications? Interesting read; it’s come out of discussion at the W3C eGovernment Interest Group.

64. SIOC: Content Exchange and Semantic Interoperability Between Social Networks John G. Breslin (National University of Ireland, Galway), Uldis Bojārs (National University of Ireland, Galway), Alexandre Passant (National University of Ireland, Galway), Sergio Fernández (Fundación CTIC), Stefan Decker (National University of Ireland, Galway)

A paper describing the features of SIOC and how it interoperates with other ontologies, enhances site interoperability, and is used in multiple tools. SIOC describes items at the level of containers and content items – blog, blogpost, items, bookmarks, comments. In the future the authors would like to get closer integration with OPO (online presence ontology, paper 12). Argues that for these reasons W3C efforts in this area should include SIOC.

65. Integrating Social Networks and Sensor Networks John G. Breslin, Stefan Decker, Manfred Hauswirth, Gearoid Hynes, Danh Le Phuoc, Uldis Bojārs, Alexandre Passant, Axel Polleres, Cornelius Rabsch, Vinny Reynolds, National University of Ireland, Galway

The 10(!) authors provide some use cases for sensors and social networks, and suggest that sensors can create semantic data about a user’s activities, and that they can extend and create social networks. They think that “some interaction between the Semantic Web and the Mobile community within a W3C group could be beneficial to this convergence”.

66. Enabling Trust and Privacy on the Social Web Alexandre Passant (National University of Ireland, Galway), Philipp Kärger (L3S Research Center, Hannover, Germany), Michael Hausenblas (National University of Ireland, Galway), Daniel Olmedilla (Telefonica R&D, Madrid), Axel Polleres (National University of Ireland, Galway), Stefan Decker (National University of Ireland, Galway)

A discussion of trust and privacy and the relationship to the semantic web stack; they believe semantic web techniques could be used successfully for trust and privacy, for example to share photos of multiple sites to a small group. They are interested in policy-based approaches, and agreed models for defining policies and authoritativeness.

67. The Tangled Web We Weave (pdf) Greg Howard, Rajesh Kuppuswamy, Kaushik Sethuraman, Microsoft Corporation

Paper arguing that accessing social networks from mobile devices will require techniques either for aggregating multiple networks in a single UI or for quickly flipping between networks – and, crucially, for matching up friends across networks for the user, on the device itself. This could be done in an automated way to suggest links, with a way to manually link them as well. Interesting.

68. Business Context Impacts on Social Networking Mary Ellen Zurko, Werner Geyer, IBM

Interesting again. Lotus describes two of its products that allow companies to manage their business contacts. They are interested in using OpenID to allow trusted partners to access this information, and perhaps OpenSocial. They want to be able to define different types of business relationships.

69. A Vision of an open Platform – The Enablers Perspective (pdf) Bastian Pfister, Roman Hänsler, aka-aki

A small, mobile social networking company wants to make a social networking tool that would enable users who meet in physical space to interact with each other on social networks. Requires social networks not to be data silos. Argues that consumers will expect this sort of thing to work and will be surprised when it doesn’t. Additionally, data costs need to fall.

70. Rethinking digital object, Rethinking information relevance (pdf) Yuk Hui, Centre for Cultural Studies/ Department of Computing, Goldsmiths, University of London

A highly philosophical short paper. He wants to investigate how relevant data is to an individual – in order to enhance ‘ambient findability’. The work is at an apparently early stage and is not (yet anyway) relevant to the workshop.

71. DMM: Digital Me Management Karl Dubost, Olivier Théreaux

Outlines various issues about data silos, data ownership in these silos and various access control approaches. The authors’ position is that each of these needs to be addressed.

72. Different groups with share agendas (pdf) Elias Bizannes

The author (who is vice-chair of the dataportability group) outlines the groups he is involved with (OpenSocial, Open Web Foundation, DiSo), and states that they are not competing but offer overlapping and complementary benefits. The data portability project has spent 2008 establishing governance and process, and various things are now in progress: a tool to assess sites and companies with respect to the open standards dataportability supports; a creative commons for personal information; a healthcare taskforce.

also:

Semantic enhancements for social networks Rigo Wenning, Ivan Herman, W3C

Argues that it is not simply enough to make data silos permeable but there’s also a need for the ability to move policies, reputation, traceability and privacy and access controls between networks.

Web 2.0 and the Visually Impaired Learners (pdf) Nantanoot Suwannawut

Describes various common problems with web 2.0 and accessibility, notably Ajax updating bits of the page which screen readers are unable to track; some sites do not allow assistive devices to be connected; similarly there are no guarantees for new devices. Text alternatives are often not available for video content. CAPTCHAs are not usable. Large parts of the web are no longer accessible and so some groups risk exclusion.

Ubiquitous, social networks in the street (pdf) Marc Pous, Luigi Ceccaroni, Manel Palau, Victor Codina, TMT Factory

Describes a service that suggests personalised activities, people to do them with, and how to get there, using recommendations, sensors, social networks, time and geolocation. Needs objects, locations, people, events etc to be described in interoperable and portable ways.

and others:
(Under)mining Privacy in Social Networks (pdf) is worth a look – Google on privacy issues with social networks: unexpected events in activity streams, accidental linking of personae, and datamining for merging. And using the social graph for mitigating these.

Finally, last but not least, Danbri’s foaf in 2009 plan, outlining goals such as: evaluate effects of people not making their own homepage by hand any more; best practice for sites which expose FOAF; impact on regular users of aggregators; creative commons for personal data. Plus technical issues: vCard and Portable Contacts and FOAF; FOAF and OpenID, OAuth, AtomPub, WebDAV, SSL, PGP; trust and provenance; crawler stats and REST APIs to large aggregators; SearchMonkey, Google Social Graph. Aiming to have regular meetings (f2f where feasible and regular online meetings) and a decision process / calendar for the core vocab.

Phew.

Hope it all goes well!
