Friday, January 29, 2010

Frank van Harmelen tweets that a Linked Data tool/site named Uberblic has been released.

"OK, so what?" - that's become my common reaction to something 'new' in the semanticwebosphere. But I always read about it anyway, mainly just to make sure I'm keeping up.

But then I watched the video, and realised that maybe some of the Linked Data questions are finally starting to get answers. In particular, the problem of those in the wood not being able to see the trees: everyone making Linked Data (seen the cloud diagram recently?) has said it's vital to make data that links to the rest of the Linked Data and uses resolvable URIs, so that semantic browsers and applications can traverse from one link to another and do some fantastic inferencing and discovery.
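To make the "traverse from one link to another" idea concrete, here is a minimal sketch of follow-your-nose crawling. It is only an illustration: the `FAKE_WEB` dictionary stands in for real HTTP dereferencing, every `example.org` URI is made up, and the names `dereference` and `crawl` are my own, not taken from any particular tool.

```python
from collections import deque

# Toy stand-in for the web: each URI "resolves" to (predicate, object) pairs.
# The example.org URIs are hypothetical; no real dataset is implied.
FAKE_WEB = {
    "http://example.org/a/paper1": [
        ("http://purl.org/dc/terms/creator", "http://example.org/b/alice"),
        ("http://purl.org/dc/terms/title", "A Paper"),
    ],
    "http://example.org/b/alice": [
        ("http://xmlns.com/foaf/0.1/based_near", "http://example.org/c/amsterdam"),
    ],
    "http://example.org/c/amsterdam": [
        ("http://www.w3.org/2000/01/rdf-schema#label", "Amsterdam"),
    ],
}


def dereference(uri):
    """Pretend to resolve a URI; a real crawler would issue an HTTP GET here."""
    return FAKE_WEB.get(uri, [])


def crawl(start, max_hops=2):
    """Breadth-first traversal that follows object URIs up to max_hops away."""
    seen = {start}
    queue = deque([(start, 0)])
    triples = []
    while queue:
        uri, hops = queue.popleft()
        for pred, obj in dereference(uri):
            triples.append((uri, pred, obj))
            # Only URI-valued objects are links; literals end the trail.
            if obj.startswith("http://") and obj not in seen and hops < max_hops:
                seen.add(obj)
                queue.append((obj, hops + 1))
    return triples


if __name__ == "__main__":
    for s, p, o in crawl("http://example.org/a/paper1"):
        print(s, p, o)
```

Starting from one paper, the crawl picks up facts held by two other (pretend) sources without knowing about them in advance, which is the whole promise of resolvable URIs.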

Fine, except for the bit about 'semantic browsers and applications'. Where are they? The answer to that is: Tabulator, OpenLink and the rest. But (usually) all these do is allow me (the human) to interact with the data in a pretty low-level way. What about the visualisations and applications that the semantic web promised? What's the point in making all this data linked if all we can ever do is manually traverse the links (it's quicker to click on a link in an HTML page) or just browse one source via SPARQL?

In summary: it's clear there's a lot of data out there, and a lot of ways to make data and put it out there (the likes of OpenCalais and entity-extraction systems are surely the future of large-scale document stores and Enterprise CMS systems), but what about actually using the data? Is anyone actually crawling the Linked Data web?

So what these Uberblic people have done (or at least seem to have done, judging from the video) is provide a browsing tool that really does link this linked stuff together. The guy in the video says "It provides a single point of access to data reconciled from data sources on the web. The service runs on the Uberblic Platform, an integration software for crawling, mapping, and fusing structured data." Watch the video.

And if they truly have done that, then maybe we're ready for the next steps. Maybe the semantic web can truly become query-able (to some degree) and maybe semantic apps that actually leverage all this linked data without knowing what's out there beforehand could become a reality.

Maybe then all those scripts I wrote to convert Citeseer's dataset to RDF might be worthwhile.

Postscript: Here's an article from The Guardian that again touts the importance of Linked Data without any specifics about what you're going to do with it (apart from the notion of bundling and selling the data to estate agents, which again reduces it to one actual source, with no sense of discovery across uncurated data).

