Thursday, 21 July 2011

Why linked data works

Why does linked data work, and what is the difference between the Semantic Web and Linked Data? Here is a famous presentation given by Tim Berners-Lee at TED.

In my opinion, linked data is the way that leads to the big vision of the Semantic Web. Around 10 years ago, when Tim first described that vision, we did not have a clear idea of how the goal could be achieved. Many attempts have been made since then: RDF, RDFS, the Web Ontology Language, the SPARQL query language, RDFa, Jena, Sesame, triple stores, the AKT project, etc. However, the Semantic Web was still a mysterious "weapon" controlled only by a group of researchers. The mass of Web users was not able to enjoy the benefits Tim described ten years ago.

That changed recently, when linked data arrived as the ice-breaker. More and more areas of everyday life are starting to apply linked data. One famous example is the open government initiative. So, what is the difference between the Semantic Web and linked data?

The whole idea of the Semantic Web is based on a mutual understanding of knowledge. The old way of reaching this mutual understanding is for everybody to sit down and have a chat, in the hope that some day we can agree on certain vocabularies to describe the knowledge. I call it the top-down route to the Semantic Web. But, to be honest, is it possible? When I was doing my master's degree at the University of Southampton, the lecturer asked all the students to create an ontology to describe pizza. Some students categorized pizza as vegetarian or non-vegetarian, while others categorized it by the thickness of the base. Well, if we cannot agree on the description of a simple pizza, how can we agree on more complex concepts like planes and cars?

OK, the top-down route is not that realistic. So the pioneers of the Semantic Web found another way of doing it, which is the idea of linked data. In his TED speech, Tim emphasized the importance of putting raw data online. This is the basic idea of linked data: I don't care what vocabulary you use to describe the things you want to publish, just publish it! Well, there ARE four linked data principles that publishers should follow, but at least we don't need to agree on a complex ontology before we can even publish data. I call this the bottom-up way.
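To make the bottom-up idea a bit more concrete, here is a minimal Python sketch. It assumes two publishers who each describe the same pizza with their own vocabulary and never talk to each other; a link between the two descriptions is added afterwards. The `a.example` and `b.example` URIs, the vocabulary terms, and the `describe` helper are all made up for illustration. Only the `owl:sameAs` URI is a real, widely used property. Real linked data would be published as RDF rather than Python tuples, so treat this as a toy model, not an implementation.

```python
# Toy model: a triple is a plain (subject, predicate, object) tuple.
# All a.example / b.example URIs and vocabulary terms are hypothetical.
SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"  # real OWL property URI

A_PIZZA = "http://a.example/pizza/margherita"
B_PIZZA = "http://b.example/menu/margherita"

# Publisher A cares about vegetarian vs non-vegetarian.
publisher_a = {(A_PIZZA, "http://a.example/vocab#isVegetarian", "true")}
# Publisher B cares about the thickness of the crust.
publisher_b = {(B_PIZZA, "http://b.example/terms#crust", "thin")}
# Added later, by anyone: a statement that both URIs name the same thing.
links = {(A_PIZZA, SAME_AS, B_PIZZA)}


def describe(uri, triples):
    """Return (predicate, object) pairs for uri and everything linked to it by sameAs."""
    aliases = {uri}
    changed = True
    while changed:  # keep following sameAs links in both directions until stable
        changed = False
        for s, p, o in triples:
            if p != SAME_AS:
                continue
            if (s in aliases) != (o in aliases):
                aliases.update((s, o))
                changed = True
    return {(p, o) for s, p, o in triples if s in aliases and p != SAME_AS}


merged = publisher_a | publisher_b | links
for predicate, value in sorted(describe(A_PIZZA, merged)):
    print(predicate, value)
```

Neither publisher had to change anything or agree on an ontology up front. Without the `links` set, asking about publisher A's pizza only returns A's own statement; with it, both views of the pizza show up together.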

So, next, why does the bottom-up way work? Here is a metaphor. Centuries ago, there were languages like English, Chinese, French, etc. But the Chinese didn't speak English, and they couldn't even be bothered to learn it, because they didn't know any British people. With the development of world trade, the Chinese, Japanese, French and British had to do business with each other, and here comes the problem: we don't speak the same language! Somebody had a solution: let's create a completely new language which is "simple" enough for everybody in every country to understand, which is just like the top-down method. Unfortunately, all such efforts have so far been fruitless. The real situation now is that English has become the de facto "world language". I don't speak Japanese, but I can speak to a Japanese person using English. English became the world language because world trade developed and communication among different cultures became necessary.

So the idea of linked data is like putting products into the world market first and letting the market choose how the different knowledge representations connect to each other. As with the English language, I expect that we should put data openly online first. We should admit that there are diverse understandings of concepts, and only then worry about how to reach a mutual understanding, even if we decide to keep that diversity.
