Paul Miller is right… and so is Ian Davis
Paul Miller, a good friend and ex-colleague, has been having a tough time arguing that perhaps Linked Data doesn’t need RDF. Don’t misunderstand that: he thinks RDF is a Good Thing and Best Practice for Linked Data. But he thinks a dogmatic stance is unhelpful.
The problem, I contend, comes when well-meaning and knowledgeable advocates of both Linked Data and RDF conflate the two and infer, imply or assert that ‘Linked Data’ can only be Linked Data if expressed in RDF.
This dogmatism makes me deeply uncomfortable, and I find myself unable to agree with the underlying premise.
In the Twitter stream that Paul links to, there are some comments reminding people that RDF can take many forms, not just RDF/XML.
kidehen: @andypowe11 re. #rdf, it’s the data model for #linkeddata based #metadata. Remember #rdf != RDF/XML, no escaping RDF model re. #linkeddata.
Ian Davis (my boss) took a strong stance, saying that if things weren’t RDF then they weren’t Linked Data. Perhaps that is the very thing Paul sees as a dogmatic stance, which is ironic, as Ian is far from dogmatic. But Ian is defending the term Linked Data, not saying that’s the only way to publish data on the web…
TallTed: @iand “I think LD better for many cases, but there are times i’d rather hv a spreadsheet.” What? Can a spreadsheet not hold #LinkedData?
Well, it seems to me both Paul and Ian are right to a strong degree and are essentially arguing over only one thing – the meaning of the term Linked Data.
Paul quotes Tim Berners-Lee’s design note on Linked Data:
1. Use URIs as names for things
2. Use HTTP URIs so that people can look up those names
3. When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL)
4. Include links to other URIs. so that they can discover more things.
The emphasis is Paul’s. I would emphasise a different point:
4. Include links to other URIs. so that they can discover more things.
And in point four lies the reason that Ian is saying a spreadsheet isn’t Linked Data, even if it’s on the web and even if it’s linked to. The only standard for describing how one resource relates to others using URIs is RDF. Sure, you can put URIs into a spreadsheet, but there is no standard interpretation of what the sheets, rows and columns mean. Sure, you can put URIs into a CSV file, but again, there is no standard interpretation of what the fields mean.
The end result of that is data published on the web that can be linked to but not from.
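To make the contrast concrete, here is a minimal Python sketch. The `example.org` URIs and the CSV row are invented for illustration; only the FOAF `knows` property URI is a real vocabulary term. It is not how any particular publisher does things, just a way of showing that a CSV row can carry URIs without saying how they relate, whereas every RDF statement names the relationship with a URI of its own.

```python
import csv
import io

# Hypothetical data: a CSV row that happens to contain two URIs.
CSV_TEXT = "http://example.org/people/alice,http://example.org/people/bob\n"

# The same information as RDF-style triples (subject, predicate, object).
# The example.org URIs are made up; the FOAF property URI is a real term.
TRIPLES = [
    (
        "http://example.org/people/alice",
        "http://xmlns.com/foaf/0.1/knows",
        "http://example.org/people/bob",
    ),
]


def links_from_csv(text):
    """Return the URIs found in the CSV. Nothing says how they relate."""
    return [
        cell
        for row in csv.reader(io.StringIO(text))
        for cell in row
        if cell.startswith("http://")
    ]


def links_from_triples(triples, subject):
    """Return (relationship, target) pairs for one subject."""
    return [(p, o) for s, p, o in triples if s == subject]


def to_ntriples(triples):
    """Serialise URI-only triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)
```

A client reading the CSV gets two URIs and has to guess what connects them. A client reading the triples can follow the link from Alice to Bob and knows that the link means "knows".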
At this early time, though, Paul argues that what we really want is to get more and more data published and open. We all agree on that, I know. Ian does for sure; he runs Data Incubator for exactly that reason (well, that and helping to show those publishing spreadsheets and CSV why they should move to RDF and Linked Data).
In the comments on Paul’s post Justin (another senior manager at Talis) says:
Yes the same mistake was made with the rise of the web.
Once you had URIs and HTTP you already had plain text which is a perfectly good way to encode content. By adopting the STANDARD convention of HTML, all sort of existing text based formats with their various mark ups were locked out. That locked out a lot of content that already existed and required anyone who wanted to play to convert existing content into a html format.
Of course it did have the small side effect that to consume web content you only needed a browser that understood one convention i.e. html.
The same is true of RDF. XML is the equivalent of ascii in this regard.
And that’s the point. XML is the equivalent of ASCII, as is a spreadsheet or a CSV file, not because they’re simple, but because they have no mechanism for embedding the relationships and links necessary to link out from your data. Yes, they can contain URIs and clients can decide to make those into links, but there is no way to describe the meaning.
I agree with both sides of this argument. If it isn’t RDF then it isn’t Linked Data, but I wouldn’t keep pushing that point if someone was willing to publish data yet unable or unwilling to publish RDF (in any of its many forms).
8 Comments to Paul Miller is right… and so is Ian Davis
I think the semantics here is problematic. I don’t mean the explicit linking to metadata kind of semantics, but the whole problem with using ambiguous and generic words to describe both precise technological expressions and general trends.
You could create your own term to demonstrate. We could try “Published Data”. Publishing data, some in the public sector seem to contend, can be done in PDFs. But that’s not what anyone in my industry (I also work at Talis) would call “Published Data”, because it’s in an obtuse format which can’t be used without serious investment of time and translation. But it fits the criteria of being available on the web, so it’s data that’s been published… and you can imagine the Twitter battles following that.
To me, “Linked Data” makes use of semantic web technologies (including RDF), when it’s used as a proper noun (a noun, usually capitalised in English, which expresses a specific thing or person: like “United Kingdom” or “Tom”).
Tim Berners-Lee would seem to agree, sometimes, from his recent ReadWriteWeb interview: “They [Linked Data and the Semantic Web] fit in completely, in that the linked data actually uses a small slice of all the various technologies that people have put together and standardized for the Semantic Web. … One of the nice things about Linked Data, when they have a pile of it, is that they could run a SPARQL server on it. … So the message [for government] is to use RDF.” However, his explicit stance on whether Linked Data NEEDS RDF isn’t crystal clear; it would seem that his expectation is that Linked Data uses SPARQL (which needs RDF).
However, the message to get data out there doesn’t require it to be Linked Data. It still makes most sense to do so, as it’s become best practice and standards-based. So, is arguing whether this data is “Linked” or not useful if it’s best practice to use RDF and SPARQL?
Seems, to me, that the real debate is whether it is best practice to use SPARQL, not whether linked data is Linked Data.
Sorry, should have linked to TBL interview: http://www.readwriteweb.com/archives/interview_with_tim_berners-lee_part_1.php
Surely the critical issue is whether the semantics are available, not whether they are in RDF. If a csv file is published AND suitable semantics are available, then you know which columns are URIs or whatever else.
… but how to give the semantics … maybe someone needs a standard for meta-data … hey what was RDF supposed to be for???
I’m with Alan – if you publish data on the web and a suitable semantics for interpreting that data and linking it to other data, then why isn’t it Linked Data? It just so happens that RDF has a clear(er) semantics describing the interpretation of its data elements (URIs in particular) than a spreadsheet does; it doesn’t mean you couldn’t apply similar semantics to a spreadsheet if you were so inclined.
July 20, 2009
Of course, it’s possible to create a system of machine readable data using CSVs, but how does one get from the CSV to the definition of it? And once one has the definition, it’s only practical to describe the same type of data within one file as the definition has to say something like “column one means the person’s homepage”.
It’s not that it couldn’t, with a lot of work, become Linked Data. But why would you?
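As a rough sketch of what that work looks like, here is some Python that turns a CSV into triples using a separately supplied column definition. Everything here is an assumption for illustration: the `COLUMN_DEFINITION` mapping, the `csv_to_triples` helper and the `example.org` URIs are hypothetical, and only the FOAF property URIs are real vocabulary terms. There is no standard that says a definition should look like this, which is rather the point.

```python
import csv
import io

# Hypothetical out-of-band definition of what each column means.
# Column 0 is assumed to hold the subject URI; the other columns
# are mapped to property URIs.
COLUMN_DEFINITION = {
    1: "http://xmlns.com/foaf/0.1/homepage",
    2: "http://xmlns.com/foaf/0.1/knows",
}

# Made-up data for illustration.
CSV_TEXT = (
    "http://example.org/people/alice,"
    "http://alice.example.org/,"
    "http://example.org/people/bob\n"
)


def csv_to_triples(text, definition):
    """Apply a column definition to every row, yielding triples."""
    triples = []
    for row in csv.reader(io.StringIO(text)):
        subject = row[0]
        for index, predicate in sorted(definition.items()):
            # Skip columns that are missing or empty in this row.
            if index < len(row) and row[index]:
                triples.append((subject, predicate, row[index]))
    return triples
```

The definition has to travel with the file somehow, and every row has to describe the same kind of thing, because the meaning hangs off the column position and not off the data itself.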
There are only two reasons to publish something like CSV, Excel or XML. One is that you already have the data in that form, so publishing is simpler. The other is that it needs to be consumed in a specific context where that format is already easily accepted.
Either of those may be a good reason to publish something that’s not Linked Data, but saying it is isn’t quite true.
Surely RDF is not the “*only* standard for describing how one resource relates to others using URIs.” It might be the only general-purpose abstract standard for that. But there are a couple dozen other standard ways to describe, in certain contexts, how one resource relates to another using URIs.
I mean, even just could be described as that. Obviously it’s a very special purpose case of limited use.
There are probably other standards that neither you nor I have heard of. Of course, one could argue that a standard nobody’s heard of isn’t very useful, and you should use the general purpose abstract standard that people have heard of.
But it seems oddly, yeah, dogmatic, to suggest that there isn’t even possibly (now or in the future?) any standard that allows one to “describe how one resource relates to others using URIs.” Use what works.
It ate my tag in my comment. I suggested that even just using an html link tag with rel equals ‘canonical’ could be described as a very limited specific context for ‘describing how one resource relates to another using uris’.
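For what it’s worth, that limited case can be sketched in a few lines of Python using the standard library’s `html.parser`. The `LinkRelParser` class, the `typed_links` helper and the sample page are hypothetical names and data made up for this illustration; the sketch only shows that a `link` element pairs a relationship name with a target URI.

```python
from html.parser import HTMLParser


class LinkRelParser(HTMLParser):
    """Collect (rel, href) pairs from <link> elements."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel = attributes.get("rel")
        href = attributes.get("href")
        # Only keep links that state both a relationship and a target.
        if rel and href:
            self.links.append((rel, href))


def typed_links(html):
    """Return every (relationship, target URI) pair found in the page."""
    parser = LinkRelParser()
    parser.feed(html)
    parser.close()
    return parser.links


# Made-up page for illustration.
PAGE = (
    "<html><head>"
    '<link rel="canonical" href="http://example.org/page">'
    "</head><body></body></html>"
)
```

Here the page is the implied subject, `canonical` is the relationship and the `href` is the target, so it does describe one resource relating to another, just in a single fixed context.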
[...] also liked the simplicity with which Alan Dix and Elliot Smith responded to Rob Styles’ ‘Paul Miller is right… and so is Ian Davis,’ writing; “Surely the critical issue is whether the semantics are available, not whether they [...]