SKOS, Linked Data and LCSH!

The inimitable Ed Summers has been working inside the Library of Congress, building examples and demonstrators of how LC could be getting themselves into the semantic web, the linked-data web.

It appears he’s got fed up of waiting for the support, permission and infrastructure he so richly deserves to get this data out there and he’s been and gone and done something smart outside.

lcsh.info is now a home where you can find a copy of the Library of Congress Subject Headings available in SKOS.

This is a great piece of work and fits in perfectly with the work I’ve been doing on Semantic Marc.

After much discussion with Ed, he’s provided two URI schemes: the primary scheme is based on the LC Control Number, and the second is based on the natural language term of the heading.

So, the LCSH/SKOS URIs for Beer (a subject close to my heart) are:

http://lcsh.info/label/Beer which currently redirects to http://lcsh.info/sh85012832

The concept URIs then do content negotiation to return either RDF or HTML representations.
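As a rough illustration of content negotiation against the concept URI above, the sketch below builds (but does not send) two requests that differ only in their Accept header. The helper name `build_request` is made up for this example, and the two media types are assumptions about what the server offers.

```python
from urllib.request import Request

CONCEPT_URI = "http://lcsh.info/sh85012832"  # the Beer concept URI from this post


def build_request(uri: str, media_type: str) -> Request:
    """Return an unsent GET request asking for one representation of `uri`."""
    # The Accept header is how a client states which representation it wants.
    return Request(uri, headers={"Accept": media_type})


# Same URI, two different requested representations (assumed media types).
rdf_request = build_request(CONCEPT_URI, "application/rdf+xml")
html_request = build_request(CONCEPT_URI, "text/html")

for req in (rdf_request, html_request):
    print(req.get_method(), req.full_url, "Accept:", req.get_header("Accept"))
```

Passing either request to `urllib.request.urlopen` would actually send it; which representation comes back is up to the server.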

The URIs based on the natural language term are something I’ve bent Ed’s ear about constantly, mainly because they make it possible to link bibliographic data into the LCSH data without the need for a lookup, so I’m chuffed to see them. However, what I badgered Ed for was wrong.

After a long discussion with Tom Heath, I now understand why my suggestion to Ed to simply redirect from the term to the LCCN-based URI was wrong: a redirect hides the relationship between the term and its control number from the data layer, leaving the meaning implicit in the HTTP conversation.

What Tom suggested, and what I hope edsu can do, is to provide a response to the term URI that explains its relationship with the LCCN URI.
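One way to picture this: instead of a bare redirect, the term URI could return a small RDF document that states the link as data. The sketch below prints such statements as N-Triples. The choice of `owl:sameAs` as the predicate, the `skos:prefLabel` triple and its `@en` language tag are all assumptions for illustration (I'm not claiming this is the property Tom had in mind), and `link_term_to_concept` is a made-up helper.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
SKOS_PREF_LABEL = "http://www.w3.org/2004/02/skos/core#prefLabel"


def link_term_to_concept(term_uri: str, concept_uri: str, label: str) -> list[str]:
    """Return N-Triples lines relating a term URI to its LCCN-based URI."""
    # Escape backslashes and double quotes so the literal stays valid N-Triples.
    escaped = label.replace("\\", "\\\\").replace('"', '\\"')
    return [
        # Assumed predicate: says both URIs identify the same thing.
        f"<{term_uri}> <{OWL_SAME_AS}> <{concept_uri}> .",
        # Assumed extra triple: the heading text as the concept's preferred label.
        f'<{concept_uri}> <{SKOS_PREF_LABEL}> "{escaped}"@en .',
    ]


triples = link_term_to_concept(
    "http://lcsh.info/label/Beer",
    "http://lcsh.info/sh85012832",
    "Beer",
)
print("\n".join(triples))
```

With statements like these in the response body, the link between term and control number lives in the data itself rather than only in the HTTP exchange.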

Great work Ed.

6 thoughts on “SKOS, Linked Data and LCSH!”

  1. I still don’t think I understand why it’s wrong.

    A given authorized heading in the LCSH file can correspond to one and only one LCSH authority file. (Granted sometimes a random string will refer to NONE, but if it is an LCSH preferred heading, it can correspond to only one, yes?)

    If this is right, then I think those preferred terms ARE essentially identifiers. If flawed ones. They are alternate identifiers. As evidenced by the fact that most of our legacy systems get by by _treating_ them as identifiers. The way you look up all records attached to a given subject is only by that authorized term. That IS the legacy identifier. So I think you were exactly right in your motivation to want ed’s system to respond to those identifiers.

    So if the preferred heading is an alternate identifier for the very same LCSH record–what’s wrong with a redirect? You certainly wouldn’t want the redirect for any “lead in” terms, these are not identifiers, they’re just synonyms. But the authorized term? I say it’s an identifier for the record. The LCSH term is too. They both are. It’s not unusual to have more than one identifier for the same object, is it?

    To the extent that ed’s system really prefers you use one to the other–I think that preference IS expressed by the fact that the authorized heading identifier does an HTTP redirect, but the LCCN identifier does not.

  2. This is a bit over my head, but if I was guessing, I’d say Rob is probably suggesting that the server explicitly say that the redirect URI is an alternate identifier. In RDF proper, for example, you’d do that using owl:sameAs.

  3. Thanks for the shout out. I did get around to taking Tom’s suggestion to provide a lookup technique that preserves the URI for concepts.

    So instead of: http://lcsh.info/label/Beer followed by a redirect there is now a solr powered search http://lcsh.info/search?q=Beer Someday there will prolly be a SPARQL endpoint, but not just yet I don’t think. I’d actually like to see what the data looks like in a Talis store :-)

    You ought to be able to use content negotiation to fetch application/rdf+xml and application/json for search results if you want. The HTML results are actually RDFa.

    Also, I’ve switched over to using hash URIs instead of slash URIs for identifying concepts. So no more 303 redirection. I don’t think I fully understood how hash and slash options could be effectively combined until I read the latest Cool URIs for the Semantic Web.

    Anyhow, keep up the good work!
