Back in December I was very critical of the Library of Congress for forcing the takedown of the Linked Data service at lcsh.info. LoC employee, and Talking with Talis interviewee, Ed Summers had created a powerful and useful demonstration of how applying Linked Data principles to a LoC dataset such as the Library of Congress Subject Headings could deliver an open asset that adds value to other systems. Very soon after its initial release, another Talking with Talis interviewee, Martin Malmsten from the Royal Library of Sweden, made use of the links to the LCSH data. Ed was then asked to take the service down, ahead of the LoC releasing its own equivalent in the future.
I still wonder at the LoC approach to this, but that is all water under the bridge, as they have now launched their service, under the snappy title of “Authorities & Vocabularies”, at http://id.loc.gov/authorities/.
The Library of Congress Authorities and Vocabularies service enables both humans and machines to programmatically access authority data at the Library of Congress via URIs.
The first release under this banner is the aforementioned Library of Congress Subject Headings.
As well as delivering access to the information via a Linked Data service, they also provide a search interface, and a ‘visualization’ via which you can see the relationship between terms, both broader and narrower, that are held in the data.
To quote Jonathan Rochkind “id.loc.gov is AWESOME”:
Not only is it the first (so far as I know) online free search and browse of LCSH (with in fact a BETTER interace than the proprietary for-pay online alternative I’m aware of).
But it also gives you access to the data itself via BOTH a bulk download AND some limited machine-readable APIs. (RSS feeds for a simple keyword query; easy lookup of metadata about a known-item LCSH term, when you know the authority number; I don’t think there’s a SPARQL endpoint? Yet?).
On the surface, to those not yet bought in to the potential of Linked Data, and especially Linked Open Data, this may seem like an interesting but not necessarily massive leap forward. I believe that what underpins the fairly simple functional user interface they provide will gradually become core to bibliographic data becoming a first-class citizen in the web of data.
Overnight the URI ‘http://id.loc.gov/authorities/sh85042531’ has become the globally available, machine- and human-readable, reliable source for the description of the subject heading ‘Elephants’, containing links to its related terms (in a way that both machines and humans can navigate). This means that system developers and integrators can rely upon that link to represent the concept itself, however they choose to describe it locally. This should make it easier for disparate systems and services to share concepts, and therefore understanding, which is one of the basic principles behind the Semantic Web.
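To make the "machines can navigate it too" point a little more concrete, here is a minimal Python sketch of how a client might ask for a machine-readable description of that URI using HTTP content negotiation. It is an illustration, not a description of the LoC's documented API: the `application/rdf+xml` media type is an assumption based on common Linked Data practice, the helper names `build_concept_request` and `fetch_concept` are made up for this example, and the URI is simply the one quoted above, which may not resolve in the same way over time.

```python
from urllib.request import Request, urlopen

# URI quoted in the post for the 'Elephants' subject heading.
ELEPHANTS_URI = "http://id.loc.gov/authorities/sh85042531"


def build_concept_request(uri, media_type="application/rdf+xml"):
    """Build (but do not send) a GET request for a machine-readable form.

    The Accept header is how a Linked Data client usually says which
    representation it wants; the default media type here is an assumption.
    """
    return Request(uri, headers={"Accept": media_type})


def fetch_concept(uri, media_type="application/rdf+xml", timeout=10):
    """Send the request and return (content_type, body_bytes).

    Hypothetical helper: it needs network access, and the server is free
    to answer with a different representation than the one requested.
    """
    request = build_concept_request(uri, media_type)
    with urlopen(request, timeout=timeout) as response:
        return response.headers.get("Content-Type"), response.read()


if __name__ == "__main__":
    # Only inspect the request here; call fetch_concept() to go online.
    request = build_concept_request(ELEPHANTS_URI)
    print(request.get_method(), request.full_url)
    print("Accept:", request.get_header("Accept"))
```

The point of the sketch is that a local system only needs to store the URI. Whatever label or description it wants to show can be looked up from the source when needed, rather than copied and maintained locally.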
This move by the LoC has two aspects to it that should make it a success. The first is technical. Adopting the approach, standards, and conventions promoted by the Linked Data community ensures a ready-made developer community to use it and spread the word about it. The second is openness. Nobody will have to ask “is it OK to use this stuff?” before taking advantage of this valuable asset. Many in the bibliographic community, who seem to spend far too much time on licensing and logins, should watch and learn from this.
A bit of a bumpy ride to get here, but nevertheless a great initiative from the LoC that should be welcomed, and one that I hope they and many others will build upon in many ways. Bring on the innovation that this will encourage.
Image from the Library of Congress Flickr photostream.