‘Open Data is not the point’? Oh yes it is.
Over on One Big Library, Dan Chudnov provides a thoughtful post in response to our Open Data podcast. Although notionally on holiday, Richard beat me to a response this morning… and set my stomach rumbling with that picture. I won’t go back over the points that Richard has already covered, but a couple of areas seem to merit a separate response:
“If what the gushing hints at – and I tend to hear it as something similar to what I mean by the ‘bibliographic backplane,’ linking everything we do and own and share up seamlessly and having it all grow in response to our use of the system as participatory feedback and all of that web 2.0 magic – it will be because our society moved in a direction whereby we chose to build a bibliographic backplane. And because that move coincided, eventually, with all of these other factors.”
The notion of a ‘bibliographic backplane’ is a powerful one, and goes a long way toward explaining why Casey’s announcement is an important part of an important trend. There is value to those libraries that already buy (or otherwise acquire) Library of Congress data and subscribe to various OCLC services in order to ease their bibliographic workflows. For many of them, though, this value may not be readily apparent, as the payments for these services simply disappear from some ring-fenced account every year as part of their ongoing operational budget. Whether the big pool of data upon which they draw is paid for by some other arm of their organisation or acquired Openly matters little to many of those actually putting the data to work in their catalogue today. That is not to say that they don’t want open data, or that open data is not a better way to proceed. It simply recognises the reality that changing the financial model doesn’t immediately have an impact on their world.
Casey’s premise is that smaller institutions should be entitled to access bibliographic data – and software – previously beyond their means, and it is this that he sees as the important issue here. Good luck to him on that.
Returning, then, to Dan’s bibliographic backplane, I personally see this potential set of use cases as the game-changing ones. Bibliographic data powers library catalogues around the world. Those in the know fire arcana at ‘Z targets’ and ‘SRU interfaces’. Everyone else maybe looks up their local catalogue from time to time, to discover whether the book they want is available. Library data – to all intents and purposes – is not ‘of the web’. Outside of a select group, integration of library information into the conversation is reduced to the level of pointing people at a catalogue and suggesting that they look a book up. We all do it. We all, when referencing an item likely to be held in a library, tend instead to point those with whom we are speaking to Amazon. We do this because it’s easy, and because of the added value to be found there, above and beyond the basic bibliographic description. We do this. Why not do this, and leverage information from libraries that hangs off the basic unit of bibliographic description to offer any number of user-facing orchestrations like this one (just look at that URL…)? Why not offer a basic URI-accessible record via something like the Platform, making it easier for everyone else to talk about it, to work with it, and to add value to it, either separately or (better?) together?
We have a vast quantity of high quality data. We’re wasting it. Let’s stop. Opening up access to the data allows existing stakeholders to do some interesting things. It also creates opportunities for whole new groups of beneficiaries to do things we haven’t thought of yet. Dan suggests that this needs a move on the part of society. It absolutely does, and that move is already happening. It also requires the technical shifts that we’re seeing, and an act of will on the part of the gatekeepers. The Library of Congress, although the focus of much of the discussion because it is from them that Casey is obtaining his data, is far from the worst offender here. Their data are already reasonably accessible, especially if you reside in the United States. It is elsewhere in the sector that we see a confusing mass of relationships, agreements, contracts and obfuscation causing all too many to draw back from participation and sharing. We could sit back and do nothing. We could set up a raft of committees and working groups to pick through the issues. Or we could take a leaf out of Alexander’s book as I more or less suggested in the closing minutes of the podcast.
Dan concludes:
“If you’ve read all of this, well then, here’s the good part: I’ve got this satellite imagery here that’ll fundamentally change the way you look at maps and how all of us see the world, and I’ll sell it to you, CHEAP…”
Isn’t that (almost) the point? Map data has been around for a very long time. Satellite imagery has been around for quite a long time. Even in countries like the US, where federally funded data were notionally free, third parties were heavily involved in adding value to the basic cartographic units. The explosion in usage that we see today required a change in attitude and the involvement of Google et al. We went from map data that was either essentially lousy but free, or better but costly, and from ridiculously expensive satellite imagery, to an explosion in the use of both, essentially free at the point of use.
Dan is shown, left, next to Mike Giarlo in this picture by David Fiander. The picture is available from Flickr under a Creative Commons Attribution Non-commercial No Derivatives licence.