Panlibus Blog

Archive for December, 2006

‘Open Data is not the point’? Oh yes it is.

Over on One Big Library, Dan Chudnov provides a thoughtful post in response to our Open Data podcast. Although notionally on holiday, Richard beat me to a response this morning… and set my stomach rumbling with that picture. I won’t go back over the points that Richard has already covered, but a couple of areas seem to merit a separate response;

“If what the gushing hints at – and I tend to hear it as something similar to what I mean by the ‘bibliographic backplane,’ linking everything we do and own and share up seamlessly and having it all grow in response to our use of the system as participatory feedback and all of that web 2.0 magic – it will be because our society moved in a direction whereby we chose to build a bibliographic backplane. And because that move coincided, eventually, with all of these other factors.”

The notion of a ‘bibliographic backplane’ is a powerful one, and goes a long way toward understanding why Casey’s announcement is an important part of an important trend. Whilst there is value to those libraries that already buy (or otherwise acquire) Library of Congress data and subscribe to various OCLC services in order to ease the flow of their bibliographic workflows, for many of them this value may not be readily apparent as the payments for these services simply disappear from some ring-fenced account every year as part of their ongoing operational budget. Whether the big pool of data upon which they draw is paid for by some other arm of their organisation or acquired Openly matters little to many of those actually putting the data to work in their catalogue today. That is not to say that they don’t want open data, or that open data is not a better way to proceed. It simply recognises the reality that changing the financial model doesn’t immediately impact in their world.

Casey’s premise is that smaller institutions should be entitled to access bibliographic data – and software – previously beyond their means, and it is this that he sees as the important issue here. Good luck to him on that.

Returning, then, to Dan’s bibliographic backplane, I personally see this potential set of use cases as the game-changing ones. Bibliographic data powers library catalogues around the world. Those in the know fire arcana at ‘Z targets’ and ‘SRU interfaces’. Everyone else maybe looks up their local catalogue from time to time, to discover whether the book they want is available. Library data – to all intents and purposes – is not ‘of the web’. Outside of a select group, integration of library information into the conversation is reduced to the level of pointing people at a catalogue and suggesting that they look a book up. We all do it. We all, when referencing an item likely to be held in a library, tend instead to point those with whom we are speaking to Amazon. We do this because it’s easy, and because of the added value to be found there, above and beyond the basic bibliographic description. We do this. Why not do this, and leverage information from libraries that hangs off the basic unit of bibliographic description to offer any number of user-facing orchestrations like this one (just look at that URL…)? Why not offer a basic uri-accessible record via something like the Platform, making it easier for everyone else to talk about it, to work with it, and to add value to it, either separately or (better?) together?

We have a vast quantity of high quality data. We’re wasting it. Let’s stop. Opening up access to the data allows existing stakeholders to do some interesting things. It also creates opportunities for whole new groups of beneficiaries to do things we haven’t thought of yet. Dan suggests that this needs a move on the part of society. It absolutely does, and that move is already happening. It also requires the technical shifts that we’re seeing, and an act of will on the part of the gatekeepers. The Library of Congress, although the focus of much of the discussion because it is from them that Casey is obtaining his data, is far from the worst offender here. Their data are already reasonably accessible, especially if you reside in the United States. It is elsewhere in the sector that we see a confusing mass of relationships, agreements, contracts and obfuscation causing all too many to draw back from participation and sharing. We could sit back and do nothing. We could set up a raft of committees and working groups to pick through the issues. Or we could take a leaf out of Alexander’s book as I more or less suggested in the closing minutes of the podcast.

Dan concludes;

“If you’ve read all of this, well then, here’s the good part: I’ve got this satellite imagery here that’ll fundamentally change the way you look at maps and how all of us see the world, and I’ll sell it to you, CHEAP…”

Isn’t that (almost) the point? Map data has been around for a very long time. Satellite imagery has been around for quite a long time. Even in countries like the US, where federally funded data were notionally free, third parties were heavily involved in adding value to the basic cartographic units. For the explosion in usage which we see today, it required a change in attitude and the involvement of Google et al. We went from essentially lousy but free or better but costly map data and ridiculously expensive satellite imagery to an explosion in the use of both, essentially free at the point of use.

Dan is shown, left, next to Mike Giarlo in this picture by David Fiander. The picture is available from Flickr under a Creative Commons Attribution Non-commercial No Derivatives licence.

Let them eat cake

The famous phrase “Let them eat cake”, attributed to Marie-Antoinette, came to mind when reading dchud’s One Big Library posting Open Data is not the point. The passage that conjured up my thought was this one:

LC bibliographic data is not exactly being held captive. Anybody can go buy a copy of this data now right from LC or from third parties today. The cost of this data is not in any way prohibitive for a medium- to large-scale institution that is already used to doing Big Deals in the six and seven figures.

So what if you are not a medium- to large-scale institution that is already used to doing Big Deals in the six and seven figures?

To me that seems a little bit of a barrier to entry for those outside that exclusive club. OK, as pointed out, you can freely download individual records from the Library of Congress’ Z39.50 and SRU servers – fine if that’s all you want, but no good if you want to mine or analyze the data – or put it to use in wholly new ways outside the boring ILS.
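To make the “individual records” point concrete, here is a minimal sketch of what a single SRU lookup looks like. The base URL is a hypothetical placeholder rather than the address of any real server, and `build_sru_url` is a name invented for this illustration; the query parameters (`version`, `operation`, `query`, `maximumRecords`) are those defined by SRU 1.1, and the query itself is written in CQL.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: substitute the base URL of a real SRU server.
SRU_BASE_URL = "https://sru.example.org/catalogue"


def build_sru_url(cql_query, max_records=1, base_url=SRU_BASE_URL):
    """Return a searchRetrieve URL for one CQL query (SRU 1.1 parameters)."""
    params = {
        "version": "1.1",
        "operation": "searchRetrieve",
        "query": cql_query,                  # CQL, e.g. dc.title = "..."
        "maximumRecords": str(max_records),  # cap on records returned
    }
    return base_url + "?" + urlencode(params)


# One request, one small handful of records.
url = build_sru_url('dc.title = "tipping point"')
print(url)
```

Each such URL answers one question about one title. That is fine for a lookup, but it shows why record-at-a-time access is no substitute for having the whole data set to hand when you want to mine or analyse it.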

The key benefits of opening up your data are usually unexpected, things like Liveplasma appearing on the back of Amazon’s data.

In a comment to his post he clarified his thoughts a little – this isn’t some landmark tipping point – people getting excited about this maybe being some kind of tipping point are framing the discussion myopically. The open distribution of the LC bib, and hopefully authority, records is not a tipping point.

The opening up of this data is the start of a process that could lead to a tipping point. Once the LC data is open and freely accessible, it will become more and more untenable for others to restrict access to their equivalent silos of similar data. Once it becomes the norm to be able to access and aggregate data from many of the national libraries and similar institutions, we will then hit a tipping point. A tipping point which will bring a bibliographic backplane closer to realization.

Some could see the recent discussion as criticising the hard work of cataloguers over decades. Far from it. The Library of Congress, OCLC, Talis, and librarians around the world have delivered real and lasting value with their careful nurturing of a rich legacy of inter-connecting catalogue records. It’s just that with a bit of brave and lateral thinking (and action), changes in attitudes and technologies make it possible for us to truly realise the potential of all that data; by setting it free for ourselves and others to use and to build upon.

(Pan photo taken by hfb displayed in Flickr)

An Interview with Casey Bisson

EducauseConnect is carrying a 13-minute podcast interview with Mellon Award winner Casey Bisson, Plymouth State University Information Architect, recorded at the CNI Fall Task Force Meeting.

An interesting listen giving an insight into his thinking around software for the use of students, Web 2.0, and the Mellon Award, but no mention of the use of the award money to purchase and then openly distribute Library of Congress catalog records. Something that has attracted much attention and was covered in the latest Library 2.0 Gang podcast.

Casey is being very quiet on this – not having second thoughts I hope.

Data – are the chains beginning to break?

Earlier this week, Richard blogged about Casey Bisson’s award from the Mellon Foundation, joining Tim Spalding, myself and others in speculating about the announcement that Casey intends to use his award to purchase and distribute bibliographic data from the Library of Congress.

On Wednesday night, I sat down with Richard, Tim, Ross Singer and Rob Styles to have a chat about some of the implications, and the resulting conversation is now available as a podcast for your listening pleasure.



Listen Now or Download MP3 [57 mins, 39 Mb]

Casey’s doing some great work, but to a degree the purchase and dissemination of this data could be seen as simply working within the existing system. The existing system is actually fundamentally flawed, and needs some concerted pressure to ‘persuade’ the incumbents to change.

With a commercial model dependent to some degree upon revenue from our UK-focussed Talis Base and UnityWeb services, Talis could be seen as one of those incumbents with a pretty substantial vested interest in the status quo. However, we recognised that things had to change, and set about doing something about it. When the UnityWeb contract came up for renewal last year, we quite deliberately stated our intention not to re-tender. Instead, we offered Source. We swept away the old model of subscription access into a closed club, and instead offered an indication of what was possible if you were prepared to think differently. Free contribution of data by anyone who wants to share it. Free access to that data, governed by the terms of a flexible and permissive license designed to protect the rights of the contributing library rather than Talis.

And even Source is an intermediate step – a small move in the right direction whilst remaining sufficiently familiar that there is zero pain for those users of UnityWeb making the transition. The Platform with which the data for Source are shared is capable of so much more, as it consumes large bodies of data, indexes them rapidly, robustly and scalably, and offers those data up for use, reuse, and orchestration via a suite of enterprise-strength web services. Project Cenote shows yet another view onto those same data, rapidly assembled to demonstrate the ease with which an ‘OPAC’ might be placed atop the Platform by staff at Talis… or by anyone else who wants to. Richard’s work with Greasemonkey and browser extensions, too, leverages the same power. Share data – once – with the Platform. Then consume it in your own applications, see it in Cenote, see it in Google, see it in Aquabrowser, see it in Amazon. See it sliced, diced and grouped in ways that make sense to your users rather than just to you. See it anywhere you want it seen. Easily, Reliably, Affordably, Repeatably.

These ideas can clearly be seen to work. They work even better, for even more institutions, when the underlying catalogue data held by those institutions around the world can be exposed by them and shared with a common set of infrastructural services such as those found on the Platform.

Whose interests are served by obstructing the free movement of these data? On its own, the value of an individual catalogue record is surely low. The value lies in what you do with these records in aggregate, not in the string of numbers and letters comprising the record itself.

Whose interests are served by obfuscating the picture with contracts and legalese of questionable morality… and even more questionable enforceability?

Roll on 2007 – the Year we all wake up and put our data to work? Put it in the Platform. Put it on your site. Do what you like. Just let it out of its box and invite people to do something amazing, something interesting, or just something a little bit useful to one particular individual. At Talis, we’re committed to Open Data, and we’re hard at work to ensure that the data for which we are the custodians is opened up and shared with all of you.

‘cadena rota’ image shared on Flickr by trackrecord, with a Creative Commons Attribution Non-Commercial No Derivatives licence.

Interview with a Delicious Monster

With Delicious Library 2 due for release some time soon, the brain behind it, Wil Shipley of Delicious Monster, is interviewed by Infinite Loop.

IL: Tell us about what new features we can expect in Delicious Library 2.

WS: Well, without making any explicit promises, what we’ve got so far is allowing MUCH, MUCH larger collections (thanks to using a real SQL database as our file format, instead of XML), much snazzier and snappier graphics, smart shelves, a couple other features we haven’t announced yet, and a ton of little bug fixes and tiny features that just make the app feel sweeter. Plus, a new interface, to match what Apple’s doing with its latest iApps.

One big Library

Over on the Talis Forums there has been a discussion thread running for the last few months entitled Dream OPAC. In a forum area inhabited mainly by users of Talis Systems it is inevitable that some of the wishes expressed are slanted towards Talis products, but some of the wishes definitely express the desire for a better user interface for the library user of systems from any vendor, or none.

“Go on. tempt me – I want to see images and video, I want to hear sound clips, I want colour and movement and interaction. I want to read what other readers think about what they’ve read and I want to be able to contribute my own opinion. And when I find a mistake in the catalogue I want to be able to point it out – and perhaps I want to add information too. Our users are an intelligent lot (well, some of them are) so why not get them involved? Prism is so dull and boring. Take it or leave it – what would you choose to do?

“The interface needs to be much more customisable – we should be able to drop the search function into any html page, and have complete control over the formatting of search results and full record displays.

“Books on the same shelf – To browse a shelfmark index. This might seem like going backwards in functionality, as we already have a same subject class-based search which is more comprehensive (and probably not much used), but I wonder if this might not be more intuitive to grasp and prove more popular – perhaps we could link from it to shelves with the same class elsewhere in the library. Naturally we might get some flak from readers who complain that the books aren’t actually on the shelf where they are supposed to be (not that that would happen in our library of course).”

On the subject of the wishes of librarians, take a look at NGC4Lib (Next Generation Catalogs for Libraries) mailing list – archives here. There is a fascinating debate going around many aspects of where the LMS/ILS should be heading. Today Amy Ostrom started a new thread with the wonderful name of ‘Laundry list for NGC (long post) ‘. Her opening paragraph is refreshingly pragmatic:

I have not been able to keep up with all the posts, but it seems no one will just create a substantial list – too much theory and questioning/doubt behind everything. I don’t know about anyone else, but I am myself an end user, and I have a LOT of things I would love to see. I don’t care if it is done in collaboration with Amazon, or Worldcat, or any organization, but this is what I want. I hope this proves beneficial. (Apologies in advance for a long post.)

Amy has a good point: there are far too many email and blog inches wasted in discussing which particular consortium/interest group/vendor/service should be the prime contractor/controller for delivering the library of the future – or whether Open Source is the silver bullet. Time will probably tell that the final solution(s) will be a combination of all of the above. The sooner everybody stops trying to protect their own special interest group and seriously starts cooperating, the sooner it will happen. Quite frankly the end users don’t care – they just want libraries to start competing with the rest of the web, and soon. Amy’s sure is a long post, and it is well worth joining the list to read.

The following comment later in the thread, from Gail Richardson, caught my eye:

I hate that each library does whatever slightly differently. I think it would be great if I could just go to one giant libary site, search and discover, then finally link to my local library where I can place a hold and pick it up.

Too right Gail. Why should you have to use, and by definition have to get the hang of, a new interface just because you are searching a different library? It would be like Google having a different user interface experience for each City.

I was about to respond to her comments, when my colleague Rob Styles beat me to it:

Here at Talis we have a service called Source for ILL Librarians that is free to contribute to and free to discover holdings on. The platform services that power this are also powering the freely available Cenote. Cenote links through to libraries wherever we have the information to and we’ve based that on an open directory that anyone can use and help maintain. These things aren’t just possible; there are some folks here who want to help you do it.

Source is a UK service, but given today’s technologies, the web and other advancements there is absolutely no reason you shouldn’t get what you want globally. We just have to change the way we think about this stuff; open up, stop pretending we can “own” the data that everyone needs to share and, above all, build on a cost model more like the one Google started with.

Couldn’t have said it better myself.

(Photo taken by Heaven`s Gate (John) displayed in Flickr)

A revolutionary Bisson

Casey Bisson of the Lamson Library at Plymouth State University, and occasional contributor to the Library 2.0 Gang series of Podcasts on Talking with Talis, has won the prestigious Mellon Award for Technology Collaboration.

Apart from posting the press release announcing the award on his blog last week, Casey has been quiet about the matter – although the title of that posting Woot! Woot! gives me the impression he is quite pleased about it.

The award was presented for his ground-breaking software application known as WPopac. “It’s an OPAC – a library catalog, for my readers outside libraries – inside the framework of WordPress, the hugely popular blog management application.” It is a mashup between OPAC functionality and a blogging platform, and an example I have been using in my Library 2.0 presentations for several months.

Casey may have been a bit quiet on his award, but others have not been so quiet on the subject. Buried in one of those blogs is the following tantalizing morsel of information:

The revolutionary part of the announcement, however, was that Plymouth State University would use the $50,000 to purchase Library of Congress catalog records and redistribute them free under a Creative Commons Share-Alike license or GNU. OCLC has been the source for catalog records for libraries, and its license restrictions do not permit reuse or distribution. However, catalog records have been shared via Z39.50 for several years without incident.

Run that past me again – purchase Library of Congress catalog records and redistribute them free – now that’s a radical step. A step that initially almost snuck under the library blogosphere radar. Others, such as Tim Spalding, have now picked up on this – as he says, ‘So, three cheers for Casey, Mellon and “free as in freedom”! I can’t wait to see where this all leads.’

There has been much discussion on many mailing lists about freely distributing catalogue records, and maybe using them to seed a collaborative cataloging exercise.

This announcement could be the beginnings of a major shift in the way such records are obtained, modified, added to, distributed and used. I would quibble with the licensing that is being suggested – Creative Commons Share-Alike license or GNU. I suspect that detailed analysis will show that neither of these is ideal for sets of bibliographic records. Looking at the available open licensing possibilities in this area led to us [at Talis] proposing, and offering to the community for discussion, the Talis Community License, which may well be [or be the basis for] a better solution in this area.

Another part of the announcement on which I would welcome clarification is the way these records are to be made available – will it be via WPopac only, or does Casey have a Search/Web service in mind?

I won’t be the only one who will be watching this with great interest – maybe the police car behind Casey and his sign in the picture above had more meaning than was initially apparent when Jenny took it a few months back ;-)

(Photo taken by The Shifted Librarian displayed in Flickr)

John Battelle ponders a changing landscape

John Battelle, author of The Search, offers an interesting analysis of recent wobbles in Big Media land, attempting to understand the realities that led to a number of high profile departures from giants such as TimeWarner, NewsCorp and CBS.

After some analysis, John proposes that;

“There are two major forms of media these days. There is Packaged Goods Media, in which ‘content’ is produced and packaged, then sent through traditional distribution channels like cable, newsstand, mail, and even the Internet. Remember when nearly every major media mogul claimed that the Internet was simply one more media distribution channel? They were right, but only in so far as it pertains to Packaged Goods Media. Over the past few decades, massive media conglomerates have built on the deep DNA of Packaged Goods Media.

The second major form of media, is far newer, and far less established. I’ve come to call it Conversational Media, though I also like to call it Performance Media. This is the kind of media that has been labeled, somewhat hastily and often derisively, as ‘User Generated Content,’ ‘Social Media,’ or ‘Consumer Content.’ And while the major media companies are unparalleled when it comes to running companies that live in the Packaged Goods Media world, running major companies in the Conversational Media field require quite a different set of skills, and consideration of radically different economic and business models – models which, to be perfectly frank, conflict directly with the models which support and protect Packaged Goods Media-based companies.

It seems clear to me that the folks now charged with running the interactive assets of NBC, Viacom, Time Warner, and Newscorp – four of the largest Packaged Goods media companies in the world – are charged not only with growing their own Conversational Media assets, but also with protecting the Packaged Goods Media assets of their bosses. And those assets are based on several heretofore unassailable pillars:

1. Ownership or control of Intellectual Property by the corporation.

2. Ownership or control of expensive distribution networks.

3. Established business models based on highly evolved approaches to advertising and subscription models.

Each of these three pillars – and I may stumble upon others as I keep thinking out loud – seem to be either irrelevant or significantly shifted in the world of Conversational Media.”

Some of the words are used differently, and some of the emphasis may need tweaking to wholly fit, but much of this argument applies to the e-problems with which we see libraries grappling today, too. The business models simply don’t work anymore, but hopefully we don’t need change as dramatic as that seen at these big commercial organisations before libraries are able to adapt and grow.

In the comments, Fred Von Lohman (twice!) or Gary Price (clearly John’s having problems with his comment software…) make important points, citing that much-loved occupant of the Talis book shelf, The Innovator’s Dilemma;

“For important clues regarding the answers to the questions you raise, I’d encourage you to (re?)read Clayton Christensen’s best-sellers The Innovator’s Dilemma (perhaps better yet, his follow-on, The Innovator’s Solution).

These books make a compelling case that industry incumbents generally cannot effectively adopt disruptive innovations (defined as innovations that either cater to new markets or the low end of existing markets). This stems from the institutional incentives that incumbent companies face, suggesting that swapping executives is not an effective solution.

The only effective solution? Create separate entities that are entitled to compete with the parent – something none of these media companies has been prepared to do yet.

The rise of Conversational Media fits almost all the characteristics of a disruptive innovation as defined in Christensen’s research. I think that means Google is going to eat their lunch.”

What do the rest of you think? Do you see the parallels, and do you see the way forward?

Services to depend on

Tony Hirst, over on the OUseful Info blog the other day, gave an update on his progress developing OU Library Traveler – the Greasemonkey plug-in which he entered into the Mashing Up The Library Competition.

The Traveler is really taking shape, with title and author lookups having been added and recently links to ebook resources have also been built in.

Following discussions with the Library, the functionality has been tightened up somewhat and a greater focus placed on delivering OU Library services, with the result that things like a Google Books lookup are unlikely to appear in the official release.

It’s great to see what would, a few months back, probably have been called a toy heading towards an official release. More power to your elbow, Tony!

Later in his post, postulating how the Traveler could be developed to enable the Open University’s distributed community to identify books in academic libraries [which offer lending rights to OU students] local to them, Tony says the following:

The Talis Platform project is one tool we could start using, but as with using all ‘free’ third party services/APIs: how do we know that a) the data will be kept up to date?; and b) how do we know the service will hang around for the next few years?

Taking a baby steps approach, what I’d quite like to see is a script that will add links identified via Talis Platform to results pages like this one that in turn lead directly to the corresponding catalogue; or, even better, buttons that will let the user search the corresponding catalogue from a single search box at the top of the page…

How do we know that the data will be kept up to date, and will hang around for the next few years? – Understandable concerns. How many academic projects have bloomed, delivered useful data, then either run out of money or hit political problems, and disappeared? How many directories of Z39.50 targets have been set up by enthusiastic groups only to grow stale over time as the members of the group lose enthusiasm or move on to other things? How many people are concerned about the currency and life expectancy of the data and services provided by Amazon Web Services?

I suspect that the answer to the Amazon question may be coloured by issues around having to route users to Amazon’s sites, to satisfy their licensing conditions, but not by the dependability of the service. Why is that?

To use that particularly distasteful but appropriate bit of current business-speak, Amazon eat their own dog food. If the services providing Amazon Web Services go down, Amazon goes down. It is a commercial imperative for Amazon to keep the engines running.

So why would services/APIs provided by the Talis Platform be worth depending upon? Because we have built services upon our own Platform and are continuing to build more. Talis Source is a service providing free resource contribution and discovery to the Interlending Community, along with optional low-cost request management. Since its launch last spring, all measures of the size of Source have grown dramatically. Talis Source is built upon Talis Platform Services, and all contributions that appear in Source are contributed to the Platform.

Project Cenote uses Platform services, as do our own Greasemonkey plug-ins Amazon@Libraries and LibraryThingThing. These are just a few obvious examples of where it is an imperative for Talis to keep the Platform services current, scalable, and available. These will be joined soon by many other examples both large and small. We and others are starting to build more and more on the Platform.

As with the adoption of any new technology/service/business model, there is going to be some reticence from many before trying it, but it is also true that the early adopters often gain the most by being such.

Finally, to address the concerns about data and commerce that I raised by mentioning Amazon as an example, I draw your attention to the Talis Community License, which is designed to preserve access to data freely contributed.


And the Answer was/is ….

28th November – a door closes “Adieu to Google Answers” the Official Google Blog bids farewell to Google Answers – as Lazytechie says:

Google has shut down their google answers service .But i wonder why did they shut down their service .May be it might not be very popular ,but it was very informative , had quality and google brand name.In my opnion they should have convert it to free service .

Predictably the Yahoo! Answers team invite the former Google Researchers to join them.

Back at the end of September Microsoft opened up their beta equivalent service Windows Live QnA to the public. They seem to be putting more of an emphasis on the community side of answering questions. QnA community members have fun and help each other by asking, answering and voting on questions. – democracy hits the field of providing answers to questions that searching cannot satisfy!

So as one paid for service closes another free and community driven alternative joins the field – time will tell who got it right.

Being from the library world in the UK, I wouldn’t be doing my duty if I didn’t reference Enquire, the collaborative service offering “a live question and answer service available 24 hours a day, every day”, which has been successfully run by the MLA and about 100 UK public libraries for several years.