Panlibus Blog

Archive for July, 2006

A randomly great source of ideas

Dave Pattern of the University of Huddersfield has come up with a great tool for generating new Library 2.0 ideas:

The Library 2.0 Idea Generator

If you have a few spare moments, pop over there and generate a few.

Some are more extreme than others, as the image shows, but they are nevertheless proof that there is more to this Library 2.0 thing than initially meets the eye!
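The mechanics behind a generator like this are simple enough to sketch. Here is a hedged Python toy along the same lines; the word lists are my own inventions for illustration, not the vocabulary Dave actually uses:

```python
import random

# Illustrative word lists -- not taken from Dave Pattern's actual generator.
ACTIONS = ["tag", "blog about", "podcast", "wiki-fy", "mash up"]
THINGS = ["the OPAC", "your reading list", "the reference desk", "overdue notices"]
TECHS = ["RSS", "AJAX", "folksonomies", "SMS", "Google Maps"]

def generate_idea(rng=random):
    """Assemble one random Library 2.0 idea from the word lists."""
    return "Why not %s %s using %s?" % (
        rng.choice(ACTIONS), rng.choice(THINGS), rng.choice(TECHS))

print(generate_idea())
```

Swap in your own word lists and the combinations multiply quickly, which is rather the point of the exercise.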


Draft JISC strategy out for comment

The Joint Information Systems Committee (JISC), historically the driver of so much innovation in the UK’s engagement with digital libraries, e-learning, and the like, has released a draft of their 2007-09 strategy for comment.

Comments are due by 25 September, and could play a role in shaping the way in which hundreds of millions of pounds are spent over the next three years.

Like so many other organisations, JISC increasingly recognises that we operate in a global environment, so don’t let being on the other side of the water put you off responding…


A great opportunity to understand social tagging?

Listen to the Library 2.0 Gang

Speaking on last night’s Library 2.0 Gang podcast, LibraryThing’s Tim Spalding made a great offer, one I’m sure people would love to jump at.

We were talking about folksonomies and tagging, and LibraryThing is an example of a site in the library space where tagging has really taken off; Tim reported that there are more than 6,000,000 (non-unique) tags in his system.

Tim pointed out that analysis – even from Library & Information departments at universities – tends to focus upon sites such as Flickr. He’s willing to facilitate research into the rather different way in which LibraryThing members apply tags to their books, and there would presumably be plenty of scope to compare this to the formal ‘tags’ applied to items by cataloguers.

It sounds like an interesting opportunity, and I for one look forward to reading the results.


Things work differently in the new world

PodTech’s Robert Scoble had an interesting post over the weekend, in which he identifies some of “the common things… developers all need” in the new world we are building.

“They need a freaking fast distribution platform. Er, a set of server farms around the world. Why? Well if that little Internet component that Chandu’s thinking of slows down my blog I’m going to get rid of it. And so will every other user around the world. Delivery speed is job #1 in this new world. It better work in London, Chennai, Tokyo, Shanghai, and Cape Town, the same way it does in San Francisco.

They need a shitload of storage space. Yes, that’s a technical term. :-) You try crawling 100 million blogs and see what kind of index it builds for you. Let’s just round up to “a terabyte.” Can you afford to buy a terabyte in storage space to scratch your developer itch? Chandu can’t.

They need an API. Something simple to spit data in, and suck data out. REST seems to be the one of choice lately.

It needs to be cheap. Um, free if possible. At least if you want Chandu to be able to build it, deploy it, and have it survive its first exposure on DIGG. If Chandu starts making revenue then you can get him to give you a cut, but the startup costs need to be near zero so that the developer “itch” can be scratched. Guys like Chandu (and most of the other geeks I know) don’t have much money to buy access to services.”

I’d agree, and would specifically add something that the post refers to a few times: “a shitload (to borrow Robert’s technical term) of data”. Ideally with someone else to worry about collecting, organising and storing the stuff.

It’s not surprising, either, that Microsoft’s Chandu Thota identified Amazon’s S3 as the space to pull all of the pieces together right now. Seriously, where else could you go today for all the bits?


Looking forward to WorldCat.org


Those who don’t know me very well may be surprised by this, but I’m looking forward to the arrival of worldcat.org. I think it’s a visible step along the road to unlocking the chains with which this community has shackled its data for far too long. It’s the kind of thing I’ve agitated for over a very long time, stretching back into the mists of time before Lorcan Dempsey and I were at UKOLN. Way to go, OCLC.

Alane Wilson writes about it over on It’s All Good, pointing to a write-up by Paula Hane in Information Today, and to an earlier blog post of mine raising concerns about their closed silos of information.

Although undeniably a big step forward from the current Open WorldCat arrangement, there remains much to do, and a long way to travel before the antiquated, unnecessary and stifling data shackles are gone for good. Before getting down to the details, maybe a brief diversion into allegory would help?

Once upon a time, an enterprising farmer in Dublin recognised that his peers could not afford to look after all their grain. Working with those around him, Farmer O’Shelsey built huge silos on his land and rented out space in them to farmers in the surrounding area. Over time, the grain storage business grew and grew, and Farmer O’Shelsey gave up farming to concentrate upon collecting grain from further and further afield. A growing number of farmers saw the value in a safe store for their grain, and paid a lot of money every year to ship truckloads of the stuff to Ohio. The O’Shelseys recognised the value of the grain they held, and used money they received from farmers to build a huge wall around the silos. One of the younger O’Shelseys, a keen horticulturalist, spent some of the family fortune to plant many beautiful flowers around the base of the silos, and the family made yet more money charging admission to the Walled Garden for drivers bringing grain lorries from far-off farms.

Times changed, though, and it became increasingly feasible for grain stores to be built cheaply and effectively at each farm, and for farmers to work together to find completely new markets for their grain. O’Shelsey’s old silos and complex infrastructure were geared towards gathering all of the grain in Dublin, and shipping it from there to those willing and able to buy it. They simply couldn’t cope with new demands.

Facing calls to ‘open the silo’, the latest member of the family, Jay O’Shelsey, had a brainwave. He sent one of his team climbing up the outside of the tallest silo, where he cut away part of the wall and replaced it with a window. Now, they argued, everyone could see the grain they were paying to have stored in the silo. Going a step further, and fully embracing the new-fangled InterWeb, Farmer Jay pointed a webcam at the window, and set up a web site. Farmers could now visit the website and watch their grain rotting before their eyes on the screen. Those farmers who still clamoured to get their grain out of the silo so that it could be used in new and interesting ways (like growing crops) were offered a further innovation; Farmer Jay would use some of the money they had paid him to give them a ‘free’ television set. This could be set up in their own fields, so that passers-by could watch pictures of the grain silo. It didn’t quite seem as good as watching crops grow, though. Some of the farmers began to grumble that Farmer Jay had found a cheap job-lot of identical televisions from WalMart, when many would prefer to be able to choose the make and model for themselves. Others continued to call for access to their own grain, and marched on Dublin chanting “Let us grow grain, not televisions.”

Outside the farming community, bread makers like Giggle and YeeHaw!, and the Amazing flour factory, began to realise that their previous grain supply deals with O’Shelsey were less useful than before. So many farmers were now storing their grain locally or in other grain stores rather than paying to ship everything to Ohio. Consumers in the shops were increasingly conscious of their health, and of their ability to assert choice. Rather than pay for produce sourced from O’Shelsey’s huge silos, they began to opt for locally produced goods, and niche products such as organic flour or GM-free bread. More efficient ways of working and significantly lower costs meant that third parties were able to offer standard grain and all these niche options to farmers and to bakers, flour factories, and anyone else who wanted them. The costs had shifted to such a degree that farmers were able to participate and contribute their surplus grain without charge. They even received some bread and other products back in return.

What could the O’Shelseys do? Their huge family cost a lot to feed and had expensive tastes. They employed an army of people to issue bills to farmers, and a second army to count every grain of grain as it arrived in Dublin. Their machinery, although new, shiny, and well-maintained, had its expensive roots in an earlier era of centralisation and control.

This next evolution in WorldCat, then, is a good step forward. But it’s still on the basis that libraries have to pay, up front, to be able to give their data to WorldCat. With that barrier, how can WorldCat ever become comprehensive? It’s also not very open. An HTML fragment that you can drop onto a web page in order to direct a search back into the WorldCat silo is one thing. A suite of open, accessible and documented APIs, capable of facilitating integration and reuse in a plethora of genuinely useful ways is something else entirely, and within reach. One use of that might very well be in providing the HTML search box, but it’s also capable of so much more.

It surely isn’t for OCLC to decide how libraries, their patrons, and the wider biblio community make use of data sourced from member libraries. Nor is it for OCLC to constrain those uses by providing all the applications themselves. A more innovative, scalable, flexible and fair approach is to provide the underlying technology components and to encourage others to build with them. Smart people in places like OCLC’s Office of Research may very well demonstrate some of the things that this approach makes possible, and OCLC may even decide to productise some of those tasters and provide them to the market for a fee. So, though, could anyone else. And libraries would be free to choose the product that best met their needs. Costs would fall. Choice would grow. Innovation would bloom.

A quick look at some of the discussion around worldcat.org suggests that I am not alone in this belief:

“Several librarians pointed out that the question of having to subscribe to WorldCat on FirstSearch is a sticky one. O’Neill commented: ‘Many libraries, like Santa Monica, have subscribed to FirstSearch for years and use it for our ILLs [interlibrary loans]. For us it’s not a problem. The State of California set up a subscription to WorldCat for California libraries a year or so ago so that less well financed libraries could offer it to their clients. SMPL had already paid for that service so we saved a little money when the State picked it up. I suspect that all libraries will have to contribute to supporting this type of subscription sooner or later—unless the State can find the funds.’”

(Paula J. Hane, “OCLC to Open WorldCat Searching to the World”, 17 July 2006)

Why should any State or library ‘have to’ pay in this way?

“A lot of metadata in library systems is not ‘Open Access.’ This makes it much harder, less efficient, and expensive to manage… But we have about 1000 ETD MARC records in WorldCat. Technically, it wouldn’t be hard to write a script that updates all of these URLs, but WorldCat is locked down. We can’t gain access to the database to automate this process. We have to use Connexion, which was designed strictly for humans to interact with. OCLC does NOT want to share its metadata for free, that is how they make their money. Although technically we could update these ETD records quite efficiently, OCLCs security apparatus prevents us from doing so.”

(Brian Surratt, “Take my metadata, please!”, 22 July 2006)

“They have to pay for the right to be in there via a subscription to WorldCat. This means that if a library does not subscribe, they don’t get a link to their holdings. While, I understand OCLC wants to make a profit (even though they are a nonprofit cooperative), this program ends up harming libraries that are not paying for a subscription”

(Edward Corrado, “OCLC to Open WorldCat searching to the world”, 17 July 2006)

So. WorldCat.org is a step forward, and I’m sure I’ll find it useful. I’ll get really excited, though, when anyone can (freely) contribute their own data, anyone can (appropriately) use and reuse the data, and there is a far more wholehearted embracing here of the reality that OCLC infrastructure is a piece in a wider and essentially uncontrollable puzzle, rather than a black hole that sucks all data and clicks to Ohio…

Forget some stop-gap ‘destination’ web site, and a few HTML search boxes. The aggregate data to which WorldCat could facilitate access have the potential to dramatically change the ways in which libraries are exposed to the information-seeking world. To get there, though, requires us to ask some hard questions, and equally requires OCLC management to make some hard decisions. Are you global or aren’t you? Are you open or aren’t you? Can you drop your current restrictive charges to contribute or truly use, and instead find ways to leverage the value of the aggregate to generate a reasonable revenue? Are you interested in seizing huge opportunities to support a sustainable explosion in the visibility and use of data from libraries, or are you applying lipstick and sticking plasters to a house of cards, whilst the breeze increases in strength all around? Open up, for real, and I’m behind you all the way, for real. Keep taking baby steps, and I’ll keep asking for more.


xISBN to become a production-quality service

In a comment on my previous posting (about how the failure of the xISBN web service provided by OCLC’s Office of Research set off a series of issues that rippled through LibraryThing and LibraryThingThing, even highlighting a defect in the current version of the Firefox web browser), Eric Hellman gave an insight into OCLC’s plans for this widely used and useful service.

I direct the business unit at OCLC that has been charged with taking the xISBN prototype and developing it into a production-quality service.

Eric also gives an indication of the terms that the service will be made available under:

The current level of xISBN service will remain free; there will be enhanced levels of service and support that we feel confident will deliver excellent value for reasonable subscription fees.

In tune with my posting: if you want to use the service as it is now, it should be free; if you want to depend on a service, you would expect enhanced levels of service and support. It remains to be seen what is meant by “reasonable subscription fees”. As would be expected, Eric doesn’t give a release date for the new service; in his place, I wouldn’t either.

When the production-quality service does become available, it will be a welcome addition to the growing set of distributed, dependable, library-specific Web Services alongside the Talis Platform APIs.


Reliable Dependency

If you have been monitoring the discussions on TDN and the LibraryThing Google Group today, you will be aware that people have been beavering away in Portland, Maine; Birmingham, UK; and Dublin, Ohio over the last 24 hours or so, madly trying to fix broken stuff.

Firstly, the very useful xISBN Web Service from OCLC Research stopped working. I first noticed this when users of the recently published LibraryThingThing Firefox extension reported that it had stopped working. After some analysis I identified the cause: the LibraryThingThing code didn’t cope too elegantly when its calls to xISBN did not return the XML it expected.

As LibraryThingThing is an extension that only functions when a web page from LibraryThing.com is displayed in the browser, I obviously needed to access LibraryThing whilst fixing and testing my code. LibraryThingThing also uses the ThingISBN web service. This is when I discovered that LibraryThing.com was also broken, and Tim Spalding was up in the middle of his night fixing that. Eventually, in the middle of my day, LibraryThing came back on air, my code was tested, and an update to LibraryThingThing that could cope with an off-air xISBN service (which it still was) was soon released.
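The fix amounts to treating the remote service’s response as untrusted. The extension itself is JavaScript, but the defensive pattern can be sketched in Python; the idlist/isbn element names here are assumed from the research service’s response format:

```python
import xml.etree.ElementTree as ET

def related_isbns(response_text):
    """Pull the ISBNs out of an xISBN-style XML response, defensively.

    If the service is off air and hands back an error page or plain
    text instead of the expected XML, return an empty list rather than
    crashing -- exactly the case the original extension mishandled.
    """
    try:
        root = ET.fromstring(response_text)
    except ET.ParseError:
        return []  # not well-formed XML at all, e.g. a plain-text error
    return [el.text for el in root.iter("isbn") if el.text]

print(related_isbns("<idlist><isbn>0596002815</isbn></idlist>"))  # ['0596002815']
print(related_isbns("500 Internal Server Error"))                 # []
```

In the real extension the same guard would wrap the ThingISBN call too, since either service can be the one that is off air on any given day.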

Just when I was starting to relax, it became clear that a known defect in the Firefox extension update code was causing some LibraryThingThing users problems; but that is another story, and one for which there is at least a temporary work-around!

So is this tale of interdependency woe a lesson in why not to use web services? Some might say it is: you wouldn’t get that trouble if you produced all the functionality yourself and ran it in house! I would disagree.
Analyzing what went wrong with what over the last 24 hours shows that the key service which started these problems is provided by the Research arm of OCLC: “As an experimental project of OCLC Research, this service is available without charge or guarantee.” My LibraryThingThing extension is an “example of mashing together Web Services” from the Talis Developer Network. LibraryThing.com is an excellent, but nevertheless Beta, site.

The first observation one might draw is that it is amazing how well these services [normally] work together, considering that there was no working together to produce the result. Each Web Service author published simple documentation that was picked up by the others and coded against. No training courses, no voluminous manuals, no synchronized development environments.

My second observation is that a tool, or service, is only useful if it is available most of the time. The Internet, especially from its early modem days, has taught us to expect the occasional glitch or web site that is off air today; people only tend to comment if it is still off air the next day. There are exceptions to this, though. If Amazon, or Google, or eBay go away for more than a handful of minutes, the discussion groups light up. It is the same for the Web Services these organizations provide. Why? Because people depend on these services to do their business; they are not Research/Beta services like xISBN. Would you build a business that depended on xISBN? Not unless it moved from being a Research service to become a reliable, supported service.

So the moral I draw from recent events is that if you are going to depend on a Web Service, it must be reliable. On the flip side, if you want people to use your Web Services, make them dependable.

Gone are the days of signing up for probably unenforceable Service Level Agreements for access to proprietary services. Take a look at the terms of use for the Amazon Web Services API; in legal terms, “consumer beware”. But in practical terms, how long would even the massive Amazon last if they could not provide a reliable, dependable service? Commercial self-interest is the best SLA you can get from any organization.

So if you are going to depend on something, make sure it is reliable, supported and dependable. That is why there was a long pause between the initial announcement and research prototypes for the Talis Platform and the release of the APIs for it. We knew that consumers of the services would want to depend upon them, so that meant production quality supported services.

If you think I’m advocating that you should only ever use established, fully robust Web Services and no Betas, you would be wrong. Apart from the core services of your application, for which you would consume services such as Amazon’s or the Talis Platform, there is a great deal of functionality that can add value for your processes and your users but is not operationally critical.

Who would complain if reader reviews or ‘who bought this also bought that’ disappeared for a few hours from the Amazon site, as long as you could still buy your books?

The Web Services/Web 2.0 world has changed people’s attitudes. Expect 100% reliability for core functionality, but for the nice-to-haves, the fun stuff, expectations are lower and Beta services are accepted. Of course, what then happens is that new things become so useful and expected that the service providing them transitions into a reliable, dependable service because so many use it. This cycle repeats itself over and over.

So what was today’s experience about? Just early days, and a step along the way.

Update: I see xISBN is back, well done in Dublin!


Hear from your library

I was listening to the latest Library 2.0 Gang podcast earlier today, which was discussing mashing up the library. John Blyberg, from Ann Arbor District Library, made a very interesting observation: everyone has iPods these days, so why shouldn’t library data be made available on the iPod? This idea is based on a fundamental principle of Web 2.0: “deliver value to users, when, where and in the form that they require it.”

He explains that we have many “conduits to access stuff; so much stuff in our library is online, and we have tools and gadgets at our disposal; us and our patrons have them integrated into our daily lives”. He mentions that this is a “hugely untapped resource; if we could figure out a way to download their holdings, their checked-out items, to their iPods, why not take advantage of it? Everyone has an iPod. It’s about looking at all of these conduits and points of entry that people use, and giving the library presence. Everyone has a different context with which they use the library, and it’s not going to always be the same; it used to be the OPAC but this is going away.”

I decided to take this idea and see what I could come up with. The obvious way to deliver data to an iPod via iTunes would be through a podcast. The data would come from the View My Account Web Services API of Talis Keystone; I just needed a way to synthesize speech to an audio file from the data that API returns. For the purposes of a simple prototype, I decided to write some software which generates an audio file containing prose auto-generated in real time from the data coming back from Talis Keystone. This gives a short summary of a borrower’s account: their name as known by the library, the number of reservations, loans, ILLs and bookings they have, and how much they owe in charges.
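The prototype itself is Java, but the prose-generation step is easy to sketch in Python. The field names below are my own placeholders, not the actual Talis Keystone schema:

```python
def account_summary(account):
    """Turn a View My Account-style record into spoken-word prose.

    The dictionary keys are illustrative stand-ins for whatever the
    real API returns; the shape of the sentence is the point.
    """
    return (
        "Hello %s. You have %d reservations, %d loans, "
        "%d inter-library loans and %d bookings, and you owe "
        "%.2f pounds in charges." % (
            account["name"], account["reservations"], account["loans"],
            account["ills"], account["bookings"], account["charges"]))

summary = account_summary({
    "name": "Jane Smith", "reservations": 2, "loans": 5,
    "ills": 1, "bookings": 0, "charges": 3.50})
print(summary)
```

Feed a string like that into any text-to-speech engine and you have the audio payload for the podcast feed.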

Here at Talis I have testing and development environments for Talis Keystone, which I can use to test the services and to create client applications which consume them. In less than an hour, using the Java Speech API and an implementation called FreeTTS, I was able to knock up a simple prototype which consumes the Talis Keystone View My Account API to produce this output file. The audio is currently very robotic, but it proves the concept: we can deliver library data in this manner to an iPod.

Once you have the audio, of course, you’re not just limited to sending it to an iPod. Why not offer an alternative to the SMS-based capabilities of our current Talis Mobile product, and include the option to have the library system phone you up and *speak* to you about routine library interactions such as the recall of a book?

A Talis Keystone environment will soon be available in a sandbox area of the TDN for other developers to try out solutions, test their ideas and create similar innovations by consuming the Web Services.

Library 2.0 Gang talks about mashups

Listen to the Library 2.0 Gang

I’ve just uploaded last night’s Library 2.0 Gang conversation to the Talking with Talis podcast site.

We talk about mashups and libraries, both in the context of the ongoing Mashing up the Library competition and reaching beyond it to consider some of the longer-term possibilities around providing meaningful access to library resources.

Have a listen, and share your thoughts on the forum.


OCLC interprets the mashup

I just posted an item to the TDN, reporting on OCLC’s “Research Software Contest”, and flagging a significant difference between their approach and that adopted by the team putting on Mashing up the Library.

It’s great to see OCLC supporting innovative approaches to library data in this way, and they certainly have some significant resources that the sector continues to benefit from. It strikes me as unfortunate – and unnecessary – that they have opted to force entrants to use OCLC resources, though. Is it one further indication of an increasingly out-dated attitude to libraries, library data, and the place of both in the wider world of information discovery, use, and reuse? Or is it just a silly mistake that they could fix before the closing date? I’m hoping it’s the latter.

Libraries do not exist in a vacuum. Nor do library data. We are surrounded by other sources of information, and we are surrounded by other ‘channels’ through which library resources should be brought to both actual and potential users.

Back in the dim and misty depths of the 1970s, there was huge value to libraries and their members in collaborative cataloguing efforts, and in the aggregation of those records for mutual benefit. The technology and processes were complex and expensive, and subscription-powered ‘co-operatives’ such as BLCMP (the precursor to Talis) and OCLC were an obvious approach to achieving effective economies of scale.

Today, the world is a very different place. Today, libraries need to be visible not only to one another, but in a host of different contexts. Today, library catalogues are interrogated by their ultimate beneficiaries rather than (just) by librarian intermediaries. Today, various local, regional and national agendas mean that library information needs to be available alongside that from other libraries in the same city, other libraries of the same ‘type’ regionally, nationally or globally, and even in the context of non-library sites such as Amazon or LibraryThing.

Every single one of those important agendas of today is significantly impeded when the visibility is derived from queries into a closed and expensive club, or a subscription-driven system such as WorldCat. Wouldn’t it be better if any library, anywhere, could contribute data about themselves free of charge, see information about their peers free of charge, and benefit from a robust and scalable Platform of services geared towards integrating library information with similar data from their peers, or plugging any of it into the non-library sites where our users really exist online? Isn’t it fortunate, then, that such a thing is possible, today?

Closed and restrictive used to work. It really doesn’t anymore, and if we as a sector don’t embrace open and inclusive, then people will simply go around us to information providers that do.

Closed silos of ‘owned’ data – bad thing

Closed competitions that try to limit creativity and innovation – also bad.

Maybe we’ll get our collective act together for next year, and run one competition to jointly champion the cause of increasing the visibility and utility of library information? I’d happily discuss the idea with OCLC, SirsiDynix, or any of the other leaders in this space. OCLC really would have to drop their silly restrictive practices, though.

I have sufficient belief in the quality and value of OCLC services like xISBN to be sure they’d figure in a significant number of entries. If I can, why can’t OCLC believe in their own stuff enough to let the market decide whether or not to use them, rather than forcing them upon competition entrants and others?
