Panlibus Blog

Archive for April, 2007

Any clearer on catalog copyright?

In my recent posting When is Data not Data? I raised the issue of catalogue records and whether they are subject to copyright.

I’m going to dip my toe in legal waters here and I’m no expert on intellectual property laws, so correct me if I’m wrong.  It is my understanding that you cannot copyright a fact; a fact such as J. K. Rowling wrote Harry Potter and the Philosopher’s Stone; or the fact that it was published in 1997; or the fact that it has an ISBN of 0747532745; or the fact that NYPL has a copy in its Reference only section.   In fact [excuse pun] I fail to see anything in a library catalog record that is not a fact, and therefore come to the conclusion that catalog records are not copyrightable.

The post attracted some comments, including a clarification on a couple of points from my colleague Rob Styles.  But apart from a feeling that much of a record consists of facts which in themselves are not copyrightable, it is not clear whether things like subject headings or notes fall into that category or not.

The issue has been raised again, this time on the NGC4LIB mailing list in another thread that is a spin-off from the one I mentioned earlier.  Tim Spalding provides Nine legal arguments that OCLC has no copyright over MARC records, which make interesting reading.

So are we any clearer? Maybe a bit, but as most of us are not experts we need to be circumspect on these things, as Jonathan Rochkind is.  What is needed is someone who understands the library world and its history, and is an expert on copyright law. If you are that person put your hand up!

(Photo taken by herebox displayed in Flickr)



One Catalog to rule them all…

A couple of times in the past I’ve highlighted the NGC4LIB mailing list as a place of great discussion.  I thought it had gone a little quiet over the last few weeks.  Then last week Tim Spalding of LibraryThing fame woke the list up with what seemed, on the surface, a simple enquiry.

Does anyone know of examples of a fully-spiderable OPAC?

It’s my contention that libraries would do well in Google and even Google Local if they were spiderable. I’ve seen the Lamson Library catalog do very well—tops in Google, even without mentioning Plymouth State, but it gets a LOT of push from its association with WpOPAC.

But I need some examples. Anyone?

In clarification of his question, Tim went on to use the analogy of being able to search for ‘Pizza Portland, ME.‘ which works for finding what he wants in his locality, whereas [because OPACs are not spiderable] ‘Book Title, mytown’ doesn’t.  I think Tim is misrepresenting the problem – the library equivalent of his pizza search should be ‘Library Portland, ME‘, as [from a search engine point of view] pizza is mapped to places that sell pizza.  Try searching for ‘“ham and mushroom” Portland, ME‘ and see what you get, a few pizza joints but mixed in with a lot of other stuff.

The problem is that search engines are very good at identifying a page that references the thing you are looking for, but less good at finding the thing itself.  I agree with Tim that if OPACs were search engine spider friendly you may well get better hits than we currently do, but there again how many OPAC interfaces are mapped directly to an address so that the search engine can do its geo-location magic – very few.

Later on in the NGC4LIB thread, the OCLC Find-in-a-library service was mentioned as being the answer – but only the answer if the library funds that service by being a member.  There are many stories even in the US of the ‘find’ being a University in the next State.  Try it outside the US, and even with a modicum of geographic knowledge you could get the paranoid feeling that the results are there to taunt you instead of being helpful.

All very well to criticize what we have now, but far more valuable is the search for a solution, which is what I believe is behind Tim’s question.  The problem, I believe, is that there is not a library equivalent to Amazon or Wikipedia to point your searches at.  How many of us, let alone the world beyond libraries, link to a book reference on a library page from our blogs?  No, we all point at Amazon etc.

To be fair, unlike Talis [which has had static URLs for book references in its Web OPAC since 1995], most OPACs are not linkable anyway.

What we need are a small number of places on the net which have not only references to all the books held by libraries, but also an easy low-cost [preferably zero cost] way of linking in with individual library holdings.

Casey Durfee, in his contribution to the thread (which by now had morphed to have the title Spiderable OPACs and the elephant in the library lobby – the elephant in this case being OCLC) said:

There are a lot of reasons to be wary of a “one catalog to rule them all” solution being sold by a single vendor no matter how appealing the idea might be. [my emphasis]

There seems to be some sympathy with the idea of a centralized [spiderable] place to search for all bibliographic references, but warning bells are ringing around the commercial and/or political ramifications of seeding that with any particular body.  Some people chipped in with “sounds like OCLC to me” comments, but were soon answered by those who are not, and cannot afford to be, members of OCLC.

For something like this to work there are three hurdles to overcome:

  • Firstly, and most obviously, cost.  For the contributing libraries the only differentiation between those that have and those that have not [got their holdings represented] should be their willingness to contribute.  Any cost hurdles almost by definition lead to an information visibility apartheid – those that cannot afford to contribute, disappear.

    Albeit UK only at the moment, our own Talis Source example ably demonstrates that if you remove the costs of contributing to a resource that can deliver value to library users [holdings contributed to the Talis Library Platform, via Talis Source, are publicly searchable in Talis Cenote] libraries are motivated to openly share their data.

  • Secondly, fear of loss of data ownership.  There has been much concern over the years around evil people making money out of freely shared data.  The answer is licensing.  Creative Commons has shown that you can share your creative works for the benefit of all whilst still precluding the possibility of others making money from your efforts.

    The Talis Community License is our contribution to the community to deliver the same protection to aggregations of data that may not individually be protectable in the same way.

  • Finally, cataloging practice.  Getting international agreement about the correct form of catalogue record to describe an individual book has up until now been akin to trying to herd cats [click the link, it is amusing, even if the IT company it is advertising is being a bit optimistic I think!].

    This is not surprising as a main job of a cataloger is to catalog a book in their institution, and not necessarily to help a user on a different continent find it.

Have I got any answers to the issues I’ve referenced?  I only wish I had, but what is gratifying is that people are engaging in the debate and starting to dismiss some of the ‘obvious’ solutions.  I believe the answers lie with Openness of data, licenses, and minds.  I also believe that the enormous amounts of money changing hands for looking after and allowing access to metadata is a model that will become untenable in the future.

Overcome some of these issues, learn from our experience in the UK, and we may end up with “One catalogue to enable them all”.

(Photo taken by generalnoir displayed in Flickr)



Bess Sadler talks with Talis about eIFL and Library-in-a-box

In our latest Talking with Talis podcast, I talk with Bess Sadler of the University of Virginia Library about her involvement with eIFL and the Library-in-a-box project.

Library-in-a-box is the first project of the Free and Open Source Software division of eIFL. It seeks to enable the development and use of Open Source Library Systems in developing and transition countries.


[36 mins, 25Mb]

 During the conversation, we refer to the following resources:

This conversation was conducted by telephone on Monday 23 April 2007, and edited in Audacity. (Picture of Bess, at Code4lib 2007 posted by nengard in Flickr)


More on WorldCat Local

Having been early to comment on the OCLC announcement last week that they were to pilot WorldCat Local, I thought it might be worth checking out what others had to say on the matter since then.

For such an announcement I was surprised to see how little fuss it had created in the library blogosphere.  After all it told us that “OCLC will test interoperability with systems used by participating pilot libraries, including Innovative Interfaces, SirsiDynix, and ExLibris Voyager.” – effectively opening up those vendors’ systems to have all their public facing functionality replaced by an OCLC software service.  Are these vendors intimidated by this bold encroachment on to their patch, or do they welcome it? Are they going to make it easy for OCLC or not? I’m sure their customers would be very interested to discover their views on this.

The possibility of OCLC providing the OPAC surely must raise a couple of questions from the libraries – “Do I get a support discount from my vendor if I don’t use their OPAC?” and “Who do I ring when it starts misbehaving?” are just a couple that come to mind.

Nevertheless my search did turn up some interesting comments.

Tim Spalding on the LibraryThing Blog gets quite heated about it:

That’s the news. Here’s the opinion. Talis’ estimable Richard Wallis writes:

“Yet another clear demonstration that the library world is changing. The traditional boundaries between the ILS/LMS, and library and non-library data services are blurring. Get your circulation from here; your user-interface from there; get your global data from over there; your acquisitions from somewhere else; and blend it with data feeds from here, there and everywhere is becoming more and more a possibility.”

I think this is exactly wrong. OCLC isn’t creating a web service. They’re not contributing to the great data-service conversation. They’re trying to convert a data licensing monopoly into a services monopoly. If the OCLC OPAC plays nice with, say, the Talis Platform, I’ll eat my hat. If it allows outside Z39.50 access I’ll eat two hats.

I agree with your main point Tim, and I believe your millinery assets are safe at the moment!  I don’t think OCLC are about to offer Open Web Services for all to consume, but nevertheless getting the OPAC from an external Software-as-a-Service supplier is a major crack in the traditional monolithic walls of ILS/LMS supply.

Jessamyn West has some interesting thoughts on her How WorldCat solves some problems and creates others posting.

…further blurring the boundaries between book data and end users services using that data…  …Meanwhile WorldCat still tells me that I have to drive 21 miles — to a library I don’t even have borrowing privileges at (Dartmouth) — to get a copy of the Da Vinci Code when I know that I can get a copy less than half a mile down the street.

In comments to Jessamyn’s posting, GinaP characterises the ‘digital divide’ as between those libraries that can afford to pay the OCLC subscription and those that cannot – a problem apparently solved in Idaho with a State-wide agreement which allows all libraries to avail themselves of OCLC services, the smaller ones only paying $300/year.  (I wonder what the larger ones pay?)

In a following comment GeekChic says:

The two consortia that I used to work with in South Texas could never afford OCLC membership (and they will likely never be able to afford it). As a result, patrons using “World”Cat are directed either to academic libraries or to public libraries that are a four hour drive away. I now work in Canada and there are very few Canadian public libraries that are members of OCLC – so the “World”Cat label really sticks in my craw.

Echoed by David Bigwood of the Lunar and Planetary Institute:

….we can’t afford OCLC membership. Even though it would benefit scholars around the world. Patrons come from Egypt, South Africa, Australia, and so on to do research here.

Much of what I catalog is not in OCLC, I know because I check Open WorldCat. Yet $1200.00 a year for membership is not in the budget. That is another journal we would have to cancel.

The LibrarianInBlack is employed by one of the libraries that OCLC is using as a WorldCat Local pilot through their consortium, the Peninsula Library System, so I was interested to check out what Sarah had to say about it.

None of us has actually seen any part of the product yet–we’re just going on what we’ve been told.  We are hoping to see the product in action soon, and are told that we will see it before it is launched live on our site.  This project has been a huge deal for our consortium and libraries, and none of us has been able to talk about it for months.

Ah well maybe we will all have to wait to see, and hear more opinion about it.  

Other commentators have been:

  • ResourceShelf – Is this the beginning of the end for the local catalog from OPAC providers? It’s always exciting to see new things/ideas but we wish that OCLC would also get other longtime WorldCat issues up and running correctly.
  • Peter Suber – Apart from the way this new service supplements the standard library OPAC, I like the way it ranks items with the most accessible first.
  • Information and the Future – This would all seem to have pretty major implications for our thinking about the OPAC, the local catalog, ILS software, electronic resources and consortial catalogs like I-Share’s Universal Catalog.

An opening up of Library systems, an evil plan to capture the library services market, just a bit of added value for those that can afford to be part of the ‘World’, or just a bit of interesting news? – only time will tell.


WorldCat Local to be piloted

Last month I commented on the possibility of a WorldCat Local. 

Yesterday OCLC announced that they are to pilot the service, based upon, starting with the University of Washington later this month.

Through a locally branded interface, the service will provide libraries the ability to search the entire WorldCat database and present results beginning with items most accessible to the patron.

A development that those libraries who use their funds to be part of the OCLC club, will be watching with interest.
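
To make “beginning with items most accessible to the patron” a little more concrete, here is a minimal Python sketch of one way such an ordering could work. It is not OCLC's method, which the announcement does not describe. The `Holding` fields, the three tiers (the patron's own library, a consortium partner, everyone else) and the library names are all assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Holding:
    """One library's copy of an item (hypothetical fields, for illustration)."""
    library: str
    is_home_library: bool   # the patron's own library
    in_consortium: bool     # patron can borrow via a consortium agreement
    distance_miles: float


def accessibility_key(holding: Holding) -> Tuple[int, float]:
    """Sort key: lower values are treated as more accessible.

    Tier 0 is the home library, tier 1 a consortium partner, tier 2 any
    other library. Within a tier, nearer libraries come first.
    """
    if holding.is_home_library:
        tier = 0
    elif holding.in_consortium:
        tier = 1
    else:
        tier = 2
    return (tier, holding.distance_miles)


def rank_holdings(holdings: List[Holding]) -> List[Holding]:
    """Return the holdings ordered from most to least accessible."""
    return sorted(holdings, key=accessibility_key)


# Made-up holdings for a single title.
holdings = [
    Holding("Distant University", False, False, 21.0),
    Holding("Neighbouring Branch", False, True, 3.0),
    Holding("Home Library", True, False, 0.5),
]
ranked_names = [h.library for h in rank_holdings(holdings)]
print(ranked_names)
```

The point of the tiers is that distance alone is not accessibility: a nearby library where the patron has no borrowing rights should still rank below one they can actually use.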

Yet another clear demonstration that the library world is changing.  The traditional boundaries between the ILS/LMS, and library and non-library data services are blurring.  Get your circulation from here; your user-interface from there; get your global data from over there; your acquisitions from somewhere else; and blend it with data feeds from here, there and everywhere is becoming more and more a possibility.

Communities and organizations both large and small, expensive and/or closed, open and/or free, organized or ad hoc will all play their part in this.  No doubt we will all be watching with interest the results of OCLC dipping their toe in this particular water.


(Image posted by axlotl in Flickr)



When is Data not Data?

Father to his son – “When is a door not a door“.  Son – “I don’t know, when is a door not a door?” Father [with triumphant tone in his voice] – “When it’s ajar!“.  Son – “How can a door be a jar?“. Father – “No, not ‘a’ jar, ‘ajar’  – it means slightly open – err it’s a joke“. Son wanders off muttering something about parents being weird.  –  Isn’t communication wonderful – when it works!

So what has that scenario got to do with data then? –  Well for some inexplicable reason the following popped into my head yesterday, whilst discussing why libraries are so protective about their Marc records.

When is data not data? – When it is metadata!

It came to mind again today when I read the post on Open Libraries – Open Data: What Would Kilgour Think?.

The New York Public Library has reached a settlement with iBiblio, the public’s library and digital archive at the University of Chapel Hill, North Carolina, for harvesting records from its Research Libraries catalog, which it claims is copyrighted.

Heike Kordish, director of the NYPL Humanities Library, said a cease and desist letter was sent because a 1980s incident by an Australian harvesting effort which turned around and resold the NYPL records.

Simon Spero, iBiblio employee and technical assistant to the assistant vice chancellor at UNC-Chapel Hill, said NYPL requested that its library records be destroyed, and the claim was settled with no admission of wrongdoing. “I would characterize the New York Public Library as being neither public nor a library,” Spero said.

It is a curious development that while the NYPL is making arrangements under private agreements to allow Google to scan its book collection into full-text that it feels free to threaten other research libraries over MARC records.

It is interesting that Jay Datema chose to contrast the apparent contradiction of NYPL allowing Google to scan its books whilst going all legal to prevent distribution of its catalog records. 

Well they are different things – the books are the data [albeit stored in paper form before being transformed into a digital form by Google] that they are custodians of; whereas the catalog records are the metadata about what they hold.

If anything this makes the contrast even more perverse. NYPL are passing on copyrightable information [the text of the books] whilst aggressively protecting the facts about those books.

I’m going to dip my toe in legal waters here and I’m no expert on intellectual property laws, so correct me if I’m wrong.  It is my understanding that you cannot copyright a fact; a fact such as J. K. Rowling wrote Harry Potter and the Philosopher’s Stone; or the fact that it was published in 1997; or the fact that it has an ISBN of 0747532745; or the fact that NYPL has a copy in its Reference only section.   In fact [excuse pun] I fail to see anything in a library catalog record that is not a fact, and therefore come to the conclusion that catalog records are not copyrightable.

So why the fuss? What is so special about the NYPL records that caused them to ask iBiblio to cease and desist in harvesting those facts? How do they differ from the catalog records from all the other libraries?  As far as I can tell the only difference might be the fact that NYPL holds one or more copies – and why would a ‘public’ library [or any library for that matter] want that kept secret?

As I said in my post yesterday:

It is about time that the Libraries of the world moved on from jealously guarding the metadata about the knowledge that they hold, and let their librarians get back to guiding people towards and helping them interpret and interrelate the knowledge itself.

Libraries are custodians of the wealth of human knowledge, a wealth that almost certainly can be multiplied if it is known where parts of it are located and how they relate with each other.

Jay closes with:

While the purpose of releasing library data has not yet reached consensus about what will be built as a result, it can be compared to Netscape open-sourcing the Mozilla code in 2000, which eventually brought Firefox and other open source projects to light. It also shows that the financial motivations of library organization by necessity dictate the legal mechanisms of protection.

It might not be clear at the outset, what good would flow from opening up the world’s library metadata, but it is a fairly safe bet that good would come out of such a move.

To paraphrase my colleague Rob Styles from his lightning talk at the code4lib Conference, we need to stop behaving like four year olds and realize the benefits of openly sharing. 

NYPL quote a previous experience when an organization harvested and then sold their records.  The fear that it might happen again is obviously driving their recent actions.  Whether reselling harvested records is right, wrong, or illegal is a question in itself, but if those records were openly and freely available to all to use, who would be able to build a business on selling what you can get free?

Openly sharing catalog records is good for libraries, good for researchers, good for library users, good for us all.  If you have trouble sleeping at night because you are worrying about people making money out of catalog records, be assured openly and widely sharing metadata is not good for them.  

If in the end you still have residual concerns about what people might do with metadata that they source from you, there are always open licenses such as those provided under Creative Commons, or our own contribution to the licensing debate the Talis Community License, which can protect ownership and/or the ability to charge without restricting open sharing.

A win, win, win, win situation then – so why don’t we do it?


(Photo taken by Daquella manera displayed in Flickr)



A view of what it will be like when you take your head out of the sand.

Take a read of this – How Google Books is Changing Academic History. (Thanks to ebyblog for the heads up)

You may not be one of the folks with your head in the sand: those who are endlessly arguing about the suitability of Marc for 21st Century cataloguing; who are against the march of Google digitization, because it is just plain wrong; who think that if you build the most functional OPAC in the world people will flock towards it by magic.

You may not be, but I still bet you didn’t realize quite how useful Google Book Search is already becoming.

Hang around for a few more half years, watch out for some forward thinking deals in the publishing industry that emulate recent ones in the music world, and things could be very different.

A recipe for despair?  No, just a heads up that things in the world of the Web move a heck of a lot faster than the world of Libraries is used to moving.

We will have to shout up, open up, and keep up to stay connected and relevant.  Libraries and librarians have had, do have, and will have much to offer the world if we don’t get bypassed as being out of touch, out of date, and not relevant.

We have masses of high quality metadata stored in obscure databases or behind barriers created by cost, licensing, and/or institutional pride and selfishness.  If that data from the libraries of the world, which has been around for years, had been as easy to openly search as it is to traverse the texts in Google Book Search, the author of the post in question would have found his twenty extra books [and probably many more] a long time ago.

Let us open up our silos of data and share it with all; allowing it to be mined so that the relationships between that data can also be exposed to add even more value; and yes relate it to the full-texts that Google are amassing.

It is about time that the Libraries of the world moved on from jealously guarding the metadata about the knowledge that they hold, and let their librarians get back to guiding people towards and helping them interpret and interrelate the knowledge itself.

Rant over…..


(Image posted by spcoon in Flickr)



LibraryThing introduces OPAC widgets – a trend?

LibraryThing, the personal ‘Catalogue your books online’ site, provides a sneak preview of its LibraryThing for Libraries widgets.

LibraryThing for Libraries is composed of a series of widgets, designed to enhance library catalogs with LibraryThing data and functionality. The achievement is that the widgets require NO back-end integration.

To use these widgets you “Just add a single JavaScript tag, and one tag for every widget you want to display and we do the rest” on the pages that display your OPAC, and you too can have Similar Books & Related Editions listings adding interest to your OPAC display.

Tim Spalding from LibraryThing says LibraryThing’s data is strongest in public library catalogs; it will be interesting to see how many sign up for his free trial, and upload a dump of their ISBN data.

“Just add a single JavaScript tag” – I wonder how many library system managers will be comfortable/capable/allowed to do that?

This is an excellent example of a trend that started with book-jacket feeds from the likes of Syndetics and NBD, and continued with experiments with using extra data captured or contained within a library system – all to add value to the OPAC user experience.

Take these examples to their logical conclusion and you will end up with the catalogued data contained in your system forming a small [but nevertheless critical] core of data presented to the user, surrounded with data fed from all over the place enriching that user experience.  [Hopefully] library user heaven, but possibly a library system manager’s nightmare – how many little bits of JavaScript will he/she have to just plug in; will there be conflicts between them; what will happen when one or more of the services doesn’t behave itself?
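
As a rough illustration of that picture, the Python sketch below wraps a core catalogue record with data from several enrichment sources. Everything in it is hypothetical: the function names, the example.org URL and the stand-in services are invented for the sketch and are not the API of LibraryThing, Syndetics, or any ILS/LMS. It shows two defensive choices that speak to the questions above: a source that misbehaves is skipped rather than breaking the page, and enrichment data never overwrites a field from the core record.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical shape of an enrichment source: given an ISBN, return extra fields.
Enricher = Callable[[str], Dict[str, str]]


def build_display(core: Dict[str, str],
                  enrichers: Dict[str, Enricher]) -> Tuple[Dict[str, str], List[str]]:
    """Surround the core catalogue record with enrichment data.

    Returns the merged display record and the names of sources that failed.
    """
    display = dict(core)           # start from the library's own data
    failed: List[str] = []
    for name, fetch in enrichers.items():
        try:
            extra = fetch(core["isbn"])
        except Exception:
            failed.append(name)    # a misbehaving service is skipped, not fatal
            continue
        for key, value in extra.items():
            # First value wins, so core fields are never overwritten and
            # two sources offering the same key do not clobber each other.
            display.setdefault(key, value)
    return display, failed


# Stand-in services for illustration only; real ones would be network calls.
def jackets(isbn: str) -> Dict[str, str]:
    return {"jacket_url": f"https://covers.example.org/{isbn}.jpg"}


def reviews(isbn: str) -> Dict[str, str]:
    raise TimeoutError("review service did not respond")


def ratings(isbn: str) -> Dict[str, str]:
    # Tries to supply a title as well; the core title should be kept.
    return {"title": "Some other title", "rating": "4 out of 5"}


core_record = {"isbn": "0747532745",
               "title": "Harry Potter and the Philosopher's Stone"}
display, failed = build_display(
    core_record, {"jackets": jackets, "reviews": reviews, "ratings": ratings})
print(failed)  # ['reviews']
```

Even in a toy like this, every source has to agree on how it is called and what it returns, which is exactly the kind of shared convention a pile of independently written widgets does not give you for free.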

I’m not trying to pour cold water on these developments; Tim should be applauded for driving forward on this.  Nevertheless, if the ability to simply create a user interface for your library by mixing data from many sources is going to become a reality for the majority of libraries, there are going to have to be some major developments in the implementation process.

Imagine [not unrealistically] wanting to enrich your OPAC display with book jackets from a commercial supplier and an open source of images; recommendations based upon the borrowing patterns of libraries in your local geographic, or subject specialism, consortium; user ratings from a combination of your own users and LibraryThing; brief reviews from LibraryThing; in-depth reviews from Revish; FRBR related data from OCLC; LibraryThing related data; links to author pages in Wikipedia; etc., etc.

Just rereading that last paragraph tells you it just ain’t going to be sustainable in the real world, especially if you need to be uploading files of ISBNs all over the place.

There need to be developments on a couple of fronts.  Firstly, being able to add many widgets into your UI needs to be simple and understandable.  Maybe UI prototypes from SirsiDynix, III, and Talis will address this?  Secondly, an ecosystem needs to emerge atop of which these services can be delivered.  Standards, both agreed and de facto, need to be established so that the individual innovations as they emerge can be utilized without the supplier of the service, or the vendor of the ILS/LMS it is enriching, being a great hurdle.

Experience tells me that Tim will spend far more time than he ever expected to, producing ILS/LMS specific versions of his widgets.  I also expect that the moment a library tries to add more than a couple of these sorts of things to their UI, they are going to start wishing that they all operated in the same way.  These are crucial factors that will hold back the general rollout of such features beyond a few innovative libraries.

What is needed is a Platform for delivering such services.  The Talis Platform is an exemplar for augmenting data with enrichments from many dissimilar sources in a simple to integrate way.  It is also a Platform that hides the complexity of maintaining that data. In addition, the complex relationships between the data suppliers/providers, their data, and the consumers of that data, both in commercial [where relevant] and licensing terms, are something that needs to be hidden from the consumers to ensure the wide adoption of these services.

Oh no – I hear you say – yet another Talis blog praising the wonders of the Talis Platform. Yes, and I make no excuses for it.  As others are starting to recognize there is something innovative and different about the Platform that I have yet to see demonstrated elsewhere in the library world.  – When I see other examples I will post about those as well.

In the meantime, good luck to LibraryThing, and others I know who are following similar roads.  It is these individual pieces of innovation that will collectively drive us all forward.



Information World Review does 2.0

Also gracing the letterbox this weekend was April’s issue of Information World Review, which has clearly been bitten by the 2.0 bug, big time.

Articles take a while to make it online, but you can doubtless find a copy in your local library – or subscribe for yourself.

Amongst a flurry of relevant content, Bobby Pickering’s ‘On course to meet the challenge’ includes the thoughts of some bloke called Miller/Millar from Talis,

“which made early running in the [Web 2.0] area.”

Look mum, I’m in the paper!


Radical Transparency in Wired


My copy of Wired fell through the letterbox this weekend, weeks after those of you in North America finished reading yours, and included the usual mix of material.

As anyone who’s been watching Talis for a while will be unsurprised to hear, I was particularly taken by the three pieces on Radical Transparency, which echo much of what we’ve been doing internally and externally.

I wonder if any of the senior managers at our peer organisations are passing the articles around the boardroom, thinking about whether or not this cool ‘new’ idea might have a place within their working practices?

Give us a call – we’ll show you how to do it…
