Panlibus Blog

Archive for the 'digitisation' Category

Survival with the fittest: the story of a Google library partner

I hadn’t previously come across any of Google’s library partners, so it was great to listen to the experiences of Manuela Palafox from the Universidad Complutense in Madrid, Spain at the Eurolis seminar Doom or bloom: reinventing the library in the digital age. Complutense originally signed its digitisation agreement with Google back in 2006, and was the first non-Anglo-Saxon (her words) library to join the programme.

Based on books in the public domain, the agreement enables Complutense to offer universal, free-of-charge, full-text access to a large number of books. So I, for example, a former student from a Spanish university, can now explore a rich vein of Cervantes books without having to endure the punishing euro-sterling exchange rate.

In digitising Complutense’s public domain books, Google assumed all the costs of digitisation and transportation. Google also created an interface, something they do for all their library partners. In return, Complutense selected and provided the books, as well as technical staff. The overarching aim was to offer access to the university’s library heritage. It was also perceived as an important part of selling the Spanish language abroad – providing access to the vast number of Spanish speakers in the world.

The process started with an analysis of the collection to determine how many books were out of copyright. They then catalogued 70,000 books and established selection criteria – publication year and physical condition – and formulated workflows and logistics for digitisation. Using PDAs, for example, the selection team stored details of the physical condition of books against the book barcode.

As a result of this herculean effort, thousands of Complutense’s digitised books are already accessible in Google Books. It’s possible to navigate directly to the full text from the catalogue record. There are also links enabling users to buy the book. This is truly how to extract optimal value from materials that were formerly languishing in the library. And even in the short time that they’ve been available, 34% of the materials have already been used.

Meanwhile, Jason Hanley, one of Google’s partner managers who spoke immediately after Manuela, seemed anxious to dispel a number of myths about Google and its work with libraries. On the predominance of English language materials, he pointed out that of all Google’s library partners, 8 are outside the US – 2 being in Japan and the rest, such as Universidad Complutense, in Europe. He also believed the predominance of language, linguistics and literature over STEM subjects to be surprising – I’m not sure why.

The question and answer session at the end, involving both Manuela Palafox and Jason Hanley, may have inadvertently answered the question of Google’s motives in this. It’s not the library world that should be afraid of Google – it’s the competing search engines. Google’s longstanding mission – to organise the world’s information and make it universally accessible and useful – has clear benefits not only to library partners such as Universidad Complutense, but to the library world as a whole, and to bibliophiles like me. But Google will be imposing limits on the availability of digitised materials for indexing by other search engines for a period of time that was not defined in this session. Hanley denied that Google was trying to be exclusive, although the denial came across as more than slightly defensive.

The session was a clear window into the aims and experiences of a library partner, and maybe into Google’s motives as well… As one speaker from the floor noted, what are the chances of any other search engine being able to compete fully with Google in the foreseeable future?

Europeana: Think culture

Aiming high is rarely the wrong thing to do, in my opinion, and Jonathan Purday’s presentation at the Eurolis seminar Doom or bloom certainly wasn’t short of lofty aims. His subject was Europeana, a digital library offering a single, direct and multilingual interface to cross-domain European cultural artefacts. Europeana isn’t just about making library resources available; it’s about breaking down the institution-based silos right across the European cultural sector, and in the process it has created an exciting online resource for the public, researchers, teachers and learners.

It’s easy for British people to forget the risk that the Google Book Project will overshadow non-English artefacts in Europe, and this has been an important concern since at least 2005, when the European Commission launched its Digital Libraries initiative. Initiatives such as Europeana are, in Purday’s words, “making available the intellectual record of other languages”. Europeana will also “harmonise digitisation practices across Europe”. All good stuff.

It was also great that Purday acknowledged that every search now begins with Google, and that if you don’t find material, you think it hasn’t been digitised or it doesn’t exist. I and a number of delegates were left wondering at the end of the session, though, whether the full text of content in Europeana will be exposed to Google, and if Purday could come back on that point, that would be useful.

It’s worth mentioning that every single speaker at the Eurolis seminar raised the need to consider copyright harmonisation. Purday was no exception, but he probably deployed the most powerful arguments to support it. We can’t digitise at the scale now technologically possible, he argued, unless we reconsider and harmonise copyright. The risk is of creating a “20th century black hole”, whereby we will be unable to represent the published output of “the most documented century” and will end up with a distorted picture of the past as a result.

I would urge people to take a look at Europeana. The search interface is available in 26 languages, and in the next 2 years they plan to be able to translate search terms on the fly (currently only the interface is translated). Purday demonstrated a search on Don Quixote, which not only came up with an impressive range of book editions, but also images inspired by the work, plus videos, including a 1956 news broadcast in which Salvador Dali recreates a vision of Don Quixote at Moulin de la Galette. Europeana holds metadata in the central index and takes the user back to the original site to look at the full artefact, so it is decentralised and collaborative in a sustainable way.
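For the technically curious, here is a minimal Python sketch of that "metadata in the middle, artefact at the source" idea. It is only an illustration of the pattern as Purday described it, not Europeana's actual data model or API: the `Record` and `CentralIndex` names, the provider names and the example.org URLs are all made up for the purpose.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Metadata only; the artefact itself stays with the provider (assumed shape)."""
    title: str
    media_type: str   # e.g. "text", "image", "video"
    provider: str     # hypothetical institution name
    source_url: str   # hypothetical link back to the original site


class CentralIndex:
    """A toy central index: it stores records and hands back links, nothing more."""

    def __init__(self):
        self._records = []

    def add(self, record):
        self._records.append(record)

    def search(self, term):
        # Case-insensitive substring match on the title only.
        needle = term.casefold()
        return [r for r in self._records if needle in r.title.casefold()]


index = CentralIndex()
index.add(Record("Don Quixote, illustrated edition", "text",
                 "Library A", "https://library-a.example.org/items/1"))
index.add(Record("Don Quixote at the Moulin de la Galette", "video",
                 "Archive B", "https://archive-b.example.org/clips/7"))
index.add(Record("Map of Castile", "image",
                 "Museum C", "https://museum-c.example.org/objects/3"))

hits = index.search("don quixote")
for record in hits:
    # The user is sent to the provider's own site for the full artefact.
    print(f"{record.media_type}: {record.title} -> {record.source_url}")
```

The point of the design is that the index never needs a copy of the book, image or film; each institution keeps hosting its own material and the central service only has to keep the metadata and the links up to date.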

Europeana is currently attracting 15,000 users a day. Purday is concerned, though, that most people interested in the site are over the age of 45. He plans to address this by creating an API so users can put Europeana into their own web space, although in discussions afterwards, people wondered whether such a measure would succeed in engaging younger people.

The EOD (E-Books on Demand) Network

Silvia Gstrein from the University of Innsbruck spoke engagingly on Thursday at Internet Librarian International about the E-books on Demand (EOD) Network. Established in 2007, the network now involves over 20 libraries in 10 European countries (not the UK though – why not, I wonder). I loved hearing about this project – it seems to be meeting a real niche need.

Silvia explained very clearly how the network works. A user finds a book of interest in the online catalogue of one of the more than 20 participating libraries, and clicks to request that the book be digitised.

If the number of pages is present in the metadata, then a price can be given immediately (member libraries set their own pricing).

The library then receives the order, and scans the book. The digitised copy is then sent to the central server at University of Innsbruck.

An email is sent to the user, and payment can now take place. Card processing is also managed centrally at University of Innsbruck.

The user can either download the PDF or have it sent on CD.

The user can follow the progress of the order throughout the process.
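The steps above can be sketched as a small order tracker in Python. This is my own hedged reading of the workflow, not the EOD network's software: the `Status` names, the `Tariff` formula (a base fee plus a per-page charge) and the figures used are assumptions, since all Silvia said on pricing is that member libraries set their own.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Status(Enum):
    """Assumed stages, in the order the talk described them."""
    REQUESTED = "requested by user"
    SCANNED = "scanned by library"
    UPLOADED = "sent to central server"
    AWAITING_PAYMENT = "user emailed, awaiting payment"
    PAID = "payment processed centrally"
    DELIVERED = "PDF downloaded or CD sent"


@dataclass
class Tariff:
    """Hypothetical pricing rule; each member library would set its own."""
    base_fee: float
    per_page: float

    def quote(self, pages: Optional[int]) -> Optional[float]:
        # A price can be given immediately only if the page count
        # is present in the catalogue metadata.
        if pages is None:
            return None
        return round(self.base_fee + self.per_page * pages, 2)


@dataclass
class Order:
    title: str
    pages: Optional[int]
    tariff: Tariff
    status: Status = Status.REQUESTED
    history: List[Status] = field(default_factory=list)

    def __post_init__(self):
        self.history.append(self.status)

    def advance(self) -> Status:
        """Move to the next stage; stays put once delivered."""
        stages = list(Status)
        position = stages.index(self.status)
        if position + 1 < len(stages):
            self.status = stages[position + 1]
            self.history.append(self.status)
        return self.status


tariff = Tariff(base_fee=10.0, per_page=0.25)   # illustrative figures only
order = Order("Early printed prayer book", pages=160, tariff=tariff)
print("Quote:", order.tariff.quote(order.pages))
while order.status is not Status.DELIVERED:
    print("Now:", order.advance().value)
```

Keeping the full `history` list is what lets the user follow the order from request to delivery, and returning `None` from `quote` mirrors the case where the catalogue record has no page count and the price has to wait.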

The service has attracted favourable feedback from users, who would nonetheless also be interested in 20th century books, something that is not possible under current copyright legislation.

Silvia made the important point that many 15th – 17th century books would otherwise only be accessible via visits to physical libraries. The electronic library has largely freed up academics from spending inordinate amounts of time travelling around from one library to another amassing physical research materials, but not surprisingly, antiquarian books are lagging behind. And cultural artefacts are, of course, for sharing. I’ve got an early 17th century prayer book stored in an acid-free box in the wardrobe of my spare room where no-one else can see it, which is not ideal. One of the most amazing things about it is the graffiti written by what I imagine are rebellious choir boys down the ages. It would certainly be great to share the pleasure I get from looking at it. Maybe this is a candidate for some kind of citizen-cataloguing project, along the lines of Peter Murray-Rust’s ideas.

Operationally, the service doesn’t pose any problems for the participating library, with the three top libraries receiving one request per working day on average. For some libraries who have never digitised their materials, it’s an opportunity to embark on digitisation, and they appreciate the guidelines and ready-made workflow provided by the EOD network.

So far about 3200 books have been digitised for 1900 customers, and the average price of an order is €50. €50 for 400 year old graffiti? A bargain, I’d say.