Panlibus Blog

Archive for the 'Google Book Settlement' Category

Survival with the fittest: the story of a Google library partner

I hadn’t previously come across any of Google’s library partners, so it was great to listen to the experiences of Manuela Palafox from the Universidad Complutense in Madrid, Spain at the Eurolis seminar Doom or bloom: reinventing the library in the digital age. Complutense originally signed its digitisation agreement with Google back in 2006, and was the first non-Anglo-Saxon (her words) library to join the programme.

Covering books in the public domain, the agreement enables Complutense to offer free, universal full-text access to a large number of books. So I, for example, as a former student of a Spanish university, can now explore a rich vein of Cervantes books without having to endure the punishing euro-sterling exchange rate.

In digitising Complutense’s public domain books, Google assumed all the costs of digitisation and transportation. Google also created an interface, something they do for all their library partners. In return, Complutense selected and provided the books, as well as technical staff.  The overarching aim was to offer access to the university’s library heritage. It was also perceived as an important part of selling the Spanish language abroad – providing access to the vast number of Spanish speakers in the world.

The process started with an analysis of the collection to determine how many books were out of copyright. They then catalogued 70,000 books and established selection criteria – publication year and physical condition – and formulated workflows and logistics for digitisation. Using PDAs, for example, the selection team stored details of the physical condition of books against the book barcode.

As a result of this herculean effort, thousands of Complutense’s digitised books are already accessible in Google Books. It’s possible to navigate directly to the full text from the catalogue record. There are also links enabling users to buy the book. This is truly how to extract optimal value from materials that were formerly languishing in the library. And even in the short time that they’ve been available, 34% of the materials have already been used.

Meanwhile, Jason Hanley, one of Google’s partner managers, who spoke immediately after Manuela, seemed anxious to dispel a number of myths about Google and its work with libraries. On the predominance of English-language materials, he pointed out that 8 of Google’s library partners are outside the US – 2 in Japan and the rest, such as Universidad Complutense, in Europe. He also found the predominance of language, linguistics and literature over STEM subjects surprising – I’m not sure why.

The question and answer session at the end, involving both Manuela Palafox and Jason Hanley, may have inadvertently answered the question of Google’s motives in this. It’s not the library world that should be afraid of Google – it’s the competing search engines. Google’s longstanding mission – to organise the world’s information and make it universally accessible and useful – has clear benefits not only to library partners such as Universidad Complutense, but to the library world as a whole, and to bibliophiles like me. But Google will be imposing limits on the availability of digitised materials for indexing by other search engines for a certain (undefined in this session) period of time, although Hanley denied that Google was trying to be exclusive (which came across as being more than slightly defensive).

The session was a clear window into the aims and experiences of a library partner, and maybe into Google’s motives as well… As one speaker from the floor noted, what are the chances of any other search engine being able to compete fully with Google in the foreseeable future?

Europeana: Think culture

Aiming high is rarely the wrong thing to do, in my opinion, and Jonathan Purday’s presentation at the Eurolis seminar Doom or Boom on Europeana, a digital library offering a single, direct and multilingual interface to cross-domain European cultural artefacts, certainly wasn’t short of lofty aims. Europeana isn’t just about making library resources available; it’s about breaking down the institution-based silos right across the European cultural sector, and in the process it has created an exciting online resource for the public, researchers, and teachers and learners in education.

It’s easy for British people to forget the risk that the Google Book Project will overshadow non-English artefacts in Europe, and this has been an important concern since at least 2005, when the European Commission launched its Digital Libraries initiative. Initiatives such as Europeana are, in Purday’s words “making available the intellectual record of other languages”. And it will also “harmonise digitisation practices across Europe”. All good stuff.

It was also great that Purday acknowledged that every search now begins with Google, and that if you don’t find material, you think it hasn’t been digitised or it doesn’t exist. I and a number of delegates were left wondering at the end of the session, though, whether the full text of content in Europeana will be exposed to Google, and if Purday could come back on that point, that would be useful.

It’s worth mentioning that every single speaker at the Eurolis seminar mentioned the need to consider copyright harmonisation, and Purday was no exception, but he probably deployed the most powerful arguments to support it. We can’t digitise at the scale now technologically possible, he argued, unless we reconsider and harmonise copyright; the risk is of creating a “20th century black hole”, whereby we will be unable to represent the published output of “the most documented century” and will end up with a distorted picture of the past as a result.

I would urge people to take a look at Europeana. The search interface is available in 26 languages, and in the next 2 years they plan to be able to translate search terms on the fly (currently only the interface is translated). Purday demonstrated a search on Don Quixote, which not only came up with an impressive range of book editions, but also images inspired by the work, plus videos, including a 1956 news broadcast in which Salvador Dali recreates a vision of Don Quixote at the Moulin de la Galette. Europeana holds metadata in a central index and takes the user back to the original site to view the full artefact – an approach that is decentralised, collaborative and sustainable.

Europeana is currently attracting 15,000 users a day. Purday is concerned, though, that most people interested in the site are over the age of 45. He plans to address this by creating an API so users can put Europeana into their own web space, although in discussions afterwards, people wondered whether such a measure would succeed in engaging younger people.

Google Book Settlement will help stimulate eBook availability in libraries

So says former Google Book Search product manager Frances Haugen in her contribution to the debate on the September Library 2.0 Gang.

This month’s Gang was kicked off by Orion Pozo from NCSU, where they have rolled out dozens of Kindles and a couple of Sony Readers. The comparative success of their Kindles over the Sony Readers appears to be down to the simpler process of distributing purchased books across sets of readers, and a broader selection of titles at lower cost. Currently users request books for the Kindle via an online selection form; the books are then purchased and downloaded onto the devices, which are then loaned out. There were no restrictions on titles purchased, and there is an approximate 50% split between fiction and non-fiction.

The Gang discussed the drivers that will eventually lead to the wide adoption of eBooks. These included the emergence of open eBook standards, and the evolution of devices, other than dedicated readers, that can provide an acceptable reading experience. Carl Grant shared his experience of starting a read on his Kindle and then picking it up from where he left off on his iPhone (as he joined his wife whilst shopping).

An obvious issue influencing the availability of eBooks is licensing and author and publisher rights. This is where the Google Book Settlement comes into play. If it works out as she hopes, Frances predicts that over time it will facilitate broader availability of currently unavailable titles. I paraphrase:

[From approx 26:50] Institutional subscriptions will become available for the 10 million books that Google has scanned so far. Imagine in the future a user with a reader that accepts open formats being able to get access to the books this institutional license would provide. Imagine school children having access to 10 million books that their library subscribes to, instead of having to formally request one-off books to be added to their device.

[From approx 44:50] There are a huge number of books that are no longer commercially available in the US, for several reasons. If the rights holders of those books do not opt out, they will become available for people to purchase access to. One of the interesting things about the way the settlement is set up is that you will be able to purchase access either directly or through an institutional subscription. What is neat is that this cycle will put a check on prices, as prices for individual books are based upon the demand for them. So less popular books will cost less… So if the price of the institutional subscription ever gets too high, libraries can decide to buy one-offs of these books. I think that whole economic mechanism will substantially increase access to books.
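The economic check Haugen describes can be sketched as a toy model. To be clear, nothing below reflects Google’s actual pricing algorithm; the price floor, ceiling, demand scaling and subscription figure are made-up numbers purely to illustrate the mechanism of demand-based per-book prices keeping an institutional subscription honest:

```python
# Toy model (illustrative only, not Google's pricing) of the settlement's
# pricing dynamic: individual book prices track demand, and a library
# compares the institutional subscription against one-off purchases.

def book_price(demand, floor=1.99, ceiling=29.99):
    """Price a single title by scaling demand (a popularity score in
    [0, 1]) between an assumed floor and ceiling price."""
    return round(floor + demand * (ceiling - floor), 2)

def cheaper_option(demands, subscription_price):
    """Return whichever route costs less for the titles a library
    actually needs: 'subscription' or 'one-offs'."""
    one_off_total = sum(book_price(d) for d in demands)
    return "subscription" if subscription_price <= one_off_total else "one-offs"

# A library needing only a few low-demand titles is better off buying
# one-offs than paying for an overpriced subscription...
print(cheaper_option([0.1, 0.05, 0.2], subscription_price=500.0))
# ...while broad demand across many titles favours the subscription.
print(cheaper_option([1.0] * 100, subscription_price=500.0))
```

The point of the sketch is the feedback loop: if the subscription price drifts above what one-off purchases would cost, libraries defect to one-offs, which is exactly the check on prices Haugen argues for.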

The Gang were in agreement that eBooks will soon overtake paper ones as the de facto delivery format; it is just a question of how soon. Some believe that this will happen much more rapidly than many librarians expect. The challenge for librarians is to take their services into this eReading world.

Google Book Scanning Project – Issues and Updates

Last night I listened to another Educause webinar – something that is developing into a (good) habit. This week’s was entitled The Google Book Scanning Project – Issues and Updates, and featured presentations and discussion between Dan Clancy, Engineering Director of Google Book Search, and Jonathan Band from the Library Copyright Alliance.

Even though the current negotiations are US-specific, it’s still a good idea for librarians everywhere to keep themselves up-to-date on progress on this area. This webinar provides a useful overview of the project, but if you haven’t got a full hour to spare, a recent article written by William Skidelsky in The Observer – Google’s plan for world’s biggest online library: philanthropy or act of piracy? – should also do the job.

So I’ll leave it to those two sources to cover the basics. However, there are a number of concepts that are important to understand in order to follow the debate between the two sides, which is what this blog posting is really about.

First of all, Google is categorising all the books it scans into one of the following:
a. Public domain – defined as having been published before 1923.
b. Books published after 1923, but which are either out of print or orphaned works (around 75% of all books scanned).
c. Books still in print.

Secondly, Google is planning to offer a number of different access models, the most noteworthy being:
a. Preview uses
b. Online consumer access – enabling users to buy online access to individual works under a pricing regime set by either the rightsholder or Google.
c. Institutional subscription – on a FTE basis, for HEIs and corporations
d. Public Access Terminal – one free terminal per US public or university library.

Thirdly, an independent Books Rights Registry (no website as yet) will be set up to represent rightsholders and to collect and distribute revenues as well as resolve disputes.

Well that covers a lot of Dan Clancy’s presentation, although it’s worth mentioning in passing that Clancy does come across as being genuinely philanthropic, as the Observer article also noted.

So now let’s move onto Jonathan Band, who was there really to cover the pros and cons of the project as it currently stands.

Band had many good things to say about the Google Book Settlement, painting a rosy picture of where we’ll be if the Settlement is approved. Firstly, of course, Google will be able to continue scanning books into its search index. Notable benefits for users include free full-text access through public access terminals, and the ability to purchase access to out-of-print books at relatively low cost. Meanwhile, institutions will be able to purchase access to the full text of millions of books, and those participating in the project will receive digital copies of their collections.

As Band said, all in all there’s a lot to like.

And yet the project has generated considerable controversy. Why is this?

One frequently made argument concerns the absence of competition for what is bound to become an essential facility. Google has already scanned 10 million books in 5 years, so it has a huge competitive advantage. Here is a situation in which there is enormous demand yet no other supplier, so there is a risk of a cost-prohibitive subscription, which might undermine equity of access, privacy and intellectual freedom.

The business model is also contentious. Together, Google and the Books Rights Registry (with arbitration if necessary) will set the price of the institutional subscription. Google’s objectives in pricing are the realisation of revenues at market rates and of broad access to books. The parameters for pricing include pricing of “similar products and services”, and Band is concerned that if eJournal subscriptions are used as a benchmark, then the subscription could be cost-prohibitive for many institutions.

Only Google’s library partners have the right to a separate price negotiation route. And even then, refund is limited to Google’s share (37% of price).

For Band, the solution is that rather than ask the court to reject the Settlement, we should ask the court to closely supervise the interpretation and implementation of the Settlement, given that this is a natural monopoly needing regulation. Band is also anxious to ensure diverse composition of the Books Rights Registry, encompassing author representation in particular.

Clancy countered this by emphasising that Google cares deeply about the pricing, and is making this investment because it believes in broad access; a limited access project will be inconsistent with their vision. Clancy compared the planned price of a typical book under the terms of the Settlement with the price of a journal article, which can cost around $30. To me this seemed like a fudge. The original argument that Band made was around the cost of the institutional subscription, so why didn’t Clancy use the price of an eJournal subscription as a comparator? He also argued, though, that the vast majority of books will be cheaper than ILLs.

Clancy didn’t touch the issue of competition, emphasising customer choice instead, i.e. libraries can decide that the subscription is too expensive and opt for the free services. Again, this lacked conviction. No library worth its salt would build its collection on such a restrictive basis. He did, though, mention the lack of competition and choice in the eJournal marketplace.

He also dismissed as stupid the suggestion that people will get rid of their physical books. This seemed strange, as Band hadn’t made that argument.

The killer argument for me was made by Band towards the end of the webinar. He argued that we all want to trust Google. The Settlement is fundamentally desirable. And the people who are at Google right now seem eminently trustworthy. However, ownership can change, and that is why some degree of quasi-regulation is necessary. Clancy could only reply by saying that Google’s library partners (i.e. only the partners and not libraries as a whole) would have the right to arbitrate with Google if they felt the pricing was unfair.

Niche Print on Demand services on the rise

Today in the Times Higher Education (THE), Matthew Reisz reports on the growth of niche print on demand (POD) services offered by academic libraries and university presses in both the UK and the US. While the Google Book Settlement moves through its long and laborious negotiation process, a small handful of libraries have taken the initiative and are making an increasing number of books available via print on demand.

Probably the most well-known of these is University of Michigan’s growing Michigan Historical Reprint Series, which recently announced the availability of 400,000 additional titles. But the THE  article also highlights similar developments at Cornell University Library and Cambridge University Press.

These developments surely deserve our congratulations. They have succeeded in generating a new revenue stream, which, as all librarians know, is easier said than done, especially in what is now a very risk-averse climate. Furthermore, any initiative that broadens availability of long tail publications has got to be a good thing. And lastly, whatever we think of the Google Book Settlement, a bit of healthy competition can only be a good thing for all parties.