Panlibus Blog

Archive for March, 2005

Yet more OpenSearch discussion…

The release of Amazon A9’s OpenSearch protocol, and the discussions that followed, have raised some questions about what type and sophistication of search capability Library and Information systems should offer.

The latest shots in this debate come from Lorcan Dempsey and Thom Hickey at OCLC, both referencing their colleague Ralph LeVan.

To caricature the situation:

In one camp we have the information scientists and librarians extolling the virtues of powerful, flexible search systems that allow the user to describe in the finest detail, down to an individual part of a MARC tag, what they are searching for, and then to combine that with other equally detailed search elements, limited by many things such as language, format, dates, and author’s inside leg measurement. [I did say it was a caricature]

In the other camp we have the proletariat of users who find putting more than two words into an Amazoogle prompt a bit of a strain.

The first group delight in a search screen with more prompts than you can shake a stick at; the rest have never clicked an Advanced Search link in their life.

So towards which group do the Library/Information system developers and suppliers concentrate their efforts, and develop protocols to support? I contend that we need to service both these communities, as fully as possible. Without the former there would be far less stuff catalogued for the latter group to reliably search for and find.

Thom questions whether there is a middle ground between SRU and OpenSearch. I think the answer is that there is something between these two that is worth discussing. Whether it is in the middle, I’m not so sure. Ralph commented, and I replied in more detail than here, on one of my previous posts on the subject.

Ralph has offered to help develop guidelines on how an SRU and OpenSearch compatible solution might emerge. His experience with Z39.50, SRU/W, and metasearch will be invaluable here. I am also happy to get involved in such discussions, maybe coming at it from the other end by wearing a hat bearing the legend “Unadventurous member of the Internet Proletariat”.

“It’s not a system that would impress a librarian, but…”

Folksonomies remain in the news, with Jack Schofield in last Thursday’s Guardian reporting back from the Emerging Technology conference in San Diego, California that “Folksonomy was the big, bad, buzzword”.

Schofield asserts that folksonomies, as used by sites such as Flickr for sharing photos or del.icio.us for web links, would not “impress a librarian”. But “they are also important because this is probably the only viable way of tagging billions of items on the net. No one is going to hire millions of trained librarians to do the job”.

It’s not that librarians haven’t tried. In 1998 OCLC launched the CORC project, turning its vast cataloguing expertise to “taming the Web” with the prospect of a catalogue of Web content on the scale of OCLC’s huge bibliographic database, WorldCat. “Both full USMARC cataloguing and an enhanced Dublin Core metadata mode will be used”, it was announced. More modestly, at Talis we have Talis List, a web-based reading list system that allows academics and/or librarians to “harvest” (in a manner not unlike del.icio.us) and categorise web sites very simply and add them to a course “resource list” for students.

It’s not just “tagging” technology that is challenging librarians. As regular readers of this blog will know, Talis has been engaged in a project around RSS technology that has now expanded to include OpenSearch. Coincidentally OpenSearch was also featured at the Emerging Technology conference by Amazon’s Jeff Bezos. Richard Wallis has discussed our take on RSS and OpenSearch in more detail including his Talis Prism (library catalogue) OpenSearch proof of concept.

So, coming back to the first point, librarians may not be impressed with what seem to be simplistic approaches to cataloguing, classification or search. We know the problems are complex. The point, though, is that we can see our comfortable, complex, feature-rich but domain-specific technologies and standards, such as MARC and Z39.50, being challenged from outside the domain by companies with a bigger problem to solve: Web 2.0. That’s why, at Talis, we take them seriously and get involved.

Udell wooed by OpenSearch

To be honest, I wasn’t even planning to enable RSS subscription to InfoWorld search. It just came for free. When that happens, it’s a sign that things are deeply right.

Jon Udell on InfoWorld has a play with A9’s OpenSearch.

When I heard about OpenSearch, I wondered how hard it would be to integrate my new view of InfoWorld search as a “column” in A9. As I soon learned, it’s almost trivial.

I know I’ve been banging on about OpenSearch a bit since it was announced, but in the same way that its parent RSS has rapidly changed the way we find out about things happening, I get the feeling that A9 are going to get the credit for instigating a rapid change in the way searching hangs together.

Don’t get me wrong, OpenSearch is a long way short of a search utopia, but with a little bit of evolution [a couple of 1.x updates and then a version 2.0] I believe it stands a chance of becoming the utility search protocol that could knit together many of the web’s search nodes into a cohesive unit.

Time will tell as to the quality of my crystal ball skills. Nevertheless, reading between his lines, Jon Udell seems to agree that there is something in this worth watching.

Podcasting from Open Stacks

I’ve just been listening to the podcast [mp3] by Greg Schwartz of Open Stacks on his experiences at last week’s CIL show.

Apart from his deserving our sympathy (he was obviously suffering from ‘Man Flu’ throughout the event), his insight into what he saw was interesting as well: not just the presentations but the buzz around the show, and his observations on the communities of users whose radar we need to get the Library to show up on.

The main reason I’m blogging this is because it’s great to see libraries entering the world of Podcasting. Since the coming of Podcasting and rewritable CDs, my car journey home from the office has been a far more informative, interesting, and entertaining experience. (Yes, I’m one of the few on the planet without an iPod)

People like Greg, and the guys from IT Conversations [btw I wonder what has happened to the mostly excellent Gillmor Gang, they appear to have fallen off the planet] have opened my eyes & mind to loads of things relevant, but not necessarily directly connected, to what we all do and think about.

Dave Errington, Talis CEO, in his keynote at the Talis Insight Conference last November, predicted that Podcasting, along with IM, RSS, Blogging, etc., will start to gain greater influence in our world. It’s great to see his prediction coming true so soon.

Anyway, back to the cause of this posting. Well done Greg, keep it up. Hope you get to Internet Librarian in October and Podcast from there, and maybe we get to meet up.

LISFeeds.com

Panlibus now feeds into LISFeeds.com.

A Talis demonstrator for the new evolution of RSS – A9’s OpenSearch

You can always tell when a technology has become established: it starts being used for something else. How many remember that HTTP was a protocol designed just to shift hypertext around a network, or that SGML, with its offspring HTML and XML, was just something to make typesetters’ lives easier?

Well, with Amazon A9’s announcement of OpenSearch, RSS has reached that stage in its evolution. Whatever the arguments about what the letters RSS actually stand for [my favorite, and the one in the specification, is still Really Simple Syndication], its use up until now has been about providing alerts or newsfeeds of events to users.

The few variations on this such as Podcasting, and our own Personalised RSS, are still basically all about alerting you of events. Even MSN’s RSS Search alerts you that a search is now returning some new results.

So how has OpenSearch evolved beyond the original concept of RSS?

Firstly, the starting point. RSS is not only really simple but it is established. If you want to do anything with it from a development point of view, there is sufficient stuff out there for you not to have to bother with any of the low-level stuff. If you are in the Java world, just pull down Rome and away you go; similarly in the .NET universe.

Secondly, the problems its components solve are very similar to other problems out there. This is where OpenSearch gets into the story. Newsfeeds provide lists of generically described items in a results set. For a search engine to return an answer it needs to provide lists of generically described items in a results set.

So with the launch of OpenSearch, A9 have created an instant community for search that most engines would want to be part of, and have made it very easy to join.

The best way to prove exactly how easy is to do it.
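To make the overlap concrete, here is a minimal Python sketch that reads a search response in much the same way a feed reader would read a newsfeed. The sample document is made up for illustration, and the namespace and paging element names (totalResults, startIndex, itemsPerPage) are my reading of the OpenSearch 1.0 RSS extension, so check them against the A9 specification before relying on them.

```python
import xml.etree.ElementTree as ET

# Namespace assumed from the OpenSearch 1.0 RSS extension; verify against the spec.
OS_NS = "http://a9.com/-/spec/opensearchrss/1.0/"

# Hypothetical response: an ordinary RSS 2.0 channel plus three paging elements.
SAMPLE = """<rss version="2.0" xmlns:openSearch="http://a9.com/-/spec/opensearchrss/1.0/">
  <channel>
    <title>Example catalogue search: abbey</title>
    <link>http://example.org/search?keyword=abbey</link>
    <description>Made-up results for illustration</description>
    <openSearch:totalResults>42</openSearch:totalResults>
    <openSearch:startIndex>1</openSearch:startIndex>
    <openSearch:itemsPerPage>2</openSearch:itemsPerPage>
    <item>
      <title>First made-up title</title>
      <link>http://example.org/item/1</link>
      <description>Short description of the first item</description>
    </item>
    <item>
      <title>Second made-up title</title>
      <link>http://example.org/item/2</link>
      <description>Short description of the second item</description>
    </item>
  </channel>
</rss>"""


def parse_results(xml_text):
    """Return (paging, items) from an RSS document carrying OpenSearch paging elements."""
    channel = ET.fromstring(xml_text).find("channel")
    # The only search-specific part: three integers describing the result set.
    paging = {
        name: int(channel.findtext("{%s}%s" % (OS_NS, name)))
        for name in ("totalResults", "startIndex", "itemsPerPage")
    }
    # Everything else is plain RSS, exactly as in a newsfeed.
    items = [
        {"title": item.findtext("title"), "link": item.findtext("link")}
        for item in channel.findall("item")
    ]
    return paging, items


paging, items = parse_results(SAMPLE)
print(paging)
for entry in items:
    print(entry["title"], entry["link"])
```

The item-handling half of the function would work unchanged on any RSS 2.0 newsfeed; only the paging lookup is specific to search.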

And over the last couple of days, that is what I have done. Talis now have a prototype OpenSearch interface to the demonstration Prism Library OPAC.

The following link takes you to the OpenSearch standard description document: http://demo.talis.com:6080/TalisPrism/OpenSearch.xml Its contents describe the OpenSearch service as provided by Prism, and the way to access it.

The ‘Url’ element contains the URL used to access the service, encoded in the OpenSearch Query Syntax. By replacing the ‘{}’ encoded tags with values, A9’s service can construct requests to search and then page through sets of results, thus:

* http://demo.talis.com:6080/TalisPrism/URLServer?Service=OpenSearch&keyword=abbey&startIndex=1&itemsPerPage=10
* http://demo.talis.com:6080/TalisPrism/URLServer?Service=OpenSearch&keyword=abbey&startIndex=10&itemsPerPage=10
* http://demo.talis.com:6080/TalisPrism/URLServer?Service=OpenSearch&keyword=abbey&startIndex=20&itemsPerPage=10
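For anyone wanting to write their own client, the substitution step might be sketched in Python as below. The template string is hypothetical: the placeholder names ({searchTerms}, {startIndex}, {count}) are assumed for illustration rather than copied from the Prism description document, and fill_template is a helper name of my own.

```python
import re
from urllib.parse import quote_plus

# Hypothetical template: the placeholder names are assumptions, not taken
# from the real Prism description document.
TEMPLATE = (
    "http://demo.talis.com:6080/TalisPrism/URLServer?Service=OpenSearch"
    "&keyword={searchTerms}&startIndex={startIndex}&itemsPerPage={count}"
)


def fill_template(template, **values):
    """Replace each {name} placeholder in the template with its URL-encoded value."""
    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError("no value supplied for {%s}" % name)
        return quote_plus(str(values[name]))

    return re.sub(r"\{(\w+)\}", substitute, template)


first_page = fill_template(TEMPLATE, searchTerms="abbey", startIndex=1, count=10)
print(first_page)
```

With these values the function reproduces the first request in the list above; multi-word search terms are URL-encoded before being placed in the query string.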

Not rocket science, as you can see. So it won’t be long before A9 is not the only OpenSearch client on the block.

If you have been clicking on these links in your browser you will only be seeing XML. Try pasting the links into your favorite RSS reader to see the effect.

Better still, try it from A9 [You will have to register & login, but it’s worth it]. Enter the description URL [http://demo.talis.com:6080/TalisPrism/OpenSearch.xml] into their Create New column page and press load.

You will then see the description in a more readable form, and more importantly a preview of what the results will look like in A9 is loaded at the right of the page.

The exciting bit, from the potential user’s point of view, is that by clicking on the title of a result you are taken to the detail for the result, displayed in the Prism interface. If you were in a real library you could then go on and place a reservation request for the item, or discover which branches the item was held at, etc. As an aside, this functionality is provided by yet another technology that is spreading its wings beyond its original concept, OpenURL. But that’s another story.

So where next?

* The Talis Prism OpenSearch interface will move into project Bluebird, which is all about communicating with users, using technologies like RSS, SMS, etc.
* I fully expect OpenSearch to grow beyond A9. It has a great opportunity to become a de facto simple search interface. With a bit of help from the library community, there is no reason why it couldn’t be built upon to become a suitable alternative for some of our search protocols that are not so simple. [Do I hear a little cheer from the developers who have ever tried to get their head around implementing Z39.50?]

Changing role of public libraries

I’ve come across some research that deserves a wider hearing: an article by Douglas Grindlay and Anne Morris, published in the Journal of Documentation Vol 60, No. 6, 2004, p. 632-652, examines the causes of declining borrowing in UK libraries.

The strong conclusion is that increasing personal affluence is the single direct cause of decreasing issue figures.

This should be factored into thinking about the future role of libraries: the mandate for their existence is changing away from issuing books.

When will XML replace MARC?

This is the subject line of a thread that’s been running on the XML4LIB email list over the last couple of days. The question has been around almost as long as XML itself, but MARC is still very much with us.

Several writers in the thread argue that the question is wrong: XML cannot replace MARC because they are different things. For me, though, that confuses three different components in the MARC world: the MARC standard (ISO 2709), the different flavours such as MARC 21, which are like application profiles defining content designators and lists of values, and content standards, predominantly Anglo-American Cataloguing Rules (AACR). ISO 2709 is a kind of extensible markup language designed for data exchange and so could be replaced by XML, but it can only be done effectively when the other two components are re-aligned to modern requirements and to the flexible power of XML.

Not surprisingly, the Library of Congress’ MARCXML framework is discussed in the thread. In a strict sense, it replaces MARC, i.e. ISO 2709, with XML. But it deliberately emulates precisely the restrictive structural characteristics of MARC, enabling lossless round-trip conversion, to allow MARC data to be manipulated and transformed to other formats and contexts, using modern XML tools. Undoubtedly, this has been a tonic for the geriatric MARC, or (switching metaphors) it is a useful bridge or stopgap between the old world of MARC and a new world, as yet not fully realised, based on XML. It allows a little more value to be squeezed from the huge investment of systems and data in MARC.
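To see what that emulation looks like in practice, here is a small Python sketch that reads a MARCXML fragment. The record is a made-up example, not taken from any catalogue, and subfields is a helper name of my own; the namespace is the one used by the Library of Congress MARCXML schema. The XML carries MARC’s numeric tags, indicators and subfield codes as attributes rather than using named elements.

```python
import xml.etree.ElementTree as ET

# Namespace of the Library of Congress MARCXML ("slim") schema.
MARC_NS = "http://www.loc.gov/MARC21/slim"

# Made-up record for illustration only.
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam a2200000 a 4500</leader>
  <controlfield tag="001">example0001</controlfield>
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Author, Example.</subfield>
  </datafield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">A made-up title :</subfield>
    <subfield code="b">with a subtitle.</subfield>
  </datafield>
</record>"""


def subfields(record, tag):
    """Return (code, text) pairs for every subfield of the given MARC tag."""
    path = "{%s}datafield[@tag='%s']/{%s}subfield" % (MARC_NS, tag, MARC_NS)
    return [(sf.get("code"), sf.text) for sf in record.findall(path)]


record = ET.fromstring(SAMPLE)
title_parts = subfields(record, "245")
print(title_parts)
```

To find the title, the code still has to know that it lives in tag 245, subfields a and b; the XML itself does not say so.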

Some writers in the thread, however, criticise MARCXML for not being the panacea that it makes no claim to be. Its structure means that it is not well suited to XML indexing systems so performance is sub-optimal and, more importantly, it is not capable of articulating metadata in ways that are now required. Several writers call not only for better articulation of the metadata but also for a different set of metadata elements, more suited to modern requirements for search, navigation, data presentation and interchange between heterogeneous environments. Peter Binkley (University of Alberta) puts it well:

… we need metadata to aid not just searching but also clustering, linking to relevant external resources, etc. – all the things we can do in the new environment to enhance search results and other forms of access. The XML tools for using web services etc. are great and will get better much faster than anything MARC-based.

Here, though, we move into the territory of application profiles and content rules. As several other writers in the thread point out, an area of activity that could be leading the way to a full replacement of MARC is that based on the Functional Requirements for Bibliographic Records (FRBR). In the publishing world, it provided the conceptual model for the bibliographic elements of the Indecs framework, which led to the development of the ONIX format. Now, its principles are being built into the next edition of the Anglo-American Cataloguing Rules, AACR3. Although AACR3 will be capable of expression in MARC 21, it will push MARC’s capabilities closer to the limits. MARC records have been ‘FRBRised’ in a number of different initiatives with some success, but the work has clearly discovered shortcomings in the MARC format.

MARC will not be replaced by a single, dominant and self-contained metadata format. We can no longer even scope the contents of a ‘self-contained’ record. Increasingly, we require and have the ability to connect pieces of content dynamically and unrestrictedly, as we move towards the semantic web. The ‘replacement’ will be a metadata infrastructure. This is well argued by Roy Tennant in his article A bibliographic metadata infrastructure for the 21st century, summed up in his catchphrase ‘I never metadata I didn’t like’.

Dick Miller and his colleagues at the Lane Medical Library, Stanford University Medical Center, have done a great deal of impressive work to show the way forward for bibliographic data, in their development of XOBIS. A quote from his post to the thread makes a fitting end:

Some may think that MARC is robust since so many ILS systems use it, but ILS systems themselves are endangered, not able to respond with agility to changing technologies. For libraries to flourish, bibliographic data needs to be flexibly deployable in broader environments lest we will gradually lose relevance.

Public Library Impact Measures Published

In a positive move towards defining a common purpose for libraries, and one which delivers to Government priorities, the Public Library Impact Measures were launched this week. Details can be found on the MLA web site.

Parliamentary review of Public Libraries

The UK Parliamentary Select Committee on Public Libraries has now published its findings (pdf).
The background evidence is available in a separate report (pdf).

Whilst giving credit where it is due, it identifies the patchy nature of library provision in the UK and lack of clear focus and priorities across all services:

We regard a situation in which core performance indicators, and gross throughput, are falling—but overall costs are rising—as a signal of a service in distress.

Our key recommendations are designed to focus attention on libraries’ fundamental role in promoting reading and we seek to distinguish clearly between core functions and desirable add-ons (prioritising resources in favour of the former). There need to be far stronger links between national library standards (which themselves need improving) and effective mechanisms to encourage and enable library services to meet, if not surpass, them.

I’m sure this will provoke considerable debate!