Panlibus Blog

Author Archive

Panlibus now feeds into

When will XML replace MARC?

This is the subject line of a thread that’s been running on the XML4LIB email list over the last couple of days. The question has been around almost as long as XML itself, but MARC is still very much with us.

Several writers in the thread argue that the question is wrong: XML cannot replace MARC because they are different things. For me, though, that confuses three different components in the MARC world: the MARC standard (ISO 2709), the different flavours such as MARC 21, which are like application profiles defining content designators and lists of values, and content standards, predominantly Anglo-American Cataloguing Rules (AACR). ISO 2709 is a kind of extensible markup language designed for data exchange and so could be replaced by XML, but it can only be done effectively when the other two components are re-aligned to modern requirements and to the flexible power of XML.

Not surprisingly, the Library of Congress’ MARCXML framework is discussed in the thread. In a strict sense, it replaces MARC, i.e. ISO 2709, with XML. But it deliberately emulates the restrictive structural characteristics of MARC precisely, so that conversion is lossless in both directions and MARC data can be manipulated and transformed to other formats and contexts using modern XML tools. Undoubtedly, this has been a tonic for the geriatric MARC, or (switching metaphors) it is a useful bridge or stopgap between the old world of MARC and a new world, as yet not fully realised, based on XML. It allows a little more value to be squeezed from the huge investment of systems and data in MARC.
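To make the point about structure concrete, here is a minimal sketch in Python of reading a MARCXML record with a general-purpose XML library. The element names (record, leader, controlfield, datafield, subfield) and the namespace are those of the MARCXML schema; the sample record and the helper name subfields are invented for illustration. Note how the XML simply mirrors MARC's tags, indicators and subfield codes rather than naming the data semantically.

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"  # MARCXML namespace

# Invented sample record, for illustration only.
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam a2200000 a 4500</leader>
  <controlfield tag="001">example-0001</controlfield>
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Shakespeare, William,</subfield>
    <subfield code="d">1564-1616.</subfield>
  </datafield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">Hamlet /</subfield>
    <subfield code="c">William Shakespeare.</subfield>
  </datafield>
</record>"""


def subfields(record, tag, code):
    """Return the text of every subfield `code` in every datafield `tag`.

    Hypothetical helper: the caller still has to know that 245$a means
    'title', because the XML itself only carries the MARC designators.
    """
    ns = {"m": MARC_NS}
    path = f"m:datafield[@tag='{tag}']/m:subfield[@code='{code}']"
    return [sf.text for sf in record.findall(path, ns)]


record = ET.fromstring(SAMPLE)
title = subfields(record, "245", "a")
author = subfields(record, "100", "a")
print(title, author)
```

The lookup works with ordinary XML tooling, which is the benefit; the need to address data as "245, subfield a" is the limitation that the critics in the thread have in mind.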

Some writers in the thread, however, criticise MARCXML for not being the panacea that it makes no claim to be. Its structure means that it is not well suited to XML indexing systems so performance is sub-optimal and, more importantly, it is not capable of articulating metadata in ways that are now required. Several writers call not only for better articulation of the metadata but also for a different set of metadata elements, more suited to modern requirements for search, navigation, data presentation and interchange between heterogeneous environments. Peter Binkley (University of Alberta) puts it well:

… we need metadata to aid not just searching but also clustering, linking to relevant external resources, etc. – all the things we can do in the new environment to enhance search results and other forms of access. The XML tools for using web services etc. are great and will get better much faster than anything MARC-based.

Here, though, we move into the territory of application profiles and content rules. As several other writers in the thread point out, an area of activity that could be leading the way to a full replacement of MARC is that based on the Functional Requirements for Bibliographic Records (FRBR). In the publishing world, it provided the conceptual model for the bibliographic elements of the Indecs framework, which led to the development of the ONIX format. Now, its principles are being built into the next edition of the Anglo-American Cataloguing Rules, AACR3. Although AACR3 will be capable of expression in MARC 21, it will push MARC’s capabilities closer to the limits. MARC records have been ‘FRBRised’ in a number of different initiatives with some success, but the work has clearly discovered shortcomings in the MARC format.

MARC will not be replaced by a single, dominant and self-contained metadata format. We can no longer even scope the contents of a ‘self-contained’ record. Increasingly, we require and have the ability to connect pieces of content dynamically and unrestrictedly, as we move towards the semantic web. The ‘replacement’ will be a metadata infrastructure. This is well argued by Roy Tennant in his article A bibliographic metadata infrastructure for the 21st century, summed up in his catchphrase ‘I never metadata I didn’t like’.

Dick Miller and his colleagues at the Lane Medical Library, Stanford University Medical Center, have done a great deal of impressive work to show the way forward for bibliographic data, in their development of XOBIS. A quote from his post to the thread makes a fitting end:

Some may think that MARC is robust since so many ILS systems use it, but ILS systems themselves are endangered, not able to respond with agility to changing technologies. For libraries to flourish, bibliographic data needs to be flexibly deployable in broader environments lest we will gradually lose relevance.

Mobile and PDA technologies and their future use in education

This is the title of the latest JISC Techwatch report, published in November 2004, which I’ve just dipped into. Here’s their overview:

In recent years there has been a phenomenal growth in the number and technical sophistication of what can loosely be termed ‘mobile devices’ such as PDAs, mobile phones and media players. Increasingly these devices are also internet-enabled. This JISC report reviews the current state of the art, explores the potential uses within education and discusses some of the trends in technological development such as wireless networking, device convergence and ‘always-on’ connectivity.

An email update from one of the authors, via the Techwatch email list, last week, points out that there remains considerable uncertainty (‘fog’) around fast wireless access technologies, but the following conclusion serves to emphasise, for me, the need for libraries and their systems suppliers to be focusing on delivering data and services to these technologies:

… widespread adoption by students and staff of always-on mobile devices will partly be driven by the development of wireless broadband networks that can deliver the Internet to these devices. As the competition to deliver high speeds through the various technology paths increases so the likely time to market for low cost consumer solutions is likely to fall. As currently planned by manufacturers this kind of high speed access should be relatively normal by the end of the decade.

Although this has an academic library perspective, it will surely apply equally to actual and potential users of public libraries because this is about general consumer technology. Once again it’s a reminder to take the library to the users, use the technology that they use (redefining the meaning of ‘mobile’ for libraries!), or be ignored.

ALA, Boston, day 1

As usual, it has been a varied, busy and enjoyable first day at ALA here in Boston.

I kicked off by participating in a small discussion group hosted by OCLC on implementing the concepts of the Functional Requirements for Bibliographic Records, commonly known as FRBR (pronounced ferber). By exploiting the relationships between our current manifestation level records, search results can be grouped and presented to users in more meaningful ways, as well as retrieving relevant results that otherwise would be missed and eliminating irrelevant results. There are good examples already in the VTLS system and RedLightGreen, and OCLC’s FictionFinder and xISBN services. But there is scope for more, such as grouping and filtering results according to the user’s preferences. Cataloguing efficiencies could be achieved and quality and consistency of cataloguing improved by sharing records at the Work level. This is all getting closer to becoming a reality, with the changes to content rules coming through in AACR3 and with XML-based technologies.
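As a rough illustration of grouping manifestation-level records into Work-level sets, here is a small Python sketch. It is a deliberately crude stand-in, not the method used by any of the services named above: it assumes that a normalised author and title pair is enough to identify a Work, which real data will often contradict. The records, field names and function names are all invented.

```python
from collections import defaultdict

# Invented manifestation-level records, for illustration only.
RECORDS = [
    {"id": "m1", "author": "Shakespeare, William", "title": "Hamlet", "year": 1992},
    {"id": "m2", "author": "SHAKESPEARE, William", "title": "Hamlet.", "year": 2003},
    {"id": "m3", "author": "Austen, Jane", "title": "Emma", "year": 1996},
]


def work_key(record):
    """Build a crude Work-level key from normalised author and title."""
    def norm(text):
        # Lower-case, drop punctuation, collapse runs of whitespace.
        kept = "".join(ch for ch in text.lower() if ch.isalnum() or ch.isspace())
        return " ".join(kept.split())
    return (norm(record["author"]), norm(record["title"]))


def group_by_work(records):
    """Map each Work key to the ids of the manifestations that share it."""
    groups = defaultdict(list)
    for rec in records:
        groups[work_key(rec)].append(rec["id"])
    return dict(groups)


works = group_by_work(RECORDS)
for key, manifestation_ids in works.items():
    print(key, manifestation_ids)
```

A search interface could then show one line per Work and expand it to the individual editions on request.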

Standards was the theme of my other two sessions. ‘Codified Innovations: Data Standards and their Useful Applications’ focused on standards relating to the control of e-journals. This is a field that is suffering from a combination of a lack of standardisation and a lack of implementation of available standards. Frieda Rosenberg and Diane Hillman have done some interesting work recently on holdings data, where a lack of standardisation is, for example, impeding the quality of results from link resolvers. Their work also called on FRBR concepts: An approach to Serials with FRBR in Mind. We also had an update on the revision of the ISSN, which has had a very troublesome time finding its way through deeply conflicting interests. It seems that consensus has formed around re-affirming the current definition, with the expectation or hope that the process of doing this will lead to publishers being more consistent in applying the ISSN assignment rules. There will also be a new, title-level ISSN to support library requirements and it is hoped that a place in MARC field 024 can be defined for it.

Finally, the Automation Vendor Information Advisory Committee (AVIAC) explored the issues for systems vendors around the implementation of 13-digit ISBNs. Those present seemed to have a fair grasp of the implications and we heard some useful background information from a member of the ISO ISBN Revision Committee. A key point for me was that library system vendors should not ignore the possibility that their customers might want to use the 14-digit Global Trade Item Number (GTIN), where the extra digit specifies an aggregation of a particular product such as a carton of the new Harry Potter. More on this when I give my presentation on Monday.
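For anyone wondering what the change involves at the level of the number itself, here is a short Python sketch. It assumes the standard scheme: a 10-digit ISBN becomes a 13-digit one by prefixing 978, dropping the old check digit and recalculating it with the alternating 3 and 1 weights used for EAN and GTIN numbers; a 14-digit GTIN then puts one indicator digit in front and recalculates the check digit again. The function names are my own, and the meaning attached to any particular indicator value is left to the caller.

```python
def gs1_check_digit(data_digits: str) -> str:
    """Check digit for a string of data digits (everything except the check).

    Weights alternate 3, 1, 3, 1 ... starting from the rightmost data digit.
    """
    total = 0
    for position, ch in enumerate(reversed(data_digits)):
        weight = 3 if position % 2 == 0 else 1
        total += int(ch) * weight
    return str((10 - total % 10) % 10)


def isbn10_to_isbn13(isbn10: str) -> str:
    """Convert a 10-digit ISBN to its 13-digit form with the 978 prefix."""
    digits = isbn10.replace("-", "").replace(" ", "")
    if len(digits) != 10 or not digits[:9].isdigit():
        raise ValueError("expected a 10-character ISBN")
    body = "978" + digits[:9]  # the old check digit (possibly 'X') is dropped
    return body + gs1_check_digit(body)


def isbn13_to_gtin14(isbn13: str, indicator: int) -> str:
    """Prefix an indicator digit to a 13-digit ISBN and recompute the check."""
    digits = isbn13.replace("-", "").replace(" ", "")
    if len(digits) != 13 or not digits.isdigit():
        raise ValueError("expected a 13-digit ISBN")
    if not 0 <= indicator <= 9:
        raise ValueError("indicator must be a single digit")
    body = str(indicator) + digits[:12]
    return body + gs1_check_digit(body)


isbn13 = isbn10_to_isbn13("0-306-40615-2")
carton = isbn13_to_gtin14(isbn13, 1)
print(isbn13, carton)
```

The practical point for system vendors is visible in the last step: the carton number is not simply the ISBN with a digit stuck on the front, because the final check digit changes too.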

Open Amazon – lesson for libraries?

There’s a fascinating article, ‘Amazon: Giving Away the Store’, in the January issue of Technology Review.

It describes how Amazon has opened up access to the riches of its product database via web services, allowing developers anywhere and everywhere to grab data and re-use it to enhance their own sites. Sales have to be routed through Amazon, but the satellite site gets a commission. This exposes Amazon to an even wider potential market whilst outsourcing the development cost and creativity (as well as some of the profit).

Apart from the possibilities for libraries to use Amazon web services, which has been happening for some time, there is a clear parallel here with what libraries and their systems need to be doing with their own content and services: separating presentation from business logic and content so that they can offer their content and services beyond the OPAC in the places where the users are and presented in ways that are appropriate to those places. This renews the old library adage, ‘get the stuff to the chaps.’

Much has been written about the chaps being mostly at Google and Amazon and, more generally, in the ‘open’ web. One example of the open Web that uses Amazon web services is a site that monitors books being discussed in blogs. In addition to the link to Amazon, there should also be an option to find the book in the user’s preferred library. This is provided at the independent Amazon Light but, for a global audience, a single list of libraries all over the world is a crude and ineffective mechanism. What it needs is an embedded service linking to a maintained directory of libraries, providing robust links and a good method to enable the user to select their preferred library from the huge number available.

A vision for the E-Learning Framework

The E-Learning Framework is a major international initiative that has important implications for libraries. The UK part of it is the JISC e-Learning Programme supported by CETIS. Where could the international e-Learning Framework be in five years’ time? Some of the key international partners have outlined their vision.

Dan Rehak of the Learning Systems Architecture Lab, Carnegie Mellon University in the US

… hopes that in five years time there will be sufficient web service alternatives in each of the ELF service definitions or ‘bricks’ to allow institutions to choose the services most relevant to them and their institutional e-learning infrastructure. We mustn’t lose sight of the ultimate aim which is better learning opportunities for students.

Kerry Blinco and Neil McLean of the Department of Education, Science and Training (DEST) in Australia

… have an air of confidence that the service oriented approach will succeed. That confidence is probably built on the experiences of working with the Tasmanian Education Department who have successfully built a service based education environment. The Learning Architecture Project (LeAP) is delivering a number of interoperable online applications to enhance teaching and learning in 218 schools and colleges across Tasmania. …

Neil thinks that the framework is now at the cottage industry phase where academics, software developers and policy makers are involved in its development. In five years time Neil predicts that open source web services will have taken off and there will be a proliferation of teaching applications for people to use. At this stage it is important to keep both academics and software developers involved by using an iterative development process for the ELF that everyone feels that they can be part of.

Neil McLean co-authored, with Clifford Lynch of the Coalition for Networked Information, a key white paper on Interoperability between library information services and learning environments.

JISC Watching the Semantic Web

The JISC Technology and Standards Watch has commissioned a report on Semantic Web Technologies by Dr. Brian Matthews of CCLRC Rutherford Appleton Laboratory and Deputy Manager of the UK and Ireland Office of the W3C. ‘This JISC report will discuss the current state of the art of the Semantic Web, how it may impact the UK Higher and Further Education sectors, and how it may develop in the next few years.’

They’ve also commissioned another interesting report, mentioned on the same page: Future location-based experiences by Professor Steve Benford. This is about digital content adapted to the user’s location and delivered to portable or wearable devices through wireless communications. The brief doesn’t mention libraries but it set me imagining finding out, through my PDA, the nearest library with an available copy of a book that I’m looking for.

Metasearching: a new approach

The latest issue of the Library Journal has an article, ‘Moving Beyond Metasearching: Are Wrappers the Next Big Thing?’, about a $2 million project ‘to deliver electronic content no matter where the search is conducted’. The functional ideas are interesting and have the crucial benefit of being easy and intuitive to use.

But I need to find out about ‘wrappers’ before I can understand the technology behind it. I must be missing something – from the description given in the article, it sounds like the XML equivalent of HTML screen-scraping, which the NISO Metasearch Initiative is seeking to get away from. I particularly like the sentence: ‘All results from the same vendor are returned in the same layout wherever you search.’ It must be true if they’re throwing $2 million at it, right?

FRBR News and Prototype Catalogue

A new FRBR Prototype Application has been made available on the Web. It is an experimental adaptation of the CDS/ISIS system. The intention is to release the software modules as either freeware or Open Source. Some brief information about the prototype with a few links is available.

The interface simply presents the user with search options based on the primary FRBR entities and the database is very small, but it demonstrates the principles and models the relations between all the entities.

A preprint version of a technical paper is available; it’s published in Cataloging & Classification Quarterly vol. 39 no. 3-4 2004. This is devoted to FRBR and edited by Patrick Le Bœuf of the Bibliothèque nationale de France, who is at the hub of IFLA’s work on FRBR. The title of the CCQ double issue is FRBR: hype, or cure-all? There has certainly been a lot of hype about it and Patrick is the first to point out that it is an imperfect conceptual model, but it does seem to offer possibilities to improve the performance of catalogues for key types of material such as literary works and music to better fulfil Cutter’s principles.

Scanning the contents list for CCQ 39, 3-4, there are many fascinating articles that I look forward to reading from key players in cataloguing research, but there doesn’t seem to be a user study. Given the well-documented user preference for Google’s simplicity, I would have thought that those who are investing in the application of FRBR concepts would want to know whether their systems are going to appeal to their intended user-base and how best to design their user interfaces to do so.

RSS – good article

RSS offers libraries the potential to deliver services in new ways and to create completely new services, with relatively little effort or cost. What is RSS and how can it serve libraries? [pdf] is a thorough 14-page introduction to RSS, how it works and how it can be applied, with a full section on its potential for libraries. Well worth a read by anyone wanting to be inspired to exploit RSS for library services.
Found via blogwithoutalibrary.
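To give a feel for how little effort is involved, here is a minimal Python sketch that builds an RSS 2.0 feed of new titles. It is not taken from the article: the function name build_feed, the library name and the URLs are invented, and the feed carries only the required channel elements (title, link, description) plus a title and link for each item.

```python
import xml.etree.ElementTree as ET


def build_feed(title, link, description, items):
    """Return a minimal RSS 2.0 document as a string.

    `items` is a list of dicts with 'title' and 'link' keys.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # The three elements every RSS 2.0 channel must have.
    for name, value in (("title", title), ("link", link), ("description", description)):
        ET.SubElement(channel, name).text = value
    for entry in items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = entry["title"]
        ET.SubElement(item, "link").text = entry["link"]
    return ET.tostring(rss, encoding="unicode")


# Invented sample data, for illustration only.
feed_xml = build_feed(
    "New titles at Example Library",
    "https://library.example.org/new",
    "Recently added items",
    [{"title": "Hamlet", "link": "https://library.example.org/record/1"}],
)
print(feed_xml)
```

In a real service the item list would come from a catalogue query, such as titles added in the last week, rather than being typed in by hand.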