Panlibus Blog

Archive for December, 2008

Library of Congress forces LCSH Linked Data site to shut down

Back in May I was among those who welcomed the initiative by Talking with Talis interviewee Ed Summers in setting up lcsh.info.  Ed set up the site to demonstrate how the Library of Congress Subject Headings could be represented as a Semantic Web application using SKOS.
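For readers who have not met SKOS, here is a minimal sketch of what representing a subject heading as a SKOS concept might look like. It is not Ed's code: the `example.org` URIs, the helper names and the sample heading relationships are all assumptions made up for illustration, and the output is hand-assembled Turtle rather than the product of an RDF library.

```python
# Illustrative sketch only: URIs, labels and helper names are hypothetical,
# not taken from lcsh.info or from the Library of Congress.
SKOS_NS = "http://www.w3.org/2004/02/skos/core#"


def _literal(text, lang="en"):
    """Return a Turtle string literal with a language tag."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{lang}'


def concept_to_turtle(uri, pref_label, alt_labels=(), broader=(), narrower=()):
    """Serialise one subject heading as a skos:Concept in Turtle."""
    statements = ["a skos:Concept", f"skos:prefLabel {_literal(pref_label)}"]
    statements += [f"skos:altLabel {_literal(label)}" for label in alt_labels]
    statements += [f"skos:broader <{target}>" for target in broader]
    statements += [f"skos:narrower <{target}>" for target in narrower]
    body = " ;\n    ".join(statements)
    return f"@prefix skos: <{SKOS_NS}> .\n\n<{uri}>\n    {body} .\n"


# A made-up heading with one alternative label and one broader concept.
example = concept_to_turtle(
    "http://example.org/subjects/semantic-web#concept",
    "Semantic Web",
    alt_labels=["Semantic integration (Computer systems)"],
    broader=["http://example.org/subjects/world-wide-web#concept"],
)
print(example)
```

The point of the exercise is that each heading gets its own URI, so the broader and narrower links become things other sites can follow and point at.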

In the intervening months many, including myself, pointed to Ed’s work as an example of how publicly available data could, through open Linked Data principles, become a valuable part of sites and services across the globe.  For instance, another Talking with Talis interviewee, Martin Malmsten from the Royal Library of Sweden, almost immediately made use of the links to the LCSH data.  Ed went on to get lots of feedback, and wrote a paper which he then presented at DC2008.

It is therefore with great disappointment that I read this on the lcsh.info site the other day:

On December 18th I was asked to shut off lcsh.info by the Library of Congress. As an LC employee I really did not have much choice other than to comply.

As an LC employee he was put in an untenable position when they evidently decided that they didn’t like this useful service, based on publicly available data, being delivered from a domain that doesn’t end in loc.gov.  I wonder if there are any other Linked Data enthusiasts, not held back by who their employer is, who would pick up where he left off?

Ed goes on to say:

It was always my intention for concept URIs at lcsh.info to be cool. I advertised the service as ‘experimental’ and indicated it was going to hopefully inform the development of a similar continually updated service at LC where I work. …  My thought was I could leave the service running until there was something similar at LC that I could redirect the concept URIs to. After a year or two when people had rewritten there data to point at loc.gov I could retire lcsh.info. I never imagined I would be asked by LC to take it down.

LC should have listened to Ed in the first place and taken the high ground by leading the work of creating a semantic web of data from their valuable publicly available data.  At the end of his post Ed hints that LC is still considering running a service like lcsh.info at loc.gov, but it’s not there yet.  Why, oh why, did they not learn from his work and ride the wave by introducing their own service based on his great initiative?  Instead they present to the world a short-termist, not-invented-here attitude that reminds me of other well-established leviathans of the world of library metadata.

Let’s hope that Ed’s hint is correct and we will soon be able to welcome the release of Open Linked LCSH and other Data from the electronic portals of the LofC.

Traffic Squad Police (LOC) image published in The Library of Congress’ photostream on Flickr

OCLC Record Sharing, Yogurt, and Copyright

Karen Schneider, aka the Free Range Librarian, has published her reflections on the saga of OCLC’s change from record sharing guidelines to a policy.  Like many others, she uses an analogy to describe how OCLC is trying to protect its interests.

OCLC has made a policy clarification that in the short run is a perfectly reasonable claim intended to protect the interests of its members and the body of data accumulated under its aegis. In this intellectual-property model, you are a yogurt-maker. I am your distributor. I charge you for this service, and if someone else tries to take your yogurt from my warehouses or steal it from the trucks that deliver your yogurt to stores, I set my dogs after them.

As Rob Styles points out in the comments to her post, there is a fundamental difference between yogurt and records:

We allow distributors of yoghurt to “release the dogs” on thieves because yoghurt is a rival good. That means if I take it then you are left without it. Records are a digital, and therefore non-rival good. Everyone can have a copy and OCLC would still have them. They don’t get “used up”!

This is one of many examples where commentators are thrashing around trying to explain and/or predict the effect of the proposed licensing policy without having any clarity as to the intellectual property rights status of the records in question.

Following my Talking with Talis podcast with Karen Calhoun and Roy Tennant, I raised a question, copied on the Metalogue blog, which was intended to help clarify this situation:

What does the 1982 OCLC Copyright on WorldCat apply to – the records, schema and organization of records or the records themselves? And was that Copyright not superseded by the 1991 Feist Publications v Rural Telephone Service ruling? Does OCLC hold that individual records qualify for Copyright and if so what "originality" or "creativity" qualifies them for that Copyright protection?

Nearly a month ago, when this question was posted, Karen said she “will continue to research this”.  No doubt she has had to consult her legal department, but I am a little surprised at the time it is taking to get a response.  One would have thought that establishing the copyright status of WorldCat and its records would have been at the core of the licence from the start, and the answer therefore readily available.

Dave Pattern challenges libraries to open their goldmine of data

The simple title of Dave’s recent blog post ‘Free book usage data from the University of Huddersfield’ hides the significance of what he is announcing.

I’m very proud to announce that Library Services at the University of Huddersfield has just done something that would have perhaps been unthinkable a few years ago: we’ve just released a major portion of our book circulation and recommendation data under an Open Data Commons/CC0 licence. In total, there’s data for over 80,000 titles derived from a pool of just under 3 million circulation transactions spanning a 13 year period.

Thirteen years’ worth of library circulation data opened up for anyone to use: he is right about it being unthinkable a few years ago.  I suspect that for many it is still unthinkable now, and to them I would ask: why not?

In isolation the University of Huddersfield’s data may only be of limited use, but if others did the same, the potential for trend analysis, and the ability to offer recommendations and who-borrowed-this-borrowed-that services, could be significant.
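To make the who-borrowed-this-borrowed-that idea concrete, here is a minimal sketch of one way such a service could be derived from circulation transactions: count how often two titles were borrowed by the same person. This is my own illustration, not Dave's method. The `(borrower_id, title_id)` input format and both function names are assumptions; a published dataset may well expose only aggregated, anonymised counts rather than borrower-level rows.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_co_borrow_counts(loans):
    """Count how often each pair of titles shares a borrower.

    `loans` is an iterable of (borrower_id, title_id) pairs; this input
    shape is a hypothetical one chosen for the sketch.
    """
    titles_by_borrower = defaultdict(set)
    for borrower, title in loans:
        titles_by_borrower[borrower].add(title)

    co_counts = defaultdict(Counter)
    for titles in titles_by_borrower.values():
        # Each unordered pair of titles borrowed by one person counts once.
        for first, second in combinations(sorted(titles), 2):
            co_counts[first][second] += 1
            co_counts[second][first] += 1
    return co_counts


def recommend(co_counts, title, limit=3):
    """Return the titles most often borrowed alongside `title`."""
    return [other for other, _ in co_counts.get(title, Counter()).most_common(limit)]


# Tiny made-up sample: four borrowers, three titles.
sample_loans = [
    ("u1", "A"), ("u1", "B"),
    ("u2", "A"), ("u2", "B"), ("u2", "C"),
    ("u3", "A"), ("u3", "B"),
    ("u4", "A"), ("u4", "C"),
]
counts = build_co_borrow_counts(sample_loans)
print(recommend(counts, "A"))
```

With one library's data the counts are thin for all but the most popular titles, which is exactly why pooling data from many libraries matters.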

If you have 14 minutes to spare, I would recommend viewing Dave’s slidecast from the recent TILE project meeting, where he announced this, so you can see how he uses this data to add value to the University of Huddersfield search experience.

Patrick Murray-John picked up on Dave’s announcement and within a couple of days had produced an RDF-based view of this data.  I recommend you download the Tabulator Firefox plug-in to help you navigate his data.

Patrick was alerted to Dave’s announcement by Tony Hirst, who amplified Dave’s challenge: “DON’T YOU DARE NOT DO THIS…”

As Dave puts it, your library is sitting on a goldmine of useful data that should be mined (and refined by sharing with that of other libraries).  A hat tip to Dave for doing this, and another one for using a sensible open licence to do it with.

Picture published by ToOliver2 on Flickr

Google Analytics to analyse student course activity – Tony Hirst Talks with Talis

Tony Hirst, of the Open University Department of Communications and Systems, was recognised at the Online Information Conference 2008 for his work promoting new technologies in education by being presented with a commendation in the IWR Information Professional of the Year Award.

The award was presented at the end of the first day of the Online Information Conference 2008.  Earlier in the day Tony delivered a presentation entitled “Course Analytics – using Google Analytics to understand student behaviour in an online Open University course”.

I caught up with Tony just after his award and we retired to a side room to discuss what he had learnt from his work with Google Analytics.

 

Picture of Tony published on Flickr by MrGluSniffer

Catching the next wave

Catching the next wave was the title of my keynote presentation opening the “Catching the semantic wave – or drown in a sea of content?” session of the “Order out of chaos – creating structure in our information universe” track at the Online Information Conference 2008.  Presentation below from Slideshare.

[slideshare id=812920&doc=rjwonlinedec08-1228306147696648-8&w=425]

This is a very well attended track, with standing room only in most of the sessions and great interest in the Semantic Web, Web 2.0, and associated concepts and technologies.  From a lightly attended single session last year, this topic has grown into an oversubscribed second track this year.  Having spent some time last year bending the ear of conference chair Adrian Dale about what was upcoming, I can wear my virtual ‘I told you so’ hat with pride this year.

My job as keynote was to provide a broad introduction to, and context for, things like Linked Open Data, the Semantic Web, Cloud Computing and clouds of data, setting the scene for the day.  Hopefully I was successful in my objective; the number of attendees is certainly a measure of the interest in the topics covered.

Considering that a large proportion of the attendees of the conference are librarians, it is gratifying to note that they are already looking beyond the current Web 2.0 meme towards what will be washing over us next.  Thinking about this, it is hardly surprising.  The next wave is far more associated with data, metadata, linking and recommending than the Web 2.0 meme of social networking, blogging and wikiing.  Dare I say it out loud, but as a generalisation librarians appear to be far more comfortable with the concerns of data than with socially interacting.

I get the feeling that these concepts are going to be adopted in libraries far quicker than we would expect once they start to gain momentum.  This would be helped if we could get past some of the terminology confusion, the main culprit being the confusion between semantics/semantic analysis and the semantic web.  The web of data, as against [or to be more correct in addition to] the current web of documents, is how I see the semantic web.  A great example of the web of data in action is the Linking Open Data project.

Tony Hirst on using Google Analytics to understand student behaviour

Tony Hirst from the Open University gave an interesting presentation entitled ‘Course Analytics’ in the afternoon session of this first day of the Online Information Conference 2008.

As a lecturer he gets no feedback as to how his online course materials are used.  Normal web site analytics would be OK for the basic OU site, which is essentially a sales site for courses.  Traditional analysis tools track activity, what users download, etc.

Google Analytics (a free tool; you just need to put a snippet of code in each page to enable it) gives you more: trends of use, how long students stay on pages, which pages they leave from, what type of connections they use, and even such things as the screen sizes they are using.

The output enables the tuning of material to match the patterns of use, for example by changing the format to fit screens better, or by adjusting the amount of content to suit the time students normally stay on a page, although that could be a bit of a circular argument.

An obvious next move is to link in analysis of the use of other systems, such as the library, to get a broader picture of how students are, or are not, using course resources.

Jenny Levine @ Online Information Conference 2008

Good wifi, expensive coffee of questionable quality: the conference moves on after the great start from Clay Shirky.

You find me in track two with keynote speaker, and Talking with Talis interviewee, Jenny Levine.

Jenny starts with a proposal that we are now entering an eighth age of librarianship, in which librarians will have to start interacting with the web services cloud, where their users are using blogs, Facebook, Twitter, SMS, mobile phones, etc.  She referenced the previous seven ages as described in this 1999 paper by D. W. Krummel [pdf].

As examples she spoke about library sites that put Twitter feeds on their home pages, posting library announcements to them, or even serendipitous information about people borrowing new books.

She is a great proponent of RSS as a way of distributing information from library blogs, Twitter feeds, newly catalogued books, updates from journal suppliers, etc., and not only delivering it into individual users’ readers but into the library web site to provide automatically up-to-date pages.  RSS is the glue that links stuff together.
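As a rough illustration of RSS as glue, here is a minimal sketch of merging items from several RSS 2.0 feeds into one newest-first list that a library web page could render. It is my own example rather than anything Jenny showed. The function names and sample feeds are hypothetical, fetching the feeds over HTTP is left out, and it assumes every `pubDate` carries a numeric timezone offset.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime


def parse_rss_items(rss_xml, source):
    """Extract title, link and publication date from RSS 2.0 text."""
    root = ET.fromstring(rss_xml)
    items = []
    for item in root.iter("item"):
        pub_date = item.findtext("pubDate")
        items.append({
            "source": source,
            "title": item.findtext("title", default="").strip(),
            "link": item.findtext("link", default="").strip(),
            # RSS 2.0 dates use the RFC 822 format, which this helper parses.
            "published": parsedate_to_datetime(pub_date) if pub_date else None,
        })
    return items


def merge_feeds(feeds, limit=10):
    """Merge a {source name: RSS text} mapping into a newest-first list."""
    merged = []
    for source, rss_xml in feeds.items():
        merged.extend(parse_rss_items(rss_xml, source))
    # Undated items are dropped because they cannot be ordered.
    dated = [entry for entry in merged if entry["published"] is not None]
    dated.sort(key=lambda entry: entry["published"], reverse=True)
    return dated[:limit]


# Two made-up feeds standing in for a library blog and a new-books feed.
sample_feeds = {
    "blog": """<rss version="2.0"><channel><title>Library blog</title>
<item><title>Holiday opening hours</title><link>http://example.org/blog/1</link>
<pubDate>Mon, 01 Dec 2008 10:00:00 +0000</pubDate></item>
</channel></rss>""",
    "new-books": """<rss version="2.0"><channel><title>New books</title>
<item><title>Here Comes Everybody</title><link>http://example.org/books/2</link>
<pubDate>Wed, 03 Dec 2008 09:30:00 +0000</pubDate></item>
<item><title>Undated item</title><link>http://example.org/books/3</link></item>
</channel></rss>""",
}
latest = merge_feeds(sample_feeds)
for entry in latest:
    print(entry["published"].date(), entry["source"], entry["title"])
```

Run on a schedule, something along these lines would keep a "latest from the library" page current without anyone editing it by hand.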

Clay Shirky opens Online Information Conference 2008

Well, actually he was preceded by conference chair Adrian Dale, who put a fascinating counter of bytes created up on the screen.  Although a simulation, it drives home just how much information is being created.

Against this background Clay then presented on the theme of his latest excellent book, Here Comes Everybody, which also formed the starting point for the Talking with Talis podcast I recorded with him for the Online Information Conference series.

Don’t get me wrong, Clay’s book is good, but you can’t beat having him stood up there telling you about it.  Using examples such as the readers of a blog covering political unrest in Thailand who got upset when its author blogged about her new pink mobile phone; or flash-mobs being arrested in Belarus for ‘eating ice cream’; and many others, he showed how the cost of publishing has fallen to zero for most people, which means we can all do it, the ramifications of which are enormous.

For a more detailed commentary on his presentation check out Ewan McIntosh’s post, which appeared whilst Clay was still answering questions from the stage – a feat I could never attempt to compete with!