Panlibus Blog

Archive for the 'Tagging' Category

A cloud of clouds

Let me start with a question – what is the collective noun for clouds? In trying to dream up a catchy title for this post, which you will discover once I’ve stopped waffling is about Word Clouds, I tried to discover from colleagues and places like answers.com what you call a collection of clouds. Answers received so far: a host, a storm, a front, and the one I chose – a cloud. I’m sure someone out there will be able to put me right on this, I’ll be monitoring the comments with interest.

Anyway, why am I so interested in [word] clouds all of a sudden? Well, it is not all of a sudden. For a while I’ve been interested in word/tag clouds as a device for serendipitous browsing through a set of metadata, based upon the popularity of words within, or tags associated with, information.

Flickr, Technorati, and LibraryThing are all well-known examples of the use of these clouds in a user interface. More examples are appearing almost daily.

The thing that triggered me to write this post was the appearance of a word cloud on the site for the BBC’s radio station Radio 1. Scroll down to the bottom of the page and you should see a display of the most popular words contained in SMS text messages sent to the station. This is refreshed every couple of minutes or so, and so gives an insight into what the station’s audience is thinking about. With the station often receiving in excess of 1,000 messages per hour, the theme behind the words displayed is an aggregate of a fair amount of input. The tool that displays this also checks for well-known words, like the name of a group or DJ, and makes them a clickable link to more information.

The thing that struck me about this implementation is that the BBC just put it there with no explanation or hints, expecting that their online audience will understand that words in larger fonts are more popular than others in smaller fonts and the ones in blue are clickable. Not that many months ago I remember having to explain those concepts to those seeing Flickr and del.icio.us tag clouds for the first time.
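For anyone wondering what sits behind such a display, here is a minimal sketch in Python of the general technique: count the words, scale the font size by popularity, and turn recognised names into links. It is not the BBC's implementation, which I have not seen. The stop-word list, the `KNOWN_TERMS` link table, and the pixel sizes are all assumptions made up for illustration.

```python
import re
from collections import Counter

# Hypothetical lists for illustration; a real service would use far larger ones.
STOP_WORDS = {"the", "a", "to", "and", "is", "i", "for", "on", "in", "it", "of"}
KNOWN_TERMS = {"coldplay": "/artists/coldplay"}  # made-up link target


def build_cloud(messages, top_n=5, min_px=12, max_px=36):
    """Return [(word, font_px, link_or_None)] for the most frequent words."""
    words = []
    for message in messages:
        for word in re.findall(r"[a-z']+", message.lower()):
            if word not in STOP_WORDS:
                words.append(word)

    counts = Counter(words).most_common(top_n)
    if not counts:
        return []

    highest = counts[0][1]
    lowest = counts[-1][1]
    spread = max(highest - lowest, 1)  # avoid dividing by zero when all counts tie

    cloud = []
    for word, count in counts:
        # Linear scaling: the least popular word gets min_px, the most popular max_px.
        size = min_px + (count - lowest) * (max_px - min_px) // spread
        cloud.append((word, size, KNOWN_TERMS.get(word)))

    # Clouds are usually laid out alphabetically, with size carrying the popularity.
    return sorted(cloud, key=lambda entry: entry[0])
```

Running this again every couple of minutes over the most recent messages would give the refreshing display described above.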

The Web 2.0/Library 2.0 world is one where new user interface metaphors appear and become accepted very rapidly. Even so, I am still aware of some libraries that shy away from making changes to their OPACs until ‘there has been training’. All I can say to such organizations is that I think you will find your online audience is more astute and open to change than you think. By all means offer some ‘How to get the most from the new features’ sessions, but if you have to train in the basics you have probably got your interface wrong.

Another thing that made me think about word clouds today was a comment that somebody made in a telephone conversation about the Aquabrowser OnLine trials that I posted about the other day, which involve libraries, such as Islington Libraries, who have contributed to the Talis Platform. The comment, passed on from a further education college, was that the word cloud in the Aquabrowser OnLine interface could be of great help to those with dyslexia in identifying different spellings etc. It is another good example of how offering access to data by using new and innovative user interface metaphors, in addition to the traditional ones, can have unexpected beneficial consequences.


Addictive cataloguing by the masses

You’ve got to hand it to those Google guys for coming up with out-of-the-box thinking.

Take Google Image Labeler for instance.  The worst thing about this latest Beta from the World Domination stable of ideas is the name, as John Battelle points out.

As John also points out, what Google call labels the rest of the planet know as tags.

I just wish Google would use the terminology the rest of the web has already settled upon. It’s not a label. It’s a tag. “Tag” means something – an intentional attribute given to an object on the web. That’s what we are doing here. How about we help Google come up with a new name?

So what is it then?  It is two things:

  • An addictive bit of simple fun.  You are randomly partnered with someone else then the two of you have 90 seconds to agree on at least one label for each of the images [from within Google Image Search] you are presented with.  If you both enter the same label, you gain 100 points and another image is presented.

    An ideal bit of fun to dip into for a few minutes the next time you fill your coffee cup.  Be warned though: you may well still be playing it as you finally drain the cup!

  • An innovative way of building up a folksonomy around the images that Google reference.  By harnessing people’s natural addiction to this sort of game [at the moment someone named eGrunt has amassed the staggering total of 1,324,400 points – does this person sleep?], they are rapidly building up a human-validated set of search tags for their images – all for free.  At the moment there does not seem to be any value, other than kudos, attached to the points gained.
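The matching rule at the heart of the game is simple enough to sketch in a few lines of Python. This is my own illustration of the rule as described above, not Google's code: the function name and data shapes are invented, the 90-second timer is left out, and only the 100 points per agreed image comes from the game itself.

```python
POINTS_PER_MATCH = 100  # the figure quoted in the game description above


def play_round(image_ids, labels_a, labels_b):
    """Score one round between two partners.

    labels_a and labels_b map an image id to the list of labels each
    player typed. Returns (score, agreed), where agreed maps an image id
    to the set of labels both players entered.
    """
    score = 0
    agreed = {}
    for image_id in image_ids:
        # Normalise case and stray spaces so "Dog" and "dog " count as the same label.
        a = {label.strip().lower() for label in labels_a.get(image_id, [])}
        b = {label.strip().lower() for label in labels_b.get(image_id, [])}
        shared = a & b
        if shared:
            agreed[image_id] = shared  # a human-validated tag for this image
            score += POINTS_PER_MATCH
    return score, agreed
```

The interesting by-product is the `agreed` dictionary: every entry is a label that two independent people chose for the same image, which is what makes the resulting tags worth having.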

Google, like many of us who have tried to find relevant images with their Image Search, have identified that just scouring the page [that contains an image] for relevant keywords is not as useful as you would expect in cataloguing the image itself.

One unique advantage Google have in launching such an initiative is their global reach.  They launch a new Beta; within hours the Google watchers blog about it; within a day or so thousands are playing with it.

Would something like this work for cataloguing or tagging your dusty collection? Probably not, as most players would grow old waiting for a partner.  But how long before a Google Book Search version appears? In which case the question will be: will Google see this as more secret sauce, or would they provide an open API to it?

 

 


Belushi Book brings tears to cataloguers’ eyes

The Onion reports: Dewey Decimal System Helpless To Categorize New Jim Belushi Book

“With all due respect to the author, we remain unsure how to categorize this particular work,” said the chair of OCLC’s Editorial Policy Committee

I bet the social taggers, building up folksonomies, don’t have the same problems. To be fair, though, they are not trying to shoe-horn the book into a rigid classification system – mind you, isn’t that the point?

Listen to the Library 2.0 Gang

Anyway, apart from being mildly amusing, this gives me a good opportunity to recommend the Library 2.0 Gang podcast from a couple of weeks back on the subject of folksonomies and tagging – well worth a listen. On the Gang for this session were Casey Bisson, Ian Corns, Christina Pikas, Karen Schneider, and Tim Spalding.


Wikicat

The Wikimedia Foundation, the international non-profit organization behind some of the largest collaboratively-edited reference projects in the world including Wikipedia, has a project named Wikicat that has been running for the last few months.

Wikicat’s basic premise is to become the bibliographic catalog used by the Wikicite and WikiTextrose projects. The Wikicite project recognizes that “A fact is only as reliable as the ability to source that fact, and the ability to weigh carefully that source” and because of this the need to cite sources is recognized in the Wikipedia community standards. WikiTextrose is a project to analyze relationships between texts and is “inspired by long-established theories in the field of citation analysis”.

In simple terms the Wikicat project is attempting to assemble a bibliographic database [yes another one] of all the bibliographic works cited in Wikimedia pages.

It is going to do this initially by harvesting records via Z39.50 from other catalogues such as the Library of Congress, the National Library of Medicine, and others as they are added to their List of Wikicat OPAC Targets. Then when a citation, that includes a recognizable identifier such as ISBN or LOC number, is included in a page the authoritative bibliographic record can then be used to create a ‘correct’ citation. Eventually the act of citing a previously unknown [to Wikicat] work should automatically help to populate the Wikicat catalogue. – Participative cataloguing without needing to use the word folksonomy!
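To make that flow concrete, here is a rough Python sketch of the idea: recognise an identifier, look up the authoritative record, build the citation, and let a previously unknown work populate the catalogue as a side effect. None of this is Wikicat code. The function names, the record fields, and the `fetch_record` callback (standing in for a Z39.50 lookup) are my own assumptions, and for brevity only 13-digit ISBNs are handled. The check-digit rule itself is the standard one for ISBN-13.

```python
def is_valid_isbn13(isbn):
    """Check the ISBN-13 check digit: digits weighted 1,3,1,3,... must sum to a multiple of 10."""
    digits = [int(ch) for ch in isbn if ch.isdigit()]
    if len(digits) != 13:
        return False
    total = sum(d * (1 if i % 2 == 0 else 3) for i, d in enumerate(digits))
    return total % 10 == 0


def cite(isbn, catalogue, fetch_record):
    """Return a formatted citation for isbn, or None if it cannot be resolved.

    catalogue is a dict acting as the local bibliographic store.
    fetch_record is a hypothetical stand-in for a remote lookup; it takes
    the bare ISBN and returns a dict with 'author', 'year' and 'title', or None.
    """
    key = "".join(ch for ch in isbn if ch.isdigit())
    if not is_valid_isbn13(key):
        return None
    if key not in catalogue:
        record = fetch_record(key)
        if record is None:
            return None
        catalogue[key] = record  # citing an unknown work populates the catalogue
    record = catalogue[key]
    return f"{record['author']} ({record['year']}). {record['title']}. ISBN {key}."
```

The citation format here is arbitrary; the point is that the person citing supplies only the identifier, and everything else comes from the shared record.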

Putting aside the tempting discussion about can a Z39.50 target be truly described as an OPAC, the thing that is different about this cataloguing project is not what they are attempting to achieve but how they are going about it. The Wikicat home page states:

It will be implemented as a Wikidata dataset using a datamodel design based upon IFLA’s Functional Requirements for Bibliographic Records (FRBR) [1], the various ISBD standards, the Library of Congress’s MARC 21 specification, the Anglo-American Cataloguing Rules, The Logical Structure of the Anglo-American Cataloguing Rules, and the International Committee for Documentation (CIDOC)’s Conceptual Reference Model (CRM)[2].

So it isn’t just going to be a database of MARC records then!

Reading more it is clear that once the initial objective of creating an automatic lookup of bibliographic records to create citations has been achieved, this could become a far more general open participative cataloguing project, complete with its own cataloguing rules managed by the WikiProject Librarians.

Because they are starting with FRBR at the core of the project, the authority and granularity of the relationships between bibliographic entities could potentially be of the highest quality. This could lead to many benefits for the bibliographic community, not least a wikiXisbn service [my name] that is ‘better’ than OCLC’s xISBN.

So does the world need yet another cooperative cataloguing initiative? Working for an organisation that has had cooperative cataloguing in its DNA for over thirty-five years, I should be careful how I answer this!

Throwing care to the wind – Yes, when you consider that all the other cooperative cataloguing initiatives [including as of today the one traditionally supported by Talis] are bounded by project, geographical, institutional, political, subject area, commercial, exclusive licensing, or high financial barrier to entry issues. What is refreshing about Wikicat is that, like Wikipedia, the only barrier to entry, both for retrieving and adding data, is Internet connectivity.

Unlike Wikipedia, where some concerns about data quality are overridden by the value of its totally participative nature, the Wikicat team are clearly aware that the value of a bibliographic database is directly connected to the quality, consistency and therefore authority of the data that it holds. For this reason, the establishing of cataloguing rules and training for potential editors, overseen by the WikiProject Librarians, is already well detailed in the project operational stages roadmap.

I will be watching Wikicat with interest to see how it develops.


When is Local Global?

I am currently sat at the back of a hotel conference room in Leeds, where the morning session of one of the Talis Customer Days is in full flow. I am presenting on the future, Web/Library 2.0, and the Talis Platform during the afternoon session. Through the wonders of hotel broadband [it’s a wonder it ever works], bringing a wireless router with me, and Virtual Private Networking [VPN], I’m not only physically in Leeds, I am virtually at the Talis Offices near Birmingham.

This is, frustratingly, very real – I now know via internal email that I have just missed the breakfast van; the sandwich delivery has just taken place at the Talis office, and it’s still two hours to lunch here in Leeds! An example of virtual, local activity in a global context – because of ubiquitous Internet connectivity I could have been equally frustrated by messages about sandwiches anywhere on the planet.

This got me thinking about the comments that have flowed from Paul Miller’s posting from a couple of days ago.

In discussing shared participation Paul commented:

there’s certainly a place for viewing comments by those geographically nearby. There must surely also be value in viewing comments across a community of interest, regardless of space. Yes, there’s already Amazon, but the comments are locked up there. A shareable pool of comments contributed to and consumed by libraries

This drew the, somewhat surprising to me, comment from John Blyberg at Ann Arbor:

In regards to shared participation, yes, I agree with you Paul that building a pool of contributed content could be a powerful and useful addition to any PAC. However, in a community such as Ann Arbor where both Ed and I live, my intuition tells me that we would want to avoid such a clearinghouse and opt for a community-built social software program. The reason is that (as most people in Ann Arbor would agree), our community is very unique and filled to the brim with book-lovers and library-users who could start building a database that belongs solely to our community and reflects the tastes and interests of the community, not the world at large. The main problem with a large shared database is that it is no longer unique and will ultimately align itself with the likes of Amazon.

It is at times like these that you have the realization that your assumptions are not always in line with everyone else’s.

So what are my assumptions then? Well firstly, the contributions of the citizens of Ann Arbor would be of great use, interest, and value to a far wider audience than just their district. Secondly, contributions to any global pool should be tagged as to their source and type. Thirdly, because of that tagging, selection of results should be able to be via many filters such as library, library authority or institution, library type, country, language etc.

So following through those assumptions in John’s situation, I would hope that contributions for my community would add value to the global pot; be displayable locally in isolation as a coherent set; and optionally could be supplemented by those from other appropriate communities around the country and the rest of the world.

To answer my own question in the title of this posting, providing data is tagged as to its source and type, Local is just a filtered view of Global resources so under the hood they can be the same thing.
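For those who like to see an idea in code, here is a minimal Python sketch of 'Local as a filtered view of Global'. It is an illustration of the principle only: the `Comment` fields, the `view` function, and the sample libraries are all invented for the purpose, not part of any existing platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Comment:
    """A contribution tagged with its source and type (field names are illustrative)."""
    item: str
    text: str
    library: str
    library_type: str
    country: str


def view(pool, **filters):
    """Return the comments matching every filter; with no filters, the global view."""
    return [
        comment
        for comment in pool
        if all(getattr(comment, field) == value for field, value in filters.items())
    ]


# Made-up sample data: one global pool, contributed to by several libraries.
pool = [
    Comment("book-1", "A favourite of our reading group", "library-a", "public", "US"),
    Comment("book-1", "Useful for first-year students", "library-b", "academic", "UK"),
    Comment("book-2", "Better than the film", "library-a", "public", "US"),
]

local_view = view(pool, library="library-a")        # one community, in isolation
public_view = view(pool, library_type="public")     # similar libraries anywhere
global_view = view(pool)                            # everything
```

The same pool serves all three views; whether a community sees only its own contributions or chooses to supplement them with those of others becomes a display decision rather than a data decision.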


Catalogue enrichments get used

Dave Pattern at Huddersfield posts some initial usage figures for the various enhancements and enrichments he has added to his local catalogue, including alternate spellings, ‘also borrowed’ functionality, and more.

Although the figures may not be statistically robust, they provide some interesting pointers to the ways in which actual users are beginning to make use of the enhancements being made available to them.

Perhaps unsurprisingly, users would appear to value the added functionality delivered to them when we actually start working with the data we already hold, and I look forward to seeing more libraries following Dave’s example.

I also remain convinced that the biggest benefits will come when we do more to aggregate these data across libraries; ‘also borrowed’ across similar institutions to Huddersfield must surely be more relevant to a borrower than the data drawn from Huddersfield alone, where circulations are of a scale where odd edge effects (I borrow this and that, so when you borrow this you are also recommended that) must be more likely to surface?
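A small Python sketch may help show why pooling matters. It counts how many borrowers took out each pair of titles and only suggests a pairing once enough borrowers support it. This is not Dave's implementation, which I have not seen; the function name, the data layout, and the `min_borrowers` threshold are assumptions for illustration.

```python
from collections import Counter
from itertools import combinations


def also_borrowed(loans_by_library, min_borrowers=2):
    """Build 'also borrowed' suggestions from pooled circulation data.

    loans_by_library has the shape {library: {borrower: set_of_titles}}.
    Returns {title: [(other_title, borrower_count), ...]}, keeping only
    pairings borrowed together by at least min_borrowers people.
    """
    pair_counts = Counter()
    for borrowers in loans_by_library.values():
        for titles in borrowers.values():
            # Count each unordered pair of titles once per borrower.
            for first, second in combinations(sorted(titles), 2):
                pair_counts[(first, second)] += 1

    suggestions = {}
    for (first, second), count in pair_counts.items():
        if count >= min_borrowers:  # the threshold filters out one-off oddities
            suggestions.setdefault(first, []).append((second, count))
            suggestions.setdefault(second, []).append((first, count))

    for title in suggestions:
        # Strongest pairing first, ties broken alphabetically.
        suggestions[title].sort(key=lambda pair: (-pair[1], pair[0]))
    return suggestions
```

With a single small library, few pairings clear the threshold, and lowering it lets one person's odd combination become everybody's recommendation. Pooling several similar institutions gives more pairings enough support to be trusted.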

Do you have any thoughts in this area? Feel free to share them in the TDN.


Web 2.0 components visible in the wild… but hardly mainstream


Richard MacManus recently posted to his Web 2.0 Explorer blog on ZDNet, outlining five features of Web 2.0 that – he suggests – are now so mainstream as to not be that special anymore.

“A lot of the features and functionality of so-called Web 2.0 sites are now common elements in most current web apps and sites. It’s really gone beyond what was labelled ‘Web 2.0’ last year, because so many mainstream websites are now using these elements. It’s no longer a niche trend.”

The features and functions he highlights are:

  • tagging
  • aggregation
  • filters and ranking
  • syndication
  • mash-ups

Each of these is certainly more common than last year, but I’d argue that none of them are yet in mainstream deployment across even a significant minority of the sites that might beneficially use them.

Whilst an increasing number of commentators in this space take such fundamental shifts in approach as syndicating content and services and inviting user/customer/audience participation for granted, the reality on the ground remains very much Web 1.0. Just because the places to which we (most readers of this blog, probably) choose to give our attention are rushing to adopt these models, doesn’t mean that we are that far up the adoption curve yet. For example, Michael Arrington does a great job with TechCrunch, one of those blogs I make a point of reading every day, but the constant stream of innovative new companies discussed on his blog can be misleading. There’s a far larger pool of equally innovative companies, for whom transforming the way in which they invite interaction and custom online is perhaps less of a priority. In their particular industry, that different emphasis may well be appropriate… for now.

There’s a long way to go, and a lot of hard work to be done in evangelising about the visionary companies that Michael tracks, and what the changes they have made could mean to those following along behind. It’s not about copying, but about learning what works, what doesn’t, and how any of it helps you to meet the needs of those seeking to gain value from your offerings. We also need a better understanding of the ways in which existing organisations like Talis are adapting and evolving; you don’t need to be a start-up to be – or to do – Web 2.0.

When you live it every day, Web 2.0 is obvious. When you talk about it every day, Web 2.0 is old news. When you observe those to whom you are talking about it, you realise just how radical some of it is. We could all do, perhaps, with remembering how exciting these ideas were the first time we heard them or thought them.

Take a library example. We’re talking about sweeping aside the financial, technical and procedural barriers that make it so hard for libraries to tell both other libraries and library users about what they hold. We’re talking about making a Platform of data and services available, upon which third parties can orchestrate (possibly a better term than mash-up) new services in a way that fundamentally challenges existing software and data supply models in the sector. Once you’ve heard it, it seems blindingly obvious and eminently desirable. But despite that obviousness, no one else has done it.

Don’t let us permit familiarity to breed contempt.


Tag – you’re it!

The Yahoo! Search Blog has just published a very interesting entry on a new form of searching, which they are calling My Web 2.0. You can read more about it at Search with a little help from your friends.

The premise behind social search is that no matter how powerful the search engine, it can’t contextualise the search results for you personally... yet. So, how can the search engine know that the results it is presenting are going to be relevant to you and your own tastes and preferences?

Yahoo have devised their own solution to this problem. It seems that they want to start harnessing some of the participative energy we are already seeing in the Web 2.0 environment. How about ranking search results based on what your trusted community has saved, tagged and shared?

“Much like links and anchor text enabled major improvements in web search by becoming a new source of authority for search engines, people and trust networks are now an additional source of authority for social search engines. In the same way that blogs and RSS are empowering individuals to participate in publishing, individuals and communities can now participate in search, using tools like My Web2.0 that let them define what is valuable to them and their community”.
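As a thought experiment, ranking by what your trusted community has saved could be as simple as the Python sketch below. I have no knowledge of how Yahoo actually does it; the function, the `boost` value, and the data shapes are all my own assumptions, shown only to make the idea tangible.

```python
def social_rank(results, saved_by, trusted, boost=1.0):
    """Re-rank search results using endorsements from a trusted network.

    results is a list of (url, base_score) pairs from the ordinary engine.
    saved_by maps a url to the set of users who saved or tagged it.
    trusted is the set of users in my own network.
    Returns the urls, best first.
    """
    def score(result):
        url, base_score = result
        # Only endorsements from people I trust count towards the boost.
        endorsements = len(saved_by.get(url, set()) & trusted)
        return base_score + boost * endorsements

    return [url for url, _ in sorted(results, key=score, reverse=True)]
```

With an empty trust network the ordering is exactly what the engine produced; the more my community has saved a page, the higher it climbs for me and for nobody else.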

I like this concept. I was recently having a conversation with some colleagues about book reviews in Amazon. Although Amazon goes to great lengths to offer indications as to how valuable a review might be (ranking the review, ranking the reviewer etc.), I still feel that with something like a review I want to “know” the reviewer in a way that means I trust their opinion.

It’s also quite interesting that Yahoo use an example about Plasma TV screens in their blog. In a conversation just a few days ago, I told my colleagues I was interested in buying a Plasma TV screen, and they then bombarded me with a list of good sites for getting a good deal on Plasmas. In retrospect, it was a conscious decision on my part to qualify my investigation with trusted sources first, before I embarked on a web investigation.

Yahoo want to see My Web 2.0 evolve organically, enabling entire communities of interest to start using their platform to create their own search engines populated with content that has been tagged and shared within the community. I have two questions about this. First, what if you have been merrily tagging your content in del.icio.us? Or Flickr? Are Yahoo and del.icio.us going to create the APIs to enable tags to be imported? In the true spirit of Web 2.0 I would like to see Yahoo Search enable this interoperability.

My other question is about what impact this will have on libraries. The defence put up by libraries concerned about Google et al. has been that library search enables its community to find not just any source but a trusted source. In this context, that generally means a resource that has either been recommended by faculty or by the library, based on the vast amount of knowledge and understanding they have of the information landscape. So does “My Web 2.0” infringe even further on the library space?

Two further observations to make at this stage:
1. Can “My Web 2.0” make tagging the hidden web any easier? Academics currently have a method of tagging the resources they want their community (otherwise known as students in their class) to read. It’s called a reading list, and not surprisingly a lot of the material that gets tagged within a reading list is physical content held by the library. Here Silkworm could really have a role to play: could a tagging tool like del.icio.us or “My Web 2.0” use the Silkworm directory to locate resources that are physical but which have metadata that can describe the resource and point you to the location?
2. Is “My Web 2.0” introducing layers of tagging based on us tagging the tagger? So for instance, you can see the following scenarios evolving:
a) Tags that are made by those people whom I “know”, with whom I have some form of direct relationship. This is where I see “My Web 2.0” being of immediate value. I have a community of interest based on my academic affiliation, or my research group.
b) Tags that are created by those taggers that I “know about”, whose reputation I have come to respect or admire because I am part of that circle of participation. This is very much where both del.icio.us and “My Web 2.0” overlap.
c) Tags that are created by taggers who are an “unknown”, but whom I just come across in a serendipitous fashion, and who may in time become “know about” taggers. This is again where del.icio.us and indeed Technorati become useful.
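
One way to picture these three layers is as weights applied to each tagger's contribution. The Python sketch below is purely my own illustration: the tier names follow the scenarios above, but the weights and the function are invented, and nothing here reflects how any of the services mentioned actually work.

```python
# Illustrative weights only: taggers I "know" count most, "unknown" taggers least.
TIER_WEIGHTS = {"know": 3, "know_about": 2, "unknown": 1}


def weighted_tags(taggings, tiers):
    """Total up tags, weighting each one by my relationship to its tagger.

    taggings is a list of (tagger, tag) pairs.
    tiers maps a tagger to "know" or "know_about"; anyone missing is "unknown".
    Returns [(tag, weight)] with the heaviest tags first.
    """
    totals = {}
    for tagger, tag in taggings:
        weight = TIER_WEIGHTS[tiers.get(tagger, "unknown")]
        totals[tag] = totals.get(tag, 0) + weight
    # Heaviest first, ties broken alphabetically for a stable order.
    return sorted(totals.items(), key=lambda entry: (-entry[1], entry[0]))
```

Moving a tagger from "unknown" to "know about" is then just a change to the `tiers` mapping, which matches the way reputations build up over time.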

To an extent all three scenarios play well in an academic context. However, we have found from our experience with Talis List that some academics are not keen to publish their reading lists to the wider community; they see them as their intellectual property. But if we see reading lists as simply a set of tags, then the “My Web 2.0” application could work very well.