CIL 2010: Library Engagement Through Open Data

Speakers: Oleg Kreymer & Dan Lipcan

Library data is meaningless in and of itself – you need to interpret it to give it meaning. Piotr Adamczyk did much of the work for the presentation, but was not able to attend today due to a schedule conflict.

They created the visual dashboard for many reasons, including a desire to expose the large quantities of data they have collected and stored, but in a way that is interesting and explanatory. It’s also a handy PR tool for promoting the library to benefactors, and to administrators who are often not aware of the details of where and how the library is being effective and the trends in the library. Finally, the data can be targeted to the general public in ways that catch their attention.

The dashboard should also address assessment goals within the library. Data visualization allows us to identify and act upon anomalies. Some visualizations are complex, and you should be sensitive to how you present them.

The ILS is a great source of circulation/collections data. Other statistics can come from the data collected by various library departments, often in spreadsheet format. Google Analytics can capture search terms in catalog searches as well as site traffic data. Download/search statistics from eresources vendors can be massaged and turned into data visualizations.

The free tools they used included IMA Dashboard (local software, Drupal Profile) and IBM Many Eyes and Google Charts (cloud software). The IMA Dashboard takes snapshots of data and publishes it. It’s more of a PR tool.

Many Eyes is a hosted collection of data sets with visualization options. One thing I liked was that they used Google Analytics to gather the search terms used on the website and presented them as a word cloud. You could probably do the same with the titles of the pages in a page hit report.
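A word cloud is essentially a picture of word counts, so the data preparation behind one can be sketched in a few lines. This is a minimal illustration, not the speakers' actual process: the sample queries, the stopword list, and the function name are all made up for the example, and a real list of search terms would come from an analytics export.

```python
from collections import Counter
import re

# Hypothetical search queries, standing in for an analytics export.
SAMPLE_QUERIES = [
    "impressionism paintings",
    "Monet water lilies",
    "monet exhibition catalog",
    "museum library hours",
    "Impressionism catalog",
]

# A tiny illustrative stopword list; a real one would be much longer.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "for"}


def term_frequencies(queries):
    """Count how often each word appears across all queries, ignoring case."""
    counts = Counter()
    for query in queries:
        for word in re.findall(r"[a-z0-9']+", query.lower()):
            if word not in STOPWORDS:
                counts[word] += 1
    return counts


if __name__ == "__main__":
    # The most frequent words are the ones a word cloud would draw largest.
    for word, count in term_frequencies(SAMPLE_QUERIES).most_common(5):
        print(word, count)
```

The resulting word and count pairs are the kind of table a word cloud tool would take as input. The same function would work on page titles from a page hit report.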

Google Chart Tools are visualizations created by Google and others, and they use Google Spreadsheets to store and retrieve the data. The motion charts are great for showing data moving over time.

Lessons learned… Get administrative support. Identify your target audience(s). Identify the stories you want to tell. Be prepared for spending a lot of time manipulating the data (make sure it’s worth the time). Use a shared repository for the data documents. Pull from data your colleagues are already harvesting. Try, try, and try again.

CiL 2008: Speed Searching

Speaker: Greg Notess

His talk summarizes points from his Computers in Libraries articles on the same topic, so go find them if you want more details than what I provide.

It takes time to find the right query/database, and to determine the best terminology to use in order to find what you are seeking. Keystroke economy makes searching faster, like the old OCLC FirstSearch 3-2-2-1 searching. Web searching relevancy is optimized by using only a few unique words rather than long queries. Do spell checking through a web search and then take that back into a reference database. Search suggestions on major search engines help with the spelling problem, and the suggestions are ranked based on the frequency with which they are searched, but they require you to type slowly to use them effectively and increase your search speed. Copy and paste can be enhanced through browser plugins or bookmarklets that allow for searching based on selected text.
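For anyone who never used that kind of keystroke-saving search, here is a rough sketch of how a 3-2-2-1 key could be built from a title. It assumes the pattern means the first three letters of the first word, two each of the second and third, and one of the fourth, separated by commas. The function name is hypothetical, and the real rules (for skipping leading articles, handling punctuation, and so on) are not reproduced here.

```python
import re


def derived_key(title, pattern=(3, 2, 2, 1)):
    """Build a comma-separated search key from the first words of a title.

    Assumption: each number in `pattern` is how many leading characters
    to keep from the word in that position. Real systems apply extra
    rules (such as dropping initial articles) that this sketch ignores.
    """
    words = re.findall(r"[A-Za-z0-9]+", title.lower())
    parts = [word[:length] for word, length in zip(words, pattern)]
    # Pad with empty segments when the title has fewer words than the pattern.
    parts += [""] * (len(pattern) - len(parts))
    return ",".join(parts)


if __name__ == "__main__":
    print(derived_key("Gone with the Wind"))  # gon,wi,th,w
    print(derived_key("Hamlet"))              # ham,,,
```

Eight letters and three commas stand in for a whole title, which is the keystroke economy at work.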

The search terms matter. Depending on the source, queries of average length that use unique terms perform better than common search terms or long queries. Use multiple databases because it’s fun, you’re a librarian, and there is a lack of overlap between data sources.

Search switching is not good for quick look-ups, but it can be helpful with hard-to-find answers that require in-depth querying. We have a sense that federated searching should be able to do this, but some resources are better searched in their native interfaces in order to find relevant sources. There are several sites that make it easy to switch between web search engines using the same query, including a nifty site that will allow you to easily switch between the various satellite mapping sources for any location you choose.

I must install the Customize Google Firefox plugin. (It’s also available for IE7, but why would you want to use IE7, anyway?)

CiL 2008: Catalog Effectiveness

Speaker: Rebekah Kilzer

The Ohio State University Libraries have used Google Analytics for assessing the use of the OPAC. It’s free for sites up to five million page views per month — OSU has 1-2 million page views per month. Libraries would want to use this because most integrated library systems offer little in the way of use statistics, and what they do have isn’t very… useful. You will need to add some code that will display on all OPAC pages.

Getting details about how users interact with your catalog can help with making decisions about enhancements. For example, knowing how many dial-up users interact with the site could determine whether or not you want to develop style sheets specifically for them. You can also track what links are being followed, which can contribute to discussions on link real estate.

There are several libraries that are mashing up Google Analytics information with other Google tools.


Speakers: Cathy Weng and Jia Mi

The OPAC is a data-centered, card-catalog retrieval system that is good for finding known items, but not so good as an information discovery tool. It’s designed for librarians, not users. Librarians’ perceptions of users (forgetful, impatient) prevent librarians from recognizing changes in user behavior and ineffective OPAC design.

In order to see how current academic libraries represent and use OPAC systems, they studied 123 ARL libraries’ public interfaces and search capabilities as well as their bibliographic displays. In the search options, two-thirds of libraries use “keyword” as the default and the other third use “title.” The study also looked at whether or not the keyword search was a true keyword search with an automatic “and” or if the search was treated as a phrase. Few libraries used relevancy ranking as the default search results sorting.
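The difference between a keyword search with an automatic “and” and a search treated as a phrase is easier to see with a toy example. This is only an illustration of the two behaviors, not how any particular OPAC implements them, and the function names are made up for the sketch.

```python
import re


def tokenize(text):
    """Split text into lowercase words, dropping punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())


def matches_all_words(query, title):
    """Keyword search with an implied AND: every query word must appear
    somewhere in the title, in any order."""
    query_words = tokenize(query)
    if not query_words:
        return False
    title_words = set(tokenize(title))
    return all(word in title_words for word in query_words)


def matches_phrase(query, title):
    """Phrase search: the query words must appear next to each other,
    in the same order, in the title."""
    query_words = tokenize(query)
    title_words = tokenize(title)
    if not query_words:
        return False
    size = len(query_words)
    return any(
        title_words[start:start + size] == query_words
        for start in range(len(title_words) - size + 1)
    )


if __name__ == "__main__":
    title = "The Art of History"
    print(matches_all_words("history art", title))  # True
    print(matches_phrase("history art", title))     # False
    print(matches_phrase("art of history", title))  # True
```

A user who types two words in the "wrong" order gets results from the first behavior and nothing from the second, which is why it matters which one a catalog uses for its default search.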

There are some great disparities in OPAC quality. Search terms and search boxes are not retained on the results page, post-search limit functions are not always readily available, item status is not available on the search results page, and the search keywords are not highlighted. These are things that the most popular non-library search engines do, which is what our users expect the library OPAC to do.

Display labels are mapped from MARC fields and are not intuitive. Some labels are suitable for certain types of materials but not all (proper name labels for items that are “authored” by conferences). They are potentially confusing (LCSH & MeSH) and occasionally inaccurate. The study found that there were varying levels of effort put into making the labels more user-friendly and not full of library jargon.

In addition to label displays, OPACs also suffer from the way the records are displayed. The order of bibliographic elements affects how users find relevant information to determine whether or not the item found is what they need.

There are three factors that contribute to the problem of the OPAC: system limitations, libraries not exploiting the full functionality of the ILS, and MARC standards that are not well suited to online bibliographic display. We want a system that doesn’t need to be taught, that trusts users as co-developers, and we want to maximize and creatively utilize the system’s functionality.

The presentation gave great examples of why the OPAC sucks, but few concrete examples of solutions beyond the lipstick-on-a-pig catalog overlay products. I would have liked to have a list of suggestions for label names, record display, etc., since we were given examples of what doesn’t work or is confusing.

what’s wrong with a little enthusiasm?

Rory Litwin thinks blogs are over-rated.

Rory Litwin has some pretty harsh words about librarians who are still excited about the web and new web-related technologies in the latest issue of Library Juice. I’m beginning to suspect that he likes picking virtual fights.

“As an example I would like to cite the blogging craze – and it is a craze in its current form – because so many people, librarians included, have started their own blogs for no discernible reason and through blogs have renewed their irrational excitement about the Web in general.”

This statement might very well apply to my blog, since I don’t have any particular focus other than my own interests. Possibly, my comments would be better served in the form of a private off-line journal, or as email messages sent to certain friends. However, in the past year I have approached my blog with the mentality of being a part of a wider community of my peers, much like the way other scholarly communication has been done for centuries. I don’t think I’ve gotten to the point where my little essays and opinions will be quoted and passed around, but I’m working my way there. I see this as a tool to contribute to the wider conversation in the profession.

There are other blogs that are more focused and in many ways are the best supplements to officially recognized professional literature that I have found. Jessamyn West and the LISNews collaborative blog are my two main sources of recent news about library-related issues. I’m finding out about things well before they show up in any of the traditionally recognized mediums. Jenny Levine and Sarah Houghton keep me up to speed on the latest technology that may impact my work. Half the stuff they write about will likely never show up in the professional literature, even if it should.

There are other blogs out there that are less insightful or informative than those I mentioned above. In fact, as was the case when personal web pages were the new fad, there are quite a few blogs out there that are little more than public diaries. However, I think that Litwin is throwing the baby out with the bath water when he chastises librarians for their excitement about the blog medium.

“Many people are now using the blog format where a chronological organization is not appropriate to the content they are putting up, for no other reason than that blogs are hot and there are services supporting them. This is irrational. I feel that librarians should be a little more mature and less inclined to fall for Internet crazes like this. That is not to say that a blog is never a useful thing, only that blogs – as everything on the web – should be seen for what they are and not in terms of a pre-existing enthusiasm.”

As with any new toy, eventually the shine will wear off and those folks will realize that the blog medium, regardless of its simplicity or fashion, does not fit their needs. Since Litwin does not provide specific examples of these inappropriate uses of blogs, I cannot address them. My experience with librarian blogs has been such that the chronological format works well. There is only one instance that I know of in which the blog format may not fit. The reference team at my library has replaced their frequently asked questions notebook and miscellaneous announcements notes with a Blogger weblog. The advantage of this format is that the contents are easily searchable. The disadvantage is that several workarounds have been used to organize the entries. I suspect that what they really need is a blog for the announcement bits and a separate wiki for the “this is a good resource for (fill in the blank)” type entries. I am confident that eventually they will move on to some other format that better serves their needs, and in the meantime, they will have become familiar with yet another piece of modern technology.

Quite a few of the new blogs that are created daily by librarians never make it out of their infancy. For the most part, their authors are too busy or uninterested or have nothing to write about. Still, I think it’s important for librarians to try new things, and if blogs are the latest internet fad, then at least librarians should play with them long enough to evaluate them. My first blog was called “because everyone else is doing it” and was basically a public forum for occasional rants, links, commentary, and some library-related information. It was a good experiment, and as I became more familiar with the tools, I began to see other uses for blogs. The chronological format works well for my radio playlists.

Blogs introduced me to RSS feeds, and from there I have been thinking of several different ways librarians could use RSS. It even instilled a desire to learn Perl and PHP so that I could know enough coding to hack a feed of our new acquisitions as they are added to the collection. If we’re going to put up new book lists, then why not also make a feed for them? The University of Louisville Library not only provides RSS feeds for their new books, they also have subject-specific feeds. Soon it may be possible to create feeds from saved searches in the catalog, much like what some online news sources provide. Those feeds would be even more specific and would alert faculty, graduate students, or anyone else interested, when new items are cataloged that fit the search terms. I digress.
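To give a sense of how little is involved in a new acquisitions feed, here is a minimal sketch that turns a list of new titles into an RSS 2.0 document. It is written in Python rather than the Perl or PHP mentioned above, purely for illustration. The book records, the catalog.example.edu URLs, and the function name are all invented; a real version would pull the records from the ILS.

```python
import xml.etree.ElementTree as ET

# Hypothetical new-acquisitions data; a real list would come from the ILS.
NEW_BOOKS = [
    {
        "title": "An Example Book on Cataloging",
        "link": "https://catalog.example.edu/record/1001",
        "description": "Added to the general collection.",
    },
    {
        "title": "Another Example Book",
        "link": "https://catalog.example.edu/record/1002",
        "description": "Added to the reference collection.",
    },
]


def build_feed(books, feed_title, feed_link, feed_description):
    """Return an RSS 2.0 document (as a string) with one item per book."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 requires a title, link, and description for the channel.
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = feed_link
    ET.SubElement(channel, "description").text = feed_description
    for book in books:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = book["title"]
        ET.SubElement(item, "link").text = book["link"]
        ET.SubElement(item, "description").text = book["description"]
    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    print(build_feed(
        NEW_BOOKS,
        "New Books",
        "https://catalog.example.edu/new",
        "Recently cataloged items",
    ))
```

A subject-specific feed would be the same function run over a filtered list of books, and a feed for a saved search would filter by the search terms instead.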

All this is to say that weblogs are useful, and that librarians should be savvy enough to know when and where to make use of them. Not all of us are permanently dazzled by new shiny toys.

I look forward to reading responses to Litwin’s essay in the librarian blogosphere.