#libday8 day 4 — lies, damn lies, and statistics

How to Lie with Statistics cover
How to Lie with Statistics by Darrell Huff & Irving Geis

My day began with organizing and prioritizing the action items that arrived yesterday when I was swamped with web-scale discovery service presentations. I didn’t get very far when it was time to leave for a meeting about rolling out VuFind locally. Before that meeting, I dropped in to update my boss (and interim University Librarian) on some things that came out of the presentations and subsequent hallway discussions.

At the VuFind meeting, we discussed some tweaks and modifications, and most everyone took on some assignments to revise menu labels, record displays, and search options. I managed to evade an assignment only because these things are more for reference, cataloging, and web services. The serials records look fine and appear accurately in the basic search (from the handful of tests I ran), so I’m not concerned about tweaking anything specifically.

Back at my desk, I started to work on the action items again, but the ongoing conversations about the discovery service presentations distracted me until one of the reference librarians provided me with a clue about the odd COUNTER use stats we’ve received from ProQuest for 2011.

I had given her stats on a resource that was on the CSA platform, but for the 2011 stats I provided what ProQuest gave me, which were dubious in their sudden increase (from 15 in 2010 to 4756 in 2011). She made a comment about how the low stats didn’t surprise her because she hates teaching the Illumina platform. I said it should be on the ProQuest platform now because that’s where the stats came from. She said she’d just checked the links on our website, and they’re still going to Illumina.

This puzzled me, so I pulled the CSA stats from 2011, and indeed, we had only 17 searches for the year for this index. I checked the website and LibGuides links, and we’re still sending users to the Illumnia platform, and not ProQuest. So, I’m not sure where those 4756 searches were coming from, but their source might explain why our total ProQuest stats tripled in 2011. This lead me to check our federated search stats, and while it shows quite a few searches of ProQuest databases (although not this index, as we hadn’t included it), our DB1 report shows zero federated searches and sessions.

I compiled all of this and sent it off to ProQuest customer support. I’m eager to see what their response will be.

This brought me up to my lunch break, which I spent at the gym where one of the trainers forced my compatriots and I to accomplish challenging and strenuous activities for 45 min. After my shower, I returned to the library to lunch at my desk and respond to some crowd-sourced questions from colleagues at other institutions.

I managed to whack down a few email action items before my ER&L co-presenter called to discuss the things we need to do to make sure we’re prepared for the panel session. We’re pulling together seasoned librarians and product representatives from five different electronic resource management systems (four commercial, one open-source) to talk about their experiences working with the products. We hashed out a few things that needed hashing out, and ended the call with more action items on our respective lists.

At that point, I had about 20 min until my next meeting, so I tracked down the head of research and instruction to hash out some details regarding the discovery service presentations that I wanted to make sure she was aware of. I’m glad I did, because she filled in some gaps I had missed, and later she relayed a positive response from one of the librarians that concerned both of us.

The meeting ended early, so I took the opportunity of suddenly unscheduled time in my calendar to start writing down this whole thing. I’d been so busy I hadn’t had time to journal this throughout the day like I’d previously done.

Heard back from ProQuest, and although they haven’t addressed the missing federated search stats from their DB1 report, they explain away the high number of searches in this index as having come from a subject area search or the default search across all databases. There was (and may still be) a problem with defaulting to all databases if the user did not log out before starting a new session, regardless of which database they intended to use. PQ tech support suggested looking at their non-COUNTER report that includes full-text, citation, and abstract views for a more accurate picture of what was used.

For the last stretch of the day, I popped on my headphones, cranked up the progressive house, and tried to power through the rest of the email action items. I didn’t get very far, as the first one required tracking down use stats and generating a report for an upcoming renewal. Eventually, I called it a day and posted this. Yay!

CiL 2008: Catalog Effectiveness

Speaker: Rebekah Kilzer

The Ohio State University Libraries have used Google Analytics for assessing the use of the OPAC. It’s free for sites up to five million page views per month — OSU has 1-2 million page views per month. Libraries would want to use this because most integrated library systems offer little in the way of use statistics, and what they do have isn’t very… useful. You will need to add some code that will display on all OPAC pages.

Getting details about how users interact with your catalog can help with making decisions about enhancements. For example, knowing how many dial-up users interact with the site could determine whether or not you want to develop style sheets specifically for them, for example. You can also track what links are being followed, which can contribute to discussions on link real estate.

There are several libraries that are mashing up Google Analytics information with other Google tools.


Speakers: Cathy Weng and Jia Mi

The OPAC is a data-centered, card-catalog retrieval system that is good for finding known items, but not so good as an information discovery tool. It’s designed for librarians, not users. Librarian’s perceptions of users (forgetful, impatient) prevents librarians from recognizing changes in user behavior and ineffective OPAC design.

In order to see how current academic libraries represent and use OPAC systems, they studied 123 ARL libraries’ public interfaces and search capabilities as well as their bibliographic displays. In the search options, two-thirds of libraries use “keyword” as the default and the other third use “title.” The study also looked at whether or not the keyword search was a true keyword search with an automatic “and” or if the search was treated as a phrase. Few libraries used relevancy ranking as the default search results sorting.

There are some great disparities in OPAC quality. Search terms and search boxes are not retained on the results page, post-search limit functions are not always readily available, item status are not available on search results page, and the search keywords are not highlighted. These are things that the most popular non-library search engines do, which is what our users expect the library OPAC to do.

Display labels are MARC mapping, not intuitive. Some labels are suitable for certain types of materials but not all (proper name labels for items that are “authored” by conferences). They are potentially confusing (LCSH & MeSH) and occasionally inaccurate. The study found that there were varying levels of effort put to making the labels more user-friendly and not full of library jargon.

In addition to label displays, OPACs also suffer from the way the records are displayed. The order of bibliographic elements effect how users find relevant information to determine whether or not the item found is what they need.

There are three factors that contribute to the problem of the OPAC: system limitations, libraries not exploiting full functionality of ILS, and MARC standards are not well suited to online bibliographic display. We want a system that doesn’t need to be taught, that trusts users as co-developers, and we want to maximize and creatively utilize the system’s functionality.

The presentation gave great examples of why the OPAC sucks, but few concrete examples of solutions beyond the lipstick-on-a-pig catalog overlay products. I would have liked to have a list of suggestions for label names, record display, etc., since we were given examples of what doesn’t work or is confusing.

css.php