The Bigger Picture: Creating a Statistics Dashboard That Ties Collection Building to Research
Speaker: Shannon Tharp, University of Wyoming
How can they tie the collection building efforts with the university’s research output? Need to articulate value to the stakeholders and advocate for budget increases.
She used Tableau to develop the dashboard and visualizations. She started with a broad overview of collections and has expanded from there. The visualizations include a narrative and an intuitive interface to access more information.
The dashboard also includes qualitative interviews of faculty and research staff. They are tentatively calling this “faculty talk” and plan to have it up soon, with rotating interviews displaying. They are thinking about including graduate and undergraduate student interviews as well.
(e)Book Snapshot: Print and eBook Use in an Academic Library Consortium
Speaker: Joanna Voss, OhioLINK
What can we do to continue to meet the needs of students and faculty through the print to electronic book transition? Are there any patterns or trends in their use that will help? Anecdotally we hear about users preferring print to electronic. How do we find data to support this and to help them?
They cleaned up the data using Excel and OpenRefine, and then used Tableau for the analysis and visualization. OpenRefine is good for really messy data.
A Brief History of PaperStats
Speaker: Whitney Bates-Gomez, Memorial Sloan Kettering Cancer Center
Web-based tool for generating cost-per-use reports. It’s currently in beta and only working with JR1 reports. It works most of the time for COUNTER and SUSHI reports, but not always. The costs function requires you to upload the costs in a CSV format, and they were able to get that data from their subscription agent.
But, too bad for you, it’s going away at the end of the spring, though there might be a revised version out there some day. It’s offered through PubGet, and Copyright Clearance Center decided not to renew their support.
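For anyone who wants to approximate this by hand once the tool is gone, here is a minimal sketch of a cost-per-use calculation. It is not how PaperStats works internally (that wasn't covered); the column names (title, cost, total_uses), the function name, and the sample figures are all my own assumptions for illustration. A real JR1 report would need to be trimmed down to a title and a total first.

```python
import csv
import io


def cost_per_use(costs_csv, usage_csv):
    """Return {title: cost / total_uses} for every title in the costs file.

    Both arguments are CSV text. The column names are hypothetical:
    'title' and 'cost' in the costs file, 'title' and 'total_uses'
    in the usage file.
    """
    costs = {row["title"]: float(row["cost"])
             for row in csv.DictReader(io.StringIO(costs_csv))}
    uses = {row["title"]: int(row["total_uses"])
            for row in csv.DictReader(io.StringIO(usage_csv))}

    report = {}
    for title, cost in costs.items():
        total = uses.get(title, 0)
        # A title with no recorded use has no meaningful cost per use,
        # so report None rather than dividing by zero.
        report[title] = round(cost / total, 2) if total else None
    return report


# Made-up sample data, not taken from any real report.
COSTS = "title,cost\nJournal A,1200.00\nJournal B,450.00\n"
USAGE = "title,total_uses\nJournal A,300\nJournal B,0\n"

report = cost_per_use(COSTS, USAGE)
print(report)
```

Titles that come back as None are the ones to look at first, since they cost money and show no recorded use.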
Speakers: Michael Levine-Clark (University of Denver) & Kari Paulson (ProQuest eBrary/EBL)
ProQuest is looking at usage data across the eBrary and EBL platforms as they are working to merge them together. To help interpret the data, they asked Levine-Clark to look at it as well. This is more of a proof-of-concept than a final conclusion.
They looked at 750,000 ebooks initially, narrowing it down for some aspects. He asked several questions, from the importance of quality to disciplinary preferences to best practices for measuring use, and various tangential questions related to these.
They looked at eBrary data from 2010-2013Q3 and EBL data from 2011-2013Q3. They used only the titles with an LC call number, and ran a separate analysis of the titles that come from university presses specifically.
Usage was defined in three ways: sessions, views (count of page views), and downloads (entire book). Due to the variations in the data sets (number of years, number of customers, platforms), they could not easily compare the usage information between eBrary and EBL.
Do higher quality ebooks get used more? He used university press books as a measure of quality, though he recognizes this is not the best measure. For titles with at least one session, he found that the rate of use was fairly comparable, but slightly higher for university press books. The session counts and page views in eBrary were significantly higher for UP books, but not as much with EBL. In fact, use was consistently higher for UP books across the categories, but this may be because there are more UP books selected by libraries, thus increasing their availability.
What does usage look like across broad disciplines? Humanities, Social Sciences, and STEM were broken out and grouped by their call number ranges. He excluded A & Z (general) as well as G (too interdisciplinary). The social sciences were the highest in sessions and views on eBrary, but humanities win the downloads. For EBL, the social sciences win all categories. When he looked at actions per session, STEM had higher views, but all downloaded at about the same rate on both platforms.
How do you measure predicted use? He used the percentage of books in an LC class relative to the total books available. If a class’s percentage of a use metric is lower than its percentage of the available books, then it is not meeting expected use, and vice versa. H, L, G, N, and D were all better than expected. Q, F, P, K and U were worse than expected.
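As I understood it, the comparison boils down to share of use versus share of titles. Here is a rough sketch under that assumption; the function name and the counts are made up for illustration and are not from the slides. A ratio above 1 means a class is used more than its share of the collection would predict, and below 1 means less.

```python
def expected_use_ratio(titles_by_class, use_by_class):
    """Compare each LC class's share of use with its share of titles.

    Returns {lc_class: share_of_use / share_of_titles}.
    A value above 1.0 is better than expected, below 1.0 is worse.
    """
    total_titles = sum(titles_by_class.values())
    total_use = sum(use_by_class.values())

    ratios = {}
    for lc_class, titles in titles_by_class.items():
        share_of_titles = titles / total_titles
        share_of_use = use_by_class.get(lc_class, 0) / total_use
        ratios[lc_class] = share_of_use / share_of_titles
    return ratios


# Invented counts for three classes; only the arithmetic matters here.
titles = {"H": 200, "Q": 300, "P": 500}
sessions = {"H": 400, "Q": 200, "P": 400}

ratios = expected_use_ratio(titles, sessions)
for lc_class, ratio in sorted(ratios.items()):
    label = "better" if ratio > 1 else "worse"
    print(f"{lc_class}: {ratio:.2f} ({label} than expected)")
```

The same function works for any of the three use metrics (sessions, views, downloads) by swapping in a different second argument.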
How about breadth versus depth? This gets complicated. Better to find the slides and look at the graphs. The results map well to the predicted use outcomes.
Can we determine the level of immersion in a book? If more pages are viewed per session in a subject area, does that mean the users spend more time reading or just look at more pages? Medicine (R), History of the Americas (F), and Technology (T) appear to be used at a much higher rate within a session than other areas, despite performing poorly in breadth versus depth assessment. In other words, they may not be used much per title, but each session is longer and involves more actions than others.
How do we use these observations to build better collections and better serve our users?
Books with call numbers tend to be used more than those without. Is it because a call number is indicative of better metadata? Is it because publishers of better quality will provide better metadata? It’s hard to tell at this point, but it’s something he wants to look into.
A white paper is coming soon and will include a combined data set. It will also include the EBL data about how long someone was in a book in a session. Going forward, he will also look into LC subclasses.
I need to find a happy medium between self-paced instruction and structured instruction.
I signed up for a Coursera class on statistics for social science researchers because I wanted to learn how to better make use of library data and also how to use the open source program for statistical computing, R. The course information indicated I’d need to plan for 4-6 hours per week, which seemed doable, until I got into it.
The course consists of several lecture videos, most of which include a short “did you get the main concepts” multiple-choice quiz at the end. Each week there is an assignment and graded quiz, and of course a midterm and final.
It didn’t help that I started off behind, getting through only a lecture or two before the end of the first week, and missing the deadline for having the first assignment and quiz graded. I scrambled to catch up the second week, but once again couldn’t make it through the lectures in time.
That’s when I realized that it was going to take much longer than projected to keep up with this course. A 20-30 min lecture would take me 45-60 min to get through because I was constantly having to pause and write notes before the lecturer went on to the next concept. And since I was using Microsoft OneNote to keep and organize my notes, anything that involved a formula took longer to copy down.
By the end of the third week, I was still a few lectures away from finishing the second week, and I could see that it would take more time than I had to keep going, but I decided to go another week and do what I could.
That was this week, and I haven’t had time to make any more progress than where I was last week. With no prospect of catching up before the midterm deadline, I decided to withdraw from the course.
This makes me disappointed both in myself and in the structure of the course. I hate quitting, and I really want to learn the stuff. But, as I fell further and further behind, it became easier to put it off and focus on other overdue items on my task list, thus compounding the problem.
The instructor for the course was easy to follow, and I like his lecture style, but when it came time to do the graded quiz and assignment, I realized I clearly had not understood everything, or he expected me to have more of a background in the field than a novice. It also seemed like the content was geared towards a 12-week course, and with this being only 8 weeks, he was cramming it all in rather than reducing the content accordingly.
Having deadlines was a great motivation to keep up with the course, which I haven’t had when I’ve tried to learn on my own. It was the volume of content to absorb between those deadlines that tripped me up. I need to find a happy medium between self-paced instruction and structured instruction.
My day began with organizing and prioritizing the action items that arrived yesterday when I was swamped with web-scale discovery service presentations. I didn’t get very far when it was time to leave for a meeting about rolling out VuFind locally. Before that meeting, I dropped in to update my boss (and interim University Librarian) on some things that came out of the presentations and subsequent hallway discussions.
At the VuFind meeting, we discussed some tweaks and modifications, and most everyone took on some assignments to revise menu labels, record displays, and search options. I managed to evade an assignment only because these things are more for reference, cataloging, and web services. The serials records look fine and appear accurately in the basic search (from the handful of tests I ran), so I’m not concerned about tweaking anything specifically.
Back at my desk, I started to work on the action items again, but the ongoing conversations about the discovery service presentations distracted me until one of the reference librarians provided me with a clue about the odd COUNTER use stats we’ve received from ProQuest for 2011.
I had given her stats on a resource that was on the CSA platform, but for the 2011 stats I provided what ProQuest gave me, which were dubious in their sudden increase (from 15 in 2010 to 4756 in 2011). She made a comment about how the low stats didn’t surprise her because she hates teaching the Illumina platform. I said it should be on the ProQuest platform now because that’s where the stats came from. She said she’d just checked the links on our website, and they’re still going to Illumina.
This puzzled me, so I pulled the CSA stats from 2011, and indeed, we had only 17 searches for the year for this index. I checked the website and LibGuides links, and we’re still sending users to the Illumina platform, and not ProQuest. So, I’m not sure where those 4756 searches were coming from, but their source might explain why our total ProQuest stats tripled in 2011. This led me to check our federated search stats, and while they show quite a few searches of ProQuest databases (although not this index, as we hadn’t included it), our DB1 report shows zero federated searches and sessions.
I compiled all of this and sent it off to ProQuest customer support. I’m eager to see what their response will be.
This brought me up to my lunch break, which I spent at the gym where one of the trainers forced my compatriots and me to accomplish challenging and strenuous activities for 45 min. After my shower, I returned to the library to lunch at my desk and respond to some crowd-sourced questions from colleagues at other institutions.
I managed to whack down a few email action items before my ER&L co-presenter called to discuss the things we need to do to make sure we’re prepared for the panel session. We’re pulling together seasoned librarians and product representatives from five different electronic resource management systems (four commercial, one open-source) to talk about their experiences working with the products. We hashed out a few things that needed hashing out, and ended the call with more action items on our respective lists.
At that point, I had about 20 min until my next meeting, so I tracked down the head of research and instruction to hash out some details regarding the discovery service presentations that I wanted to make sure she was aware of. I’m glad I did, because she filled in some gaps I had missed, and later she relayed a positive response from one of the librarians that concerned both of us.
The meeting ended early, so I took the opportunity of suddenly unscheduled time in my calendar to start writing down this whole thing. I’d been so busy I hadn’t had time to journal this throughout the day like I’d previously done.
Heard back from ProQuest, and although they haven’t addressed the missing federated search stats from their DB1 report, they explain away the high number of searches in this index as having come from a subject area search or the default search across all databases. There was (and may still be) a problem with defaulting to all databases if the user did not log out before starting a new session, regardless of which database they intended to use. PQ tech support suggested looking at their non-COUNTER report that includes full-text, citation, and abstract views for a more accurate picture of what was used.
For the last stretch of the day, I popped on my headphones, cranked up the progressive house, and tried to power through the rest of the email action items. I didn’t get very far, as the first one required tracking down use stats and generating a report for an upcoming renewal. Eventually, I called it a day and posted this. Yay!
The Library Journal periodicals price survey was developed in partnership with EBSCO when the ALA pulled the old column to publish in American Libraries. There is a similar price survey being done by the AALL for law publications.
There is a difference between a price survey and a price index. A price survey is a broad look, and a price index attempts to control the categories/titles included.
[The next bit was all about the methodology behind making the LJ survey. Not why I am interested, so not really taking notes on it.]
Because of the challenge of getting pricing for ejournals, the survey is based mainly on print prices. That being said, the trends in pricing for print are similar to those for electronic.
Knowing the trends for pricing in your specific set of journals can help you predict what you need to budget for. While there are averages across the industry, they may not be accurate depending on the mix in your collection. [I am thinking that this means that the surveys and indexes are useful for broad picture looks at the industry, but maybe not for local budget planning?]
It is important to understand what goes into a pricing tool and how it resembles or departs from local conditions in order to pick the right one to use.
Budgets for libraries and higher education are not in “recovery.” While inflation calmed down last year, prices are on the rise this year, with an estimated increase of 7-8%. The impact may be larger than at the peak of the serials pricing crisis in the 1990s. Libraries will have less buying power, users will have fewer resources, and publishers will have fewer customers.
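To put that estimate in local terms, here is a back-of-the-envelope sketch. The 7.5% rate is just the midpoint of the 7-8% estimate, and the flat $100,000 serials budget and the function names are my own assumptions for illustration. As noted above, the rate for your own mix of journals may differ from the industry average.

```python
def budget_needed(current_spend, inflation_rate, years=1):
    """Spend required to keep the same subscriptions after compounding increases."""
    return current_spend * (1 + inflation_rate) ** years


def buying_power(flat_budget, inflation_rate, years=1):
    """What a flat budget is worth in today's prices after compounding increases."""
    return flat_budget / (1 + inflation_rate) ** years


# Assumed figures: a $100,000 serials budget and a 7.5% annual increase.
spend = 100_000
rate = 0.075

print(round(budget_needed(spend, rate), 2))   # 107500.0
print(round(buying_power(spend, rate), 2))    # 93023.26
```

In other words, a flat budget at that rate covers roughly 93% of what it did the year before, and the gap compounds each year the budget stays flat.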
Why is the inflation rate for serials so much higher than the consumer price index inflation rate? There has been an expansion of higher education, which adds to the amount of stuff being published. The rates of return for publishers are pretty much normal for their industry. There isn’t any one reason why.