Speakers: Michael Levine-Clark (University of Denver) & Kari Paulson (ProQuest eBrary/EBL)
ProQuest is looking at usage data across the eBrary and EBL platforms as it works to merge them. To help interpret the data, they asked Levine-Clark to look at it as well. This is more of a proof-of-concept than a final conclusion.
They looked at 750,000 ebooks initially, narrowing it down for some aspects. He asked several questions, from the importance of quality to disciplinary preferences to best practices for measuring use, and various tangential questions related to these.
They looked at eBrary data from 2010-2013Q3 and EBL data from 2011-2013Q3. They used only the titles with an LC call number, and ran a separate analysis of those titles that come from university presses specifically.
Usage was defined in three ways: sessions, views (count of page views), and downloads (entire book). Due to the variations in the data sets (number of years, number of customers, platforms), they could not easily compare the usage information between eBrary and EBL.
Do higher quality ebooks get used more? He used university press books as a measure of quality, though he recognizes this is not the best measure. For titles with at least one session, he found that the rate of use was fairly comparable, but slightly higher for university press books. The session counts and page views in eBrary were significantly higher for UP books, but the difference was smaller in EBL. In fact, use was consistently higher for UP books across the categories, but this may be because libraries select more UP books, thus increasing their availability.
What does usage look like across broad disciplines? Humanities, Social Sciences, and STEM were broken out and grouped by their call number ranges. He left A & Z (general) as well as G (too interdisciplinary) out of the equation. The social sciences were the highest in sessions and views on eBrary, but humanities win the downloads. For EBL, the social sciences win all categories. When he looked at actions per session, STEM had higher views, but all three groups downloaded at about the same rate on both platforms.
How do you measure predicted use? He used the percentage of books in an LC class relative to the total books available as the expected share of use. If the class's percentage of a use metric is lower than that, it is not meeting expected use, and vice versa. H, L, G, N, and D were all better than expected. Q, F, P, K, and U were worse than expected.
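To make the predicted-use comparison concrete, here is a minimal Python sketch of how I understand it. The function name and the numbers are made up for illustration; they are not from the talk or from the ProQuest data. It compares each LC class's share of titles with its share of a use metric (sessions here) and reports the difference.

```python
def use_vs_expected(titles_by_class, use_by_class):
    """Return (share of use) minus (share of titles) for each LC class.

    A positive value means the class is used more than its share of the
    collection would predict; a negative value means less.
    """
    total_titles = sum(titles_by_class.values())
    total_use = sum(use_by_class.values())
    gaps = {}
    for lc_class, n_titles in titles_by_class.items():
        title_share = n_titles / total_titles
        use_share = use_by_class.get(lc_class, 0) / total_use
        gaps[lc_class] = use_share - title_share
    return gaps


# Hypothetical counts, purely illustrative.
titles = {"H": 200, "Q": 300, "P": 500}
sessions = {"H": 400, "Q": 200, "P": 400}

gaps = use_vs_expected(titles, sessions)
for lc_class, gap in sorted(gaps.items()):
    label = "better than expected" if gap > 0 else "worse than expected"
    print(f"{lc_class}: {gap:+.2f} ({label})")
```

The same function could be run separately for views and downloads, since the talk treated each use metric on its own.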
How about breadth versus depth? This gets complicated. Better to find the slides and look at the graphs. The results map well to the predicted use outcomes.
Can we determine the level of immersion in a book? If more pages are viewed per session in a subject area, does that mean the users spend more time reading or just look at more pages? Medicine (R), History of the Americas (F), and Technology (T) appear to be used at a much higher rate within a session than other areas, despite performing poorly in breadth versus depth assessment. In other words, they may not be used much per title, but each session is longer and involves more actions than others.
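As a rough illustration of the per-session measure, the sketch below divides page views by sessions for each LC class. The function name and the figures are hypothetical, not taken from the presentation, and the real analysis may have counted other actions as well.

```python
def views_per_session(views_by_class, sessions_by_class):
    """Average page views per session for each LC class with any sessions."""
    return {
        lc_class: views_by_class.get(lc_class, 0) / n_sessions
        for lc_class, n_sessions in sessions_by_class.items()
        if n_sessions > 0  # skip classes with no sessions to avoid dividing by zero
    }


# Hypothetical counts, purely illustrative.
views = {"R": 900, "P": 1200, "Z": 0}
sessions = {"R": 100, "P": 400, "Z": 0}

intensity = views_per_session(views, sessions)
for lc_class, value in sorted(intensity.items()):
    print(f"{lc_class}: {value:.1f} page views per session")
```

In this made-up example, R has fewer sessions than P but more page views in each one, which is the pattern described for Medicine.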
How do we use these observations to build better collections and better serve our users?
Books with call numbers tend to be used more than those without. Is it because a call number is indicative of better metadata? Is it because publishers of better quality will provide better metadata? It’s hard to tell at this point, but it’s something he wants to look into.
A white paper is coming soon and will include a combined data set. It will also include the EBL data about how long someone was in a book in a session. Going forward, he will also look into LC subclasses.