ER&L 2016: Hard Data for Tough Choices: eBooks and pBooks in Academic Libraries

"ebooks" by Randy Rodgers

Speakers: Katherine Leach and Matthew Connor Sullivan, Harvard

eBooks have not supplanted pBooks. Providing access to both formats is not possible…even for Harvard.

Users really do want and use both. There is a need for a better understanding of user behavior for both formats.

In 2014, they purchased the complete Project Muse collection, which included a significant and intentional overlap with their print collection. This allowed for a deep comparison and analysis.

You cannot compare them directly in a meaningful way. There are many ways of counting eBook use, and pBook use is notoriously undercounted. They looked at whether or not a book was used, whether it was used in only one format or in both, and then how that compared to the average use across the collection.

26% of titles were used in both formats over the time period, but only 0.5% in any given month. It’s sometimes suggested that eBooks are used for discovery, but even at the monthly level this is not reflected in the data. The pattern of use for each format is generally about the same over the semester, though eBook use tends to lag a little behind pBook use. But, again, it’s difficult to get precise patterns of eBook use from monthly reports. There were no significant differences in format use by subject classification, imprint year, or publisher, particularly when factoring in the number of titles in each category.
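
As a rough illustration of the overlap calculation described above, here is a minimal pandas sketch; the tables and column names are invented for illustration, not Harvard's actual data.

```python
import pandas as pd

# Invented per-title usage tables; column names are assumptions, not the actual data.
print_use = pd.DataFrame({"title_id": [1, 2, 3, 4], "print_circs": [3, 0, 1, 0]})
ebook_use = pd.DataFrame({"title_id": [1, 2, 3, 4], "ebook_uses": [5, 2, 0, 0]})

usage = print_use.merge(ebook_use, on="title_id")
usage["used_print"] = usage["print_circs"] > 0
usage["used_ebook"] = usage["ebook_uses"] > 0

both = (usage["used_print"] & usage["used_ebook"]).mean()
either = (usage["used_print"] | usage["used_ebook"]).mean()
print(f"Used in both formats: {both:.1%}; used in at least one format: {either:.1%}")
```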

They looked at the average decrease in a pBook’s circulation over a four-year period. They found a 35% decrease in circulation for each imprint year over that time, and this is without any impact from eBooks. This is not always factored into these kinds of studies. The decrease grows to 54% when eBooks are added to the mix. There’s also the issue of print use decreasing generally, with monographs losing out to eresources in student and faculty citation studies.
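
One way a year-over-year decline like the 35% figure could be computed from circulation counts; the numbers and column names below are invented for illustration.

```python
import pandas as pd

# Invented circulation totals by imprint year over a four-year window.
circs = pd.DataFrame({
    "imprint_year": [2010, 2011, 2012, 2013],
    "circulations": [400, 260, 170, 110],
})
circs["pct_change"] = circs["circulations"].pct_change()
avg_decrease = -circs["pct_change"].mean()
print(f"Average year-over-year decrease in circulation: {avg_decrease:.0%}")
```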

HSS at Harvard has been very clear that they want to continue the print collection at the level it has been, but they also want electronic access. How do we work with publishers to advocate for electronic access without having to purchase the book twice?

Audience Q&A:
What about providing short term loan access for the first 3-4 years? Harvard doesn’t like to purchase eBooks they don’t have perpetual access to.

P&E has been available for journals, why not books? Some publishers have worked with them to give deep discounts on print with an eBook package.

What has been the impact of electronic reserves on use? Haven’t looked at it.

How do you know if someone looked at the eBook and determined they didn’t want/need and that is why the pBook wasn’t used? Hard to determine. They don’t use eBook usage to drive the print acquisition — usually they already have the pBook.

Considering the lifecycle and the decrease in use over a short period of time from imprint year, does that cause you to question the purchase of eBook backfiles? eBook use over that time didn’t seem to decrease as significantly as the pBook.

ER&L 2016: Collections and Use

"Infographics" by AJ Cann

The Bigger Picture: Creating a Statistics Dashboard That Ties Collection Building to Research
Speaker: Shannon Tharp, University of Wyoming

How can they tie collection building efforts to the university’s research output? They need to articulate value to stakeholders and advocate for budget increases.

She used Tableau to develop the dashboard and visualizations. Started with a broad overview of collections and then have expanded from there. The visualizations include a narrative and an intuitive interface to access more information.

The dashboard also includes qualitative interviews with faculty and research staff. They are tentatively calling this “faculty talk” and plan to have it up soon, with interviews displayed on rotation. They are thinking about including graduate and undergraduate student interviews as well.


(e)Book Snapshot: Print and eBook Use in an Academic Library Consortium
Speaker: Joanna Voss, OhioLINK

What can we do to continue to meet the needs of students and faculty through the print to electronic book transition? Are there any patterns or trends in their use that will help? Anecdotally we hear about users preferring print to electronic. How do we find data to support this and to help them?

They cleaned up the data using Excel and OpenRefine, and then used Tableau for the analysis and visualization. OpenRefine is good for really messy data.
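
The speakers used Excel and OpenRefine; as a rough analogue, here is a small pandas sketch of the kind of normalization messy consortial exports typically need (the field names and values are invented).

```python
import pandas as pd

# Invented messy export; real consortial data and field names will differ.
raw = pd.DataFrame({
    "isbn": [" 978-0-12-345678-6", "9780123456786", None],
    "title": ["Intro to Data ", "intro to data", "Another Title"],
})

clean = raw.copy()
clean["isbn"] = clean["isbn"].str.replace(r"[^0-9Xx]", "", regex=True)  # keep digits/X only
clean["title"] = clean["title"].str.strip().str.lower()                 # normalize case and spacing
clean = clean.drop_duplicates(subset=["isbn", "title"])                 # collapse duplicate rows
print(clean)
```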


A Brief History of PaperStats
Speaker: Whitney Bates-Gomez, Memorial Sloan Kettering Cancer Center

Web-based tool for generating cost-per-use reports. It’s currently in beta and only working with JR1 reports. It works most of the time for COUNTER and SUSHI reports, but not always. The costs function requires you to upload the costs in a CSV format, and they were able to get that data from their subscription agent.
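
PaperStats automates a simple calculation; here is a minimal sketch of the same cost-per-use arithmetic, assuming a JR1-style usage export and a cost CSV from a subscription agent (the file and column names are hypothetical).

```python
import pandas as pd

# Hypothetical inputs: annual full-text totals from a JR1 report and costs from an agent.
usage = pd.read_csv("jr1_2015.csv")    # assumed columns: journal, total_fulltext
costs = pd.read_csv("costs_2015.csv")  # assumed columns: journal, cost

cpu = usage.merge(costs, on="journal")
cpu["cost_per_use"] = cpu["cost"] / cpu["total_fulltext"]
print(cpu.sort_values("cost_per_use", ascending=False).head(10))
```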

Unfortunately, it’s going away at the end of the spring, though a revised version may appear some day. It was offered through PubGet, and Copyright Clearance Center decided not to renew its support.

ER&L 2016: Agents of Change: The Ongoing Challenges of Managing E-books and Streaming Media

"change" by samantha celera

Presenters: Steven R. Harris and Molly Beisler, University of Nevada, Reno

Evolution doesn’t happen in slow increments. Moments of punctuation happen quite suddenly. Ebooks are kind of like that in the evolution of the book.

In 2005, they were putting all formats on one record, manually updating the electronic content. As the quantity of ebooks increased and the various licensing terms expanded, they struggled to maintain this. In 2008, they began batch loading large numbers of eresource records, with one person maintaining QA and merging records.

Then discovery services came in like an asteroid during the dinosaur age. They finally shifted from a single record to separate records. They began tracking/coding ebooks to distinguish DDA from purchased titles, and expanded the ERM to track simultaneous user (SU) limits and other license terms. This also prompted another staff reorganization.

They developed workflows via Sharepoint for new eresources depending on what the thing was: subscriptions/standing orders, one-time purchases with annual fees, and one-time purchases without annual fees. The streaming video packages fit okay in this workflow.

Streaming media has more complex access and troubleshooting issues. Platforms are variable, and plugins may not be compatible. There are also many different access models (DDA, EBA), and many come with short-term licenses. They feel the organizational structure can support them as they figure out how to manage these.

They use a LibAnswers queue to track the various eresources problems that come up.

Reiteration of the current library technology climate for eresources, with various challenges. No solutions.

The future comes with new problems due to next-gen ILS and their workflow quirks. With the industry consolidation, will everybody’s products work well with each other or will it become like the Amazon device ecosystem? Changing acquisitions models are challenges for collection development.

Be flexible. Do change. Agents.

ER&L 2016: Lightning Talks

Eric Frierson, EBSCO
Demoing a mobile interface called “Launch Pad”. Search results can be swiped left/right to select the things you want, which you can then email to yourself. The email includes persistent links, as well as other search suggestions and possibly a tutorial video as a kick-starter for the project you began a preliminary search on. The plan is that the code will eventually be released on GitHub.

Letitia Mukherjee, Elsevier
ScienceDirect APIs for institutional repositories, which helps with the metadata and embargoes for hosting final versions of articles. Looking for pilot institutions for embedding accepted manuscripts.

Bonnie Tijerina, IdeaDrop
Looking to take what they’ve been doing for the past four years and expand it. Would like support via the NewsChallenge.org site.

?, ?
Developed an “adapter” for digital files that can be retrieved and played/viewed on an open source tool.

Todd Carpenter, NISO
There is a standard for technical reports which is entirely print-based. They need people who are interested in taking this specification and modernizing it, but no one has stepped up yet. There is another standard on something something mono-lingual controlled vocabularies — linked data — but they aren’t getting the volunteers in place to bring the 2005 revision up to modern practices and technologies. If you are interested in technology, revise the old standards!

Todd Carpenter’s beard, NISO
Looking to develop a standard/best practice for text and data mining. Another project regarding the ebook reading experience and how libraries can manage that through API. Another thing about sharing human subjects’ data while maintaining privacy. Open to other ideas about things/problems/issues that need to be resolved.

Lana Zental, California Digital Library
Hiring a new data analyst.

Kate Sudowsky, ?
Plug for Usus.

ER&L 2016: Trying Something New: Examining Usage on the Macro and Micro Levels in the Sciences

"Cheaper by the yard" by Bill Smith

Speakers: Krystie (Klahn) Wilfon, Columbia University; Laura Schimming and Elsa Anderson, Icahn School of Medicine at Mount Sinai

Columbia has reduced their print collection in part due to size, but more because their users prefer electronic collections. Wilfon has employed a systematic collection of cost and data over time, a series of analysis templates based on item type and data source, and an organized system of distributing the end product. [She uses similar kinds of metrics I use in my reports, but far more data-driven and detailed. She’s only done this for two years, so I’m not sure how sustainable this is. I know how much time my own reports take each month, and I don’t think I would have the capacity to add more data to them.]

Mount Sinai had a lot of changes in 2013 that changed their collection development practices. They wanted to assess the resources they have, but found that traditional metrics were problematic. Citation counts don’t factor in the resources used but not cited; journal impact factors have their own issues; etc. They wanted to include altmetrics in the assessment, as well. They ended up using Altmetrics Explorer.

Rather than looking at CPU for the journal package as a whole, she broke it up by journal title and also looked at the number of articles published per title as a percentage of the whole. This is only one picture, though. Using Altmetric Explorer, they found that the newsletter in the package, while expensive in the cost per use, had a much higher median Altmetric score than the main peer reviewed journal in the package (score divided by the number of articles published in that year). So, for a traditional journal, citations and impact factor and COUNTER usage are important, but maybe for a newsletter type publication, altmetrics are more important. Also, within a single package of journal titles, there are going to be different types of journals. You need to figure out how to evaluate them without using the same stick.
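
A small sketch of the two per-title measures described above, cost per use and Altmetric score per article published; all figures and column names are invented for illustration.

```python
import pandas as pd

# Invented per-title figures for one package; the real analysis used COUNTER data
# and Altmetric Explorer exports.
pkg = pd.DataFrame({
    "title": ["Peer-reviewed journal", "Newsletter"],
    "cost_share": [12000, 3000],
    "fulltext_uses": [6000, 400],
    "altmetric_score_total": [900, 700],
    "articles_published": [300, 60],
})
pkg["cost_per_use"] = pkg["cost_share"] / pkg["fulltext_uses"]
pkg["altmetric_per_article"] = pkg["altmetric_score_total"] / pkg["articles_published"]
print(pkg[["title", "cost_per_use", "altmetric_per_article"]])
```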

ER&L 2016: COUNTER Point: Making the Most of Imperfect Data Through Statistical Modeling

"score card" by AIBakker

Speakers: Jeannie Castro and Lindsay Cronk, University of Houston

Baseball statistics are a good place to start. There are over 100 years of data. Cronk was wishing that she could figure out a WAR (wins above replacement) for eresources. What makes a good/strong resource? What indicators besides usage performance should we evaluate? Can statistical analysis tell us anything?

Castro suggested looking at the data as a time series. Cronk is not a statistician, so she relied on a lot of other folks who can do that stuff.

Statistical modeling is the application of a set of assumptions to data, typically paired data. There are several techniques that can be used. COUNTER reports are imperfect time series data sets: they don’t give us individual data points (day/time), only monthly clumps. Aside from this, they are good for time series, since the data points are consistently measured and equally spaced in time.

Decomposition provides a framework for segmented time series. Old data can be checked against newer data (e.g., 2010-2013 compared to 2014) without having to predict the future. Statistical testing is important in this. Exponential smoothing eliminates noise/outliers, and is very useful for anomalies in your COUNTER data due to access issues or unusual spikes.
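
The speakers did this work in R; below is a rough pandas analogue of the same two moves, pulling a trend out of a monthly COUNTER-style series and smoothing out spikes, plus a simple back-test of older data against a newer year (all numbers are synthetic).

```python
import numpy as np
import pandas as pd

# Synthetic monthly usage for 2010-2014; real figures would come from COUNTER reports.
idx = pd.date_range("2010-01-01", periods=60, freq="MS")
rng = np.random.default_rng(0)
usage = pd.Series(500 + 5 * np.arange(60) + rng.normal(0, 40, 60), index=idx)

trend = usage.rolling(window=12, center=True).mean()  # crude trend component
residual = usage - trend                              # noise left over after the trend
smoothed = usage.ewm(span=6).mean()                   # exponential smoothing damps spikes

# Back-test: compare the level fitted on 2010-2013 with observed 2014 usage.
train, test = usage.loc[:"2013"], usage.loc["2014":]
print(round(train.ewm(span=6).mean().iloc[-1], 1), "vs observed 2014 mean", round(test.mean(), 1))
```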

Cronk really wanted to look at something other than cost per use, which was part of the motivation to do this. Usage by collection portion size is another method, touted by Michael Levine-Clark. She needed 4+ years of usage history for reverse predictive analysis. Larger numbers make analysis easier, so she went with large aggregator databases for the DB reports and some large journal packages for the JR reports.

She used Excel for data collection and clean-up, R (RStudio) for data analysis, and Tableau Public for data visualization. RStudio is a lot more user-friendly than the base R console, and there are canned analysis packages that will do the heavy lifting. (There was a recommendation for Ryan Womack’s video series for learning how to use R.) Tableau helped with visualization of the data, including some predictive indicators. We cannot see trends ourselves, so these visualizations can help us make decisions. She found that usage can be predicted based on the past.

They found that usage over time is consistent across the vendor platforms (for journal usage), even though some were used more than others.

The next level she looked at was the search to session ratio for databases. What is the average? Is that meaningful? When we look at usage, what is the baseline that would help us determine if this database is more useful than another? Downward trends might be indicators of outside factors.
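
A trivial sketch of the search-to-session comparison, assuming DB1-style annual totals (the database names and numbers are invented).

```python
import pandas as pd

# Invented DB1-style annual totals; column names are assumptions.
db = pd.DataFrame({
    "database": ["Aggregator A", "Aggregator B", "Aggregator C"],
    "searches": [120000, 45000, 9000],
    "sessions": [30000, 20000, 6000],
})
db["searches_per_session"] = db["searches"] / db["sessions"]
db["vs_baseline"] = db["searches_per_session"] - db["searches_per_session"].mean()
print(db)
```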

ER&L 2016: Finding Time: From Industrial Mythology to Chronemic Literacy

"Time" by cea +

Speaker: Dawna Ballard, Moody College of Communication, University of Texas at Austin

She studies human interaction, particularly the symbols we use in communication. She studies the lived experience of time beyond what is on the clock.

Time has been called the “silent language”. What we see is the tip of the iceberg. It’s not only non-verbal (looking at watch, tapping toes), it’s also full of deeply hidden assumptions that are masked. These hidden assumptions are often seen as truth, and each culture has their own interpretation/approach. We need to develop a chronemic literacy.

Industrial time is visible through the clock. There are a lot of hidden assumptions behind that. For one, it’s not even in tune with the Earth (see also: leap year). Three basic hidden assumptions of industrial time are that people work a lot like machines, that all times are the same, and that we can control the people and events around us. We think this is the way it’s always been, but in fact there are many other ways of orienting to time. This is the chronos aspect of orienting to time.

Pre-industrial and post-industrial time have more in common with each other than with industrial time. Pre-industrial time was based on “the event” (i.e. farming). Assumptions: people work nothing like machines, all times are not the same, and life unfolds through the people and events around us. This is the kairos aspect of time.

The industrial mythology comes with three related myths.

The first one is that better time management skills and tools will make you more productive — the right app will change your life. Time-management originated with factory work, and was wildly successful in that environment. It doesn’t function so well in the office work of today. The reality is that time management is not related to productivity. All it does is help you feel that things are being managed, which is good if that makes you feel happier about your work. It will not solve your time problems. Productivity is a long-term proposition — what is sustainable for you?

The second one is that if you love what you’re doing, it doesn’t feel like work. (“That’s bullshit.”) Be wary of language that tries to mask work as something else. There are still human limits to work, and no matter how much you enjoy it, you can’t do it all the time forever. Focusing on balance can create unending frustration. Lower-wage workers often don’t even know what this means, or assume it’s just for managers.

The third is that focusing on work-life balance will solve your problems. Balance is something that machines do, and it doesn’t really apply to human beings. “Work” and “life” as separate terms don’t appear until the 1960s, and that was about industrial work. Life should be in our work — our lives are a lot of work. We think that if we can find work/life balance, our lives will be centered and at peace. Work has never looked or felt like that. We end up holding on to one or two things that are “necessary”, usually work, and getting those done to the detriment of the others.

Consider alignment. Being mindful of our alignment is being aware of all the interrelated parts that are needed to move forward. When all the parts work together, we get an efficiency of movement. We cannot let something stay out of alignment for too long without expecting repercussions. We can get help from experts (therapists) and support networks (family/friends), and it’s important for our long-term sustainability.

What are your hidden assumptions? What are the things you are thinking about or not? What are the things you believe that are shaping your hidden assumptions? What might be impeding the alignment you would like to achieve?

NASIG 2015 – Building a Social Compact for Preserving E-Journals

LOCKSS Cat

Speaker: Anne Kenney, University Librarian, Cornell University

Thirty years ago when NASIG began, we weren’t worrying about the preservation of ejournals. Now we are. The digital-first ecology disrupts traditional roles and responsibilities, with publishers now being responsible for preserving journal content rather than libraries. We’re still trying to figure out how to manage the preservation and access.

60% of Cornell’s collections expenditures goes to online resources. An Ithaka survey shows that most institution types spend more on online and ongoing resources than on any other collection format. The same survey found that Doctoral institutions are far more interested in preservation of online materials than Masters and Baccalaureate schools.

A study of library directors identified several issues for libraries:

  • sense of urgency
  • need for trusted independent archiving
  • content coverage and access conditions
  • resource commitment and competing priorities
  • need for collective response

There was a need for a non-profit preservation program separate from publisher projects, with a focus on scholarly journals. Portico, Scholar’s Portal, and CLOCKSS are the three main programs still existing that meet the needs of ejournal preservation. They are being supported to varying degrees by ARLs.

The coverage in these three programs is uneven and it’s difficult to create a definitive list. The major publishers are represented, and there is significant duplication across the services. She’s not losing sleep over the preservation of Elsevier journals, for example. STM literature in English is very well preserved.

The Keepers Registry attempts to gather and maintain digital content information from repositories archiving ejournals. KBART could be useful for keeping this data clean and updated.

2CUL did a study in 2011 to see how well their content was being preserved in LOCKSS and/or Portico, and only 13-16% of their titles were preserved. Most are those that have ISSNs or eISSNs, which is only about half of the titles held by the schools. They expanded to include Duke in 2012 and looked at all the preservation sources in the Keepers Registry. Only 23-27% of the ejournals with e/ISSNs were preserved, and there was considerable redundancy across the preservation programs.

Vulnerable content that is not being preserved well includes third-party content, aggregator content, small publishers, open access titles, and historical publications. They are looking to create some best practices for OA journal preservation.

The preservation programs need better coordination to identify what redundancy is necessary and how to incorporate more unique content. Right now, they see themselves more as competitors than collaborators, and that needs to change.

All of the scholarly record must be preserved, and it’s the responsibility of everyone in the scholarly communication world, not just libraries. Much of the content is at risk and no one can do this alone. No single archiving program will meet all needs, and they need more transparency and cooperation. License terms are inadequate for most preservation needs, and maybe we need legislation to cover legal deposits. We need clearer and broader triggers for when we can access the preserved content (there is a concern for the long-term financial sustainability of a dark archive).

Libraries need to acknowledge this is a shared responsibility, regardless of size and type of library. Publishers are not the enemies in this domain. Participate in at least one initiative. Move beyond a registry of archived publications to identify at-risk materials critical to scholarship.

Publishers need to enter into relationships with one or more ejournal archiving programs. Provide adequate information and data to archivers on coverage. Extend liberal archiving rights in license agreements, and consider new terms for access.

Archiving programs need to expand coverage to include vulnerable materials. Be explicit about coverage, and secure access rights.

NASIG can raise awareness of this issue. Endorse the power of collective action. Consider a set of principles and actions, such as the KBART standard and revising SERU to include better terms for archiving. Foster international cooperation with other organizations and funding bodies.

NASIG 2015 – Somewhere To Run To, Nowhere To Hide

information wants to be free?

Speaker: Stephen Rhind-Tutt, President, Alexander Street Press

His perspective is primary source collections, mostly video and audio, created by a small company of 100 or so people.

There are billions and trillions of photos, videos, and audio files being added to the Internet every year, and it’s growing year over year. We’re going to need a bigger boat.

He reviewed past presentations at NASIG, and there are recurring nightmares: OA replacing publishers, Wikipedia replacing reference sources, vendors bypassing libraries and going direct to faculty, online learning replacing universities, etc.

All technologies evolve and die. Many worry about the future, many hold onto the past, and we’re not responding quickly enough to the user. Dispense with the things that are less relevant. Users don’t want to search, they want to find.

You can project the future, and not just by guessing. You don’t have to know how it’s going to happen, but you can look at what people want and project from that.

Even decades after the motor car was developed, we were still framing it within the context and limitations of the horse-drawn carriage. We’re doing that with our ebooks and ejournals today. If we look to the leaders in the consumer space, we can guess where the information industry is heading.

If we understand the medium, we can understand how best to use it. Louis Kahn says, “Honor the material you use.” The medium of electronic publications favors small pieces (articles, clips) and is infinitely pliable, which means it can be layered and made more complex. Everything is interconnected with links, and the links are more important than the destination. We are fighting against the medium when we put DRM on content, limit the simultaneous use, and hide the metadata.

“I don’t know how long it will take, but I truly believe information will become free.”

Video is a terrible medium for information if you want it fast — the content of 30 minutes of video can be read in 5 minutes. ASP has noticed that use of the text content is on par with use of the associated video content.

Mobile is becoming very important.

Linking — needs to work going out and coming in. The metadata for linking must be made free so that it can be used broadly and lead users to the content.

The researcher wants every piece of information created on every topic for free. From where he is as a publisher, he’s seeing better content moving more and more to open access. And, as a result of that, ASP is developing an open music library that will point to both fee and free content, to make it shareable with other researchers.

In the near future, publishers will be able to make far more money developing the research process ecosystem than by selling one journal.

NASIG 2015 – Ain’t Nobody’s Business If I Do (Read Serials)

Speaker: Dorothea Salo, Faculty Associate, University of Wisconsin – Madison

Publishers and providers are collecting massive amounts of user data, and Salo is not happy about this. ALA code of ethics is not happy about this, either.

Why does privacy matter?

The gizmos that have ticked along for ages without being connected are now connected to the internet. It can be very handy, like smart thermostats, or a little too snoopy like the smart TV that listens in on your conversations. The FTC is scrutinizing the Internet of Things very closely, because it’s easy to cause some real harm with the data from these devices.

Thermostat data, for example, tells you a lot about when someone is at home or not, which can be useful for thieves, law enforcement, and marketers. And this is information that wasn’t available when the thermostat was offline.

Eresource use is being snooped on, too. Adobe has been collecting reader behavior information from Adobe Digital Editions, even when the content comes from library sources. They got caught because they were transmitting that information unencrypted. They have since fixed the encryption, but they haven’t stopped collecting the data.

Readers cannot trust content providers. Librarians cannot trust content providers. We have to assume you’re behaving like Adobe, until you prove otherwise. It’s easy, then, to lump eresources into the Internet of Things. Back in the day, journals and books weren’t online, but now they are ways to collect data on reader behavior.

Generally speaking, content providers have very little out there in a code of practice for reader privacy, including the relevant associations. Not even the open access publications and associations. Most journal privacy policies do not measure up to library standards, including those that are OA. 16 of the top 20 research journals let ad networks track readers.

There’s no conspiracy theory here. It’s mostly accidental. In the age of print, reader privacy wasn’t an issue. Readers could do whatever they wanted with the content. Content providers need to address this now that they are capable of collecting and using all sorts of data they couldn’t before.

NISO is working on a framework for this, and the NASIG community needs to be engaged.

The ALA code of ethics doesn’t make an exception for collecting data when it’s convenient — there are no exceptions. The same goes for “improving services”.

The question, “Would we do this in a physical space with people around us?” is a useful gauge of the creep factor. Physical library users and digital library users should have the same privacy rights.

It’s easy to feel helpless in this. It’s easy to give up and think no user cares about their privacy. Just because it’s easy and convenient to ignore privacy, that doesn’t make it right.

Libraries and content providers need to live up to Article III of the ALA Code of Ethics: “…protect each reader’s right to privacy and confidentiality with respect to information sought or received and resources consulted, borrowed, acquired or transmitted.”

How do we do this? Understand the risks and mitigate them. The risks: personally identifying information (the focus on PII is sometimes used as a smoke screen to hide what else is being collected), long tail information (uncommon enough to identify individuals, even without PII), and behavior trails (highly specific time stamps, etc.). Libraries deal with this by tracking the stuff instead of the people. Libraries keep proxy server logs only long enough to identify use that violates the terms of service.
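
One hedged illustration of “tracking the stuff instead of the people”: aggregate log lines by resource and discard user identifiers and timestamps before anything is retained (the log layout and file names here are invented, not any particular proxy’s format).

```python
import csv
from collections import Counter

# Hypothetical log layout: timestamp, user_id, resource_url. Real proxy logs differ.
use_by_resource = Counter()
with open("proxy_log.csv", newline="") as fh:
    for timestamp, user_id, resource_url in csv.reader(fh):
        use_by_resource[resource_url] += 1  # keep the what, drop the who and the when

with open("monthly_use.csv", "w", newline="") as fh:
    csv.writer(fh).writerows(use_by_resource.most_common())
```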

Determine who wants to know and why: data omnivores (NSA, Google, Facebook), data opportunists (academic researchers, usability wonks, assessment experts, readers who want to reuse their own data), and data paparazzi (doxxers, stalkers, politicians). Worry less about the opportunists and omnivores, worry a lot about the paparazzi.

What should we do or not do? No ostriching — heads out of the sand, please. The Library Freedom Project has lots of resources. Industry-level advocacy is needed — those who take the high road on privacy are afraid of being out-competed by those who don’t.

We’re not helpless. Don’t give up. License negotiation time is when we can ask the hard questions — use our Benjamins wisely. Assess mindfully, being aware of data leakage and compromised privacy.

Not even the greediest data omnivore, the most clueless data opportunist, or the most evil data paparazzi can abuse data that isn’t there. Don’t collect reader data unless there is a clear and reasonable reason to do it.
