NASIG 2010: What Counts? Assessing the Value of Non-Text Resources

Presenters: Stephanie Krueger, ARTstor and Tammy S. Sugarman, Georgia State University

Anyone who does anything with use statistics or assessment knows why use statistics are important and the value of standards like COUNTER. But how do we count the use of non-text content that doesn’t fit the categories of download, search, session, etc.? What does it mean to “use” these resources?

The libraries surveyed that collect use stats for non-text resources mainly use them to report to administrators and to determine renewals. A few use them to evaluate the success of training or to promote the resource to the user community. More than a third of the respondents indicated that the stats they have do not adequately meet their needs for the data.

ARTstor approached COUNTER and asked that the technical advisory group include representatives from vendors that provide non-text content such as images, video, etc. Currently, the COUNTER reports are either about Journals or Databases, and do not consider primary source materials. One might think that “search” and “sessions” would be easy to track, but there are complexities that are not apparent.

Consider the Database 1 report. With a primary source aggregator like ARTstor, who is the “publisher” of the content? For ARTstor, search is only 27% of the use of the resource. 47% comes from image requests (thumbnail, full-size, printing, download, etc.) and the rest (about 26%) comes from software utilities within the resource (creating course folders, creating passwords, organizing folders, annotating images, emailing content/URLs, sending information to bibliographic management tools, etc.).

The missing metric is the non-text full content unit request (e.g., view, download, print, email, stream). There needs to be some way of measuring this that is equivalent to the full-text download of a journal article. Otherwise, cost-per-use analysis is skewed.
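
To make the skew concrete, here is a minimal sketch of a cost-per-use calculation. The dollar amount and the counts are invented for illustration (they are not from the presentation), and `cost_per_use` is a hypothetical helper name. It compares dividing the subscription cost by searches, which is all a Database report offers, with dividing it by content unit requests, which is closer to the journal full-text download.

```python
def cost_per_use(annual_cost: float, uses: int) -> float:
    """Divide the annual cost by the number of counted uses."""
    if uses <= 0:
        raise ValueError("uses must be a positive count")
    return annual_cost / uses


# Hypothetical figures for illustration only; none of these come from the talk.
annual_cost = 10_000.00
searches = 2_700                # what a search-based report would capture
content_unit_requests = 4_700   # views, downloads, prints, emails, streams

per_search = cost_per_use(annual_cost, searches)
per_request = cost_per_use(annual_cost, content_unit_requests)

print(f"cost per search:               ${per_search:.2f}")
print(f"cost per content unit request: ${per_request:.2f}")
```

With these made-up numbers the resource looks noticeably more expensive per use when only searches are counted, which is the distortion the presenters were pointing at.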

What is the equivalent of the ISSN? Non-text resources don’t even have DOIs assigned to them.

On top of all of that, how do you measure the use of these resources beyond the measurable environment? For example, once an image is downloaded, it can be included in slides and webpages for classroom use more than once, but those uses are not counted. ARTstor doesn’t use DRM, so they can’t track that way.

No one is really talking about how to assess this kind of usage, at least not in the professional library literature. However, the IT community is thinking about this as well, so we may be able to find some ideas/solutions there. They are being asked to justify software usage, and they have the same lack of data and limitations. So, instead of going with the traditional journal/database counting methods, they are attempting to measure the value of the services provided by the software. The IT folk identify services, determine the cost of those services, and identify benchmarks for those costs.

A potential report could have the following columns: collection (e.g., an art collection within ARTstor, or a university collection developed locally), content provider, platform, and then the use numbers. This is basic, and can increase in granularity over time.
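
As a rough sketch of what such a report might look like as data, the snippet below writes the four columns to CSV. The field names, the `UsageRow` class, and the sample rows are all assumptions made up for illustration; this is not a COUNTER report format.

```python
import csv
import io
from dataclasses import asdict, dataclass

# Column names are hypothetical, taken loosely from the columns listed above.
FIELDS = ["collection", "content_provider", "platform", "uses"]


@dataclass
class UsageRow:
    collection: str
    content_provider: str
    platform: str
    uses: int


def to_csv(rows: list[UsageRow]) -> str:
    """Serialize usage rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))
    return buffer.getvalue()


# Invented sample rows; the names and counts are placeholders.
rows = [
    UsageRow("Example Art Collection", "Example Museum", "Example Platform", 120),
    UsageRow("Local University Collection", "Example University", "Example Platform", 45),
]

report = to_csv(rows)
print(report)
```

Adding granularity later (say, splitting uses into views, downloads, and streams) would mean adding columns rather than redesigning the report.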

There are still challenges, even with this report. Time-based objects need to have a defined value of use. Resources like data sets and software-like things (e.g., SciFinder Scholar) are hard to define as well. And it will be difficult to define a one-size-fits-all report.

Ithaka’s What to Withdraw tool

Have you seen the tool that Ithaka developed to determine what print scholarly journals you could withdraw (discard/store) that are already in your digital collections? It’s pretty nifty for a spreadsheet. About 10-15 minutes of playing with it and a list of our print holdings gave me a list of around 200 or so actionable titles in our collection, which I passed on to our subject liaison librarians.

The guys who designed it are giving some webinar sessions, and I just attended one. Here are my notes, for what it’s worth. I suggest you participate in a webinar if you’re interested in it. The next one is tomorrow and there’s one on February 10th as well.


Background

  • Ithaka has an organizational commitment to preservation: JSTOR, Portico, and Ithaka S+R
  • Libraries are under pressure to both decrease their print collections and to maintain some print copies for the library community as a whole
  • Individual libraries are often unable to identify materials that are sufficiently well-preserved elsewhere
  • The What to Withdraw framework is for general collections of scholarly journals, not monographs, rare books, newspapers, etc.
  • The report/framework is not meant to replace the local decision-making process

What to Withdraw Framework

  • Why do we need to preserve the print materials once we have a digital version?
    • Fix errors in the digital versions
    • Replace poor quality scans or formats
    • Inadequate preservation of the digital content
    • Unreliable access to the digital content
    • Also, local politics or research needs might require access to or preservation of the print
  • Once they developed the rationales, they created specific preservation goals for each category of preservation and then determined the level of preservation needed for each goal.
    • Importance of images in journals (the digitization standards for text are not the same as for images, particularly color images)
    • Quality of the digitization process
    • Ongoing quality assurance processes to fix errors
    • Reliability of digital access (business model, terms & conditions)
    • Digital preservation
  • Commissioned Candace Yano (an operations researcher at UC Berkeley) to develop a model of how many copies are needed to meet the preservation goals, assuming an annual loss rate of 0.1% for a dark archive.
    • As a result, they found they needed only two copies to have >99% confidence that they will still have copies remaining in twenty years.
    • As a community, this means we need to be retaining at least two copies, if not more.
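
For a feel of why two copies clear the 99% bar, here is a back-of-the-envelope sketch. It is not the commissioned model: it assumes each copy is lost independently at a constant 0.1% per year, which is my simplification, and the function name is hypothetical.

```python
def prob_at_least_one_survives(copies: int, annual_loss_rate: float, years: int) -> float:
    """Probability that at least one copy remains after the given number of years.

    Simplifying assumption (not from the webinar): copies are lost
    independently, each with the same constant annual loss rate.
    """
    survive_one = (1.0 - annual_loss_rate) ** years   # one copy lasts the whole period
    all_lost = (1.0 - survive_one) ** copies          # every copy is lost
    return 1.0 - all_lost


one_copy = prob_at_least_one_survives(copies=1, annual_loss_rate=0.001, years=20)
two_copies = prob_at_least_one_survives(copies=2, annual_loss_rate=0.001, years=20)

print(f"one copy:   {one_copy:.4f}")
print(f"two copies: {two_copies:.4f}")
```

Under these assumptions a single copy lands at roughly 98%, just short of the threshold, while a second copy pushes the figure well past 99%.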

Decision-Support Tool (proof of concept)

  • JSTOR is an easy first step because many libraries have this resource, many own print copies of the titles in the collections, and Harvard & UC already have dim/dark archives of JSTOR titles
  • The tool gives libraries the information to identify titles that are held by Harvard & UC libraries and that also have relatively few images
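
The selection logic described above can be sketched as a simple filter. Everything here is hypothetical: the field names, the image-density measure, and the threshold are invented, and I am assuming a title must be held by both archives to qualify. The actual tool is a spreadsheet and may work differently.

```python
from dataclasses import dataclass


@dataclass
class Title:
    name: str
    held_by_harvard: bool
    held_by_uc: bool
    images_per_1000_pages: float  # hypothetical measure of image density


# Hypothetical cutoff for "relatively few images"; not a value from the tool.
MAX_IMAGE_DENSITY = 5.0


def withdrawal_candidates(titles: list[Title], max_density: float = MAX_IMAGE_DENSITY) -> list[str]:
    """Return names of titles held by both archives with low image density."""
    return [
        t.name
        for t in titles
        if t.held_by_harvard and t.held_by_uc and t.images_per_1000_pages <= max_density
    ]


# Invented sample holdings for illustration.
holdings = [
    Title("Journal A", True, True, 1.2),    # archived twice, few images
    Title("Journal B", True, False, 0.8),   # only one archive holds it
    Title("Journal C", True, True, 40.0),   # image-heavy, keep the print
]

candidates = withdrawal_candidates(holdings)
print(candidates)
```

The output is only a starting list; as the background notes say, it is not meant to replace local decision-making.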

Future Plans

  • Would like to apply the tool to other digital collections and dark/dim archives, and they are looking for partners in this
  • Would also like to incorporate information from other JSTOR repositories (such as Orbis-Cascade)