reason #237 why JSTOR rocks

For almost two decades, JSTOR has been digitizing and hosting core scholarly journals across many disciplines. Currently, their servers store more than 1,400 journals from the first issue to a moving wall of anywhere from 3-5 years ago (for most titles). Some of these journals date back several centuries.

They have backups, both digital and virtual, and they’re preserving metadata in the most convertible/portable formats possible. I can’t even imagine how many servers it takes to store all of this data, much less how much it costs to do so.

And yet, in the spirit of “information wants to be free,” they are making the pre-copyright content open and available to anyone who wants it. That’s stuff published in the United States before 1923, and before 1870 for everything else. Sure, it’s not going to be very useful for some researchers who need more current scholarship, but JSTOR hasn’t been about new stuff so much as preserving and making accessible the old stuff.

So, yeah, that’s yet another reason why I think JSTOR rocks. They’re doing what they can with an economic model that is responsible, and making information available to those who can’t afford it or are not affiliated with institutions that can purchase it. Scholarship doesn’t happen in a vacuum, and innovators and great minds aren’t always found solely in wealthy institutions. This is one step towards bridging the economic divide.

NASIG 2010 reflections

When I was booking my flights and sending in my registration during the snow storms earlier this year, Palm Springs sounded like a dream. Sunny, warm, dry: all the things that Richmond was not. This would also be my first visit to Southern California, so I may be excused for my ignorance of the reality, and more specifically, the reality in early June. Palm Springs was indeed sunny, but not as dry as I expected, and far hotter.

Despite the weather, or perhaps because of the weather, NASIGers came together for one of the best conferences we’ve had in recent years. All of the sessions were held in rooms that emptied out into the same common area, which also held the coffee and snacks during breaks. The place was constantly buzzing with conversations between sessions, and many folks hung back in the rooms, chatting with their neighbors about the session topics. Not many were eager to skip the sessions and the conversations in favor of drinks/books by the pools, particularly when temperatures peaked over 100°F by noon and stayed up there until well after dark.

As always, it was wonderful to spend time with colleagues from all over the country (and elsewhere) that I see once a year, at best. I’ve been attending NASIG since I was a wee serials librarian in 2002, and this conference/organization has been hugely instrumental in my growth as a librarian. Being there again this year felt like a combination of family reunion and summer camp. At one point, I choked up a little over how much I love being with all of them, and how much I was going to miss them until we come together again next year.

I’ve already blogged about the sessions I attended, so I won’t go into those details so much here. However, there were a few things that stood out to me and came up several times in conversations over the weekend.

One of the big things is a general trend towards publishers handling subscriptions directly, and in some cases, refusing to work with subscription agents. This is more prevalent in the electronic journal subscription world than in print, but that distinction is less significant now that so many libraries are moving to online-only subscriptions. I heard several librarians express concern over the potential increase in their workload if we go back to the era of ordering directly from hundreds of publishers rather than from one (or a handful) of subscription agents.

And then there’s the issue of invoicing. Electronic invoices that dump directly into a library acquisition system have been the industry standard with subscription agents for a long time, but few publishers (I can’t think of any) are set up to deliver invoices to libraries using this method. In fact, my assistant who processes invoices must manually enter each line item of a large invoice for one of our collections of electronic subscriptions every year, since this publisher refuses to invoice through our agent (or will do so in a way that increases our fees to the point that my assistant would rather just do it himself). I’m not talking about a mom & pop society publisher; this is one of the major players. If they aren’t doing EDI, then it’s understandable that librarians are concerned about other publishers following suit.

Related to this, JSTOR and UC Press, along with several other society and small press publishers, have announced a new partnership that will allow those publishers to distribute their electronic journals on the JSTOR platform, from issue one to the current issue. JSTOR will handle all the hosting, payments, and library technical support, leaving the publishers to focus on generating the content. Here’s the kicker: JSTOR will also be handling billing for print subscriptions of these titles.

That’s right – JSTOR is taking on the role of subscription agent for a certain subset of publishers. They say, of course, that they will continue to accept orders through existing agents, but if libraries and consortia are offered discounts for going directly to JSTOR, with whom they are already used to working directly for the archive collections, then eventually there will be little incentive to use a traditional subscription agent for titles from these publishers. On the one hand, I’m pleased to see some competition emerging in this aspect of the serials industry, particularly as the number of players has been shrinking in recent years, but on the other hand I worry about the future of traditional agents.

In addition to the big picture topics addressed above, I picked up a few ideas to add to my future projects list:

  • Evaluate the “one-click” rankings for our link resolver and bump publisher sites up on the list. These sources “count” more when I’m doing statistical reports, and right now I’m seeing that our aggregator databases garner more article downloads than the sources we pay for specifically. If this doesn’t improve the stats, then maybe we need to consider whether or not access via the aggregator is sufficient. Sometimes the publisher site interface is a deterrent for users.
  • Assess the information I currently provide to liaisons regarding our subscriptions and discuss with them what additional data I could incorporate to make the reports more helpful in making collection development decisions. Related to this is my ongoing project of simplifying the export/import process of getting acquisitions data from our ILS and into our ERMS for cost per use reports. Once I’m not having to do that manually, I can use that time/energy to add more value to the reports.
  • Do an inventory of our holdings in our ERMS to make sure that we have turned on everything that should be turned on and nothing that shouldn’t. I plan to start with the publishers that are KBART participants and move on from there (and yes, Jason Price, I will be sure to push for KBART compliance from those who are not already in the program).
  • Begin documenting and sharing workflow, SQL, and anything else that might help other electronic resource librarians who use our ILS or our ERMS, and make myself available as a resource. This stood out to me during the user group meeting for our ERMS, where I and a handful of others were the experts of the group. By no means do I feel like an expert, but clearly there are quite a few people who could learn from my experience the way I learned from others before me.
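
For the cost per use reports mentioned above, the calculation itself is simple once the acquisitions and usage data are in the same place. Here is a minimal sketch in Python, assuming the ILS export has already been boiled down to a cost per title and the usage report to a download count per title. The function name, the dictionary layout, and the sample ISSNs and figures are all made up for illustration; they are not our actual data or any ERMS feature.

```python
from typing import Dict, Optional


def cost_per_use(costs: Dict[str, float], uses: Dict[str, int]) -> Dict[str, Optional[float]]:
    """Divide each title's annual cost by its recorded downloads.

    Titles with no recorded use get None instead of a number, so they
    stand out in the report rather than causing a division by zero.
    """
    report: Dict[str, Optional[float]] = {}
    for issn, cost in costs.items():
        downloads = uses.get(issn, 0)
        report[issn] = round(cost / downloads, 2) if downloads > 0 else None
    return report


if __name__ == "__main__":
    # Hypothetical sample data: placeholder ISSNs, invented costs and counts.
    sample_costs = {"1234-5678": 450.00, "2345-6789": 1200.00}
    sample_uses = {"1234-5678": 90}
    for issn, cpu in cost_per_use(sample_costs, sample_uses).items():
        label = "no recorded use" if cpu is None else f"${cpu:.2f} per use"
        print(issn, label)
```

The titles that come back with no recorded use are probably the ones worth bringing to liaisons first.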

I’m probably forgetting something, but I think those are big enough to keep me busy for quite a while.

If you managed to make it this far, thanks for letting me yammer on. To everyone who attended this year and everyone who couldn’t, I hope to see you next year in St. Louis!

NASIG 2010: It’s Time to Join Forces: New Approaches and Models that Support Sustainable Scholarship

Presenters: David Fritsch, JSTOR and Rachel Lee, University of California Press

JSTOR has started working with several university press and other small scholarly publishers to develop sustainable options.

UC Press is one of the largest university presses in the US (36 journals in the humanities, biological & social sciences), publishing both UC titles and society titles. Their prices range from $97-422 for annual subscriptions, and they are SHERPA Green. One of the challenges they face on their own platform is keeping up with libraries’ expectations.

ITHAKA is a merger of JSTOR, ITHAKA, Portico, and Aluka, so JSTOR is now a service rather than a separate company. Most everyone here knows what the JSTOR product/service is, and that hasn’t changed much with the merger.

Scholars’ use of information is moving online, and if it’s not online, they’ll use a different resource, even if it’s not as good. And, if things aren’t discoverable by Google, they are often overlooked. More complex content is emerging, including multimedia and user-generated content. Mergers and acquisitions in publishing are consolidating content under a few umbrellas, and this threatens smaller publishers and university presses that can’t keep up with the costs on a smaller scale.

The serials crisis has impacted smaller presses more than larger ones. Despite good relationships with societies, it is difficult to retain popular society publications when larger publishers can offer them more. It’s also harder to offer the deep discounts expected by libraries in consortial arrangements. University presses and small publishers are in danger of becoming the publisher of last resort.

UC Press and JSTOR have had a long relationship, with JSTOR providing long-term archiving that UC Press could not have afforded to maintain on their own. Not all of the titles are included (only 22), but they are the most popular. They’ve also participated in Portico. JSTOR is also partnering with 18 other publishers that are mission-driven rather than profit-driven, with experience at balancing the needs of academia and publishing.

By partnering with JSTOR for their new content, UC Press will be able to take advantage of the expanded digital platform, sales teams, customer service, and seamless access to both archive and current content. There are some risks, including the potential loss of identity, autonomy, and direct communication with libraries. And then there is the bureaucracy of working within a larger company.

The Current Scholarship Program seeks to provide a solution to the problems outlined above that university presses and small scholarly publishers are facing. The shared technology platform, Portico preservation, sustainable business model, and administrative services potentially free up these small publishers to focus on generating high-quality content and furthering their scholarly communication missions.

Libraries will be able to purchase current subscriptions either through their agents or JSTOR (who will not be charging a service fee). However, archive content will be purchased directly from JSTOR. JSTOR will handle all of the licensing, and current JSTOR subscribers will simply have a rider adding titles to their existing licenses. For libraries that purchase JSTOR collections through consortial arrangements, it will be possible to add title-by-title subscriptions without going through the consortium if a consortial agreement doesn’t make sense for purchase decisions. They will be offering both single-title purchases and collections, with the latter being more useful for large libraries, consortia, and those who want current content for titles in their JSTOR collections.

They still don’t know what they will do about post-cancellation access. Big red flag here for potential early adopters, but hopefully this will be sorted out before the program really kicks in.

Benefits for libraries: reasonable pricing, more efficient discovery, single license, and meaningful COUNTER-compliant statistics for the full run of a title. Renewal subscriptions will maintain access to what they have already, and new subscriptions will come with access back to the first online year provided by the publisher, which may not be volume one, but is certainly as comprehensive as what most publishers offer now.

UC Press plans to start transitioning in January 2011. New orders, claims, etc. will be handled by JSTOR (including print subscriptions), but UC Press will be setting their own prices. Their platform, Caliber, will remain open until June 30, 2011, but after that content will be only on the JSTOR platform. UC Press expects to move to online-only in the next few years, particularly as the number of print subscriptions is dwindling to the point where it is cost-prohibitive to produce the print issues.

There is some interest from the publishers to add monographic content as well, but JSTOR isn’t ready to do that yet. They will need to develop some significant infrastructure in order to handle the order processing of monographs.

Some in the audience are concerned about the cost of developing platform enhancements and other tools, mostly that these costs will be passed on in subscription prices. They will be, to a certain extent, but only in that the publishers will be contributing to the developments and they set the prices. Because it is a shared system, the costs will be spread out and will likely impact libraries no more than they have already.

One big challenge all will face is unlearning the mindset that JSTOR is only archive content and not current content.

Ithaka’s What to Withdraw tool

Have you seen the tool that Ithaka developed to determine what print scholarly journals you could withdraw (discard/store) that are already in your digital collections? It’s pretty nifty for a spreadsheet. About 10-15 minutes of playing with it and a list of our print holdings gave me a list of around 200 actionable titles in our collection, which I passed on to our subject liaison librarians.

The guys who designed it are giving some webinar sessions, and I just attended one. Here are my notes, for what it’s worth. I suggest you participate in a webinar if you’re interested in it. The next one is tomorrow and there’s one on February 10th as well.


  • They have an organizational commitment to preservation: JSTOR, Portico, and Ithaka S+R
  • Libraries are under pressure to both decrease their print collections and to maintain some print copies for the library community as a whole
  • Individual libraries are often unable to identify materials that are sufficiently well-preserved elsewhere
  • The What to Withdraw framework is for general collections of scholarly journals, not monographs, rare books, newspapers, etc.
  • The report/framework is not meant to replace the local decision-making process

What to Withdraw Framework

  • Why do we need to preserve the print materials once we have a digital version?
    • Fix errors in the digital versions
    • Replace poor quality scans or formats
    • Inadequate preservation of the digital content
    • Unreliable access to the digital content
    • Also, local politics or research needs might require access to or preservation of the print
  • Once they developed the rationales, they created specific preservation goals for each category of preservation and then determined the level of preservation needed for each goal.
    • Importance of images in journals (the digitization standards for text are not the same as for images, particularly color images)
    • Quality of the digitization process
    • Ongoing quality assurance processes to fix errors
    • Reliability of digital access (business model, terms & conditions)
    • Digital preservation
  • Commissioned Candace Yano (operations researcher at UC Berkeley) to develop a model for copies needed to meet preservation goals, with the annual loss rate of 0.1% for a dark archive.
    • As a result, they found they needed only two copies to have a >99% confidence that they will still have copies left in twenty years.
    • As a community, this means we need to be retaining at least two copies, if not more.
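
The arithmetic behind the two-copy figure can be roughed out in a few lines of Python. This is my own simplified sketch, not Yano’s actual model: it assumes each copy is lost independently at a constant 0.1% per year and ignores everything else a real operations research model would account for. The function names are made up for illustration.

```python
def prob_at_least_one_survives(copies: int, annual_loss_rate: float, years: int) -> float:
    """Chance that at least one of `copies` is still around after `years`.

    Simplifying assumption: each copy is lost independently, with the
    same fixed probability every year.
    """
    per_copy_survival = (1.0 - annual_loss_rate) ** years
    return 1.0 - (1.0 - per_copy_survival) ** copies


def min_copies_needed(annual_loss_rate: float, years: int, confidence: float) -> int:
    """Smallest number of copies that meets the confidence target."""
    if not 0.0 <= annual_loss_rate < 1.0:
        raise ValueError("annual_loss_rate must be in [0, 1)")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    copies = 1
    while prob_at_least_one_survives(copies, annual_loss_rate, years) < confidence:
        copies += 1
    return copies


if __name__ == "__main__":
    # Figures from the webinar: 0.1% annual loss, twenty years, 99% confidence.
    for n in (1, 2):
        print(n, "copies:", prob_at_least_one_survives(n, 0.001, 20))
    print("copies needed:", min_copies_needed(0.001, 20, 0.99))
```

Under those assumptions a single copy comes out at roughly 98% over twenty years, which falls short of the target, while two copies clear 99% comfortably.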

Decision-Support Tool (proof of concept)

  • JSTOR is an easy first step because many libraries have this resource, many own print copies of the titles in the collections, and Harvard & UC already have dim/dark archives of JSTOR titles
  • The tool gives libraries the information to identify titles that are held by Harvard & UC libraries and that also have relatively few images

Future Plans

  • Would like to apply the tool to other digital collections and dark/dim archives, and they are looking for partners in this
  • Would also like to incorporate information from other JSTOR repositories (such as Orbis-Cascade)


One of the big projects I’ve been working on at MPOW is preparing to shift the bound journal collection, which also includes some systematic deselection. I don’t mean cancelling subscriptions. I’m talking about weeding the journals.

We’re about to run out of space in the building with no prospects of anything new on the horizon, so for the first time in forty years, the books are being weeded. The same thing has to happen to the journals, or we’ll be out of room for them, too. As it is, some areas are so tight that several sections of a range need to be shifted in order to add a new bound volume.

We started by pulling everything that is in JSTOR. This has freed up some significant space, but there is still a bit of dead wood in the collection. With online access, we’ve noticed a precipitous drop in print usage. Whereas we used to have an entire range of shelving for reshelve-prep, we now use a single book truck, which is rarely filled. Sure, we still need the journals that are not online in some fashion, but our students would prefer to use the electronic journals with free printing than get up from the computer, find the volume, and make a not-free photocopy of an article.

Sometimes I wonder why we continue to buy print journals at all, and the answer usually is that the publisher doesn’t have a good platform for their ejournals (if they have them), or for whatever reason, they seem kind of sketchy. Still, we’ve made a lot of transitions to online only in the past couple of years, and I think that will pan out well for slowing the collection’s growth toward maximum capacity.