ER&L 2010: ERMS Success – Harvard’s experience implementing and using an ERM system

Speaker: Abigail Bordeaux

Harvard has over 70 libraries, and they are very decentralized. This implementation was for the central office that provides library systems services for all of the libraries. Ex Libris is their primary vendor for library systems, including the ERMS, Verde. They try to go with vended products and develop in-house solutions only when nothing else is available.

Success was defined as migrating data from the old system to the new one, improving workflow efficiency, providing more transparency for users, and working around any problems they encountered. They did not expect an ideal system – there were bugs in both the software and their local data. There is no magic bullet. They identified the high-priority areas and worked toward their goals.

Phase I involved a lot of project planning with clearly defined goals/tasks and assessment of the results. The team included the primary users of the system, the project manager (Bordeaux), and a programmer. A key part of planning was scoping the project (Bordeaux provided a handout of the questions they considered in this process). They had a very detailed project plan in Microsoft Project, and at the very least, listing out the details made the interdependencies clearer.

The next stage of the project involved data review and clean-up. Bordeaux thinks that data clean-up is essential for any ERM implementation or migration. They also had to think about the ways the old ERM was used and whether those uses were desirable in the new system.

The local system they had created was very close to the DLF-recommended fields, but even so, they still had several failed attempts at mapping the fields between the two systems. As a result, they settled into a cycle of extracting a small set of records, loading them into Verde, reviewing the data, and then deleting the test records from Verde. They did this several times with small data sets (10 or so records), and once they were comfortable with the results, they increased the number of records.
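The general shape of that test-load cycle might look something like the sketch below. This is only an illustration, with an in-memory dict standing in for the target system; it does not use Verde's or Aleph's actual APIs, and the record structure and validation rule are invented.

```python
# Minimal sketch of an iterative test-load cycle: load a small batch, review it,
# roll it back, and only scale up once a batch comes through cleanly.
# The dict below is a stand-in for the target ERM; nothing here is a real Verde API.

def migrate_in_batches(source_records, validate, batch_sizes=(10, 50, 250)):
    target = {}                                        # stand-in for the new ERM
    for size in batch_sizes:
        batch = source_records[:size]
        target.update((r["id"], r) for r in batch)     # test load
        problems = [r["id"] for r in batch if not validate(target[r["id"]])]
        for r in batch:                                # delete the test records
            target.pop(r["id"], None)
        if problems:
            print(f"Batch of {size}: {len(problems)} records need mapping fixes")
            return problems
        print(f"Batch of {size} loaded and reviewed cleanly")
    return []

# Invented example: records missing a title fail review.
records = [{"id": i, "title": f"Resource {i}"} for i in range(300)]
migrate_in_batches(records, validate=lambda r: bool(r.get("title")))
```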

They also did a lot of manual data entry. They were able to transfer a lot, but they couldn't do everything, and some data was not migrated because the work involved outweighed its value. In some cases, though, they did want to keep the data, so they entered it manually. To help visualize the mapping process, they created screenshots with notes showing the field connections.

Prior to this project, they were not using Aleph to manage acquisitions. So, they created order records for the resources they wanted to track. The acquisitions workflow had to be reorganized from the ground up. Oddly enough, by having everything paid out of one system, the individual libraries have much more flexibility in spending and reporting. However, it took some public relations work to get the libraries to see the benefits.

As a result of looking at the data in this project, they got a better idea of gaps in their data and of other projects to tackle regarding their resources.

Phase two began this past fall, incorporating the data from the libraries that did not participate in phase one. They now have a small group with folks representing those libraries, which is coming up with best practices for license agreements and for entering data into the fields.

ER&L 2010: Electronic Access and Research Efficiencies – Some Preliminary Findings from the U of TN Library’s ROI Analysis

Speaker: Gayle Baker, University of Tennessee – Knoxville

Phase one: Demonstrate the role of library information in generating research grant income for the institution (i.e., the university spends X amount of money on the library, which helps generate Y amount of money in grant research and support).

To do this, they emailed faculty surveys (with quantitative and qualitative questions) that included incentives to respond. They gathered university-supplied data about grant proposals and income, and included library budget information. They also interviewed administrators to get a better picture of the priorities of the university.

UIUC’s model: the number of faculty grant proposals that used the library, times the award success rate, times the average grant income, then multiplied by the grants expended and divided by the total library budget. The end result was that the model showed $4.38 in grant income for every dollar invested in the library.
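Stripped down, the model is a ratio of library-attributable grant income to library budget. A back-of-the-envelope version (skipping the grants-expended adjustment mentioned above, and using entirely invented numbers) might look like this:

```python
# Back-of-the-envelope version of the grant-income ROI model described above.
# All figures are invented for illustration; they are not UIUC's or UT's data,
# and the grants-expended adjustment is omitted for simplicity.

proposals_using_library = 1200       # faculty grant proposals that relied on library resources
award_success_rate      = 0.35       # share of those proposals that were funded
average_grant_income    = 250_000    # average income per funded grant
library_budget          = 24_000_000

grant_income_attributed = proposals_using_library * award_success_rate * average_grant_income
roi = grant_income_attributed / library_budget
print(f"${roi:.2f} in grant income for every dollar invested in the library")
```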

Phase two: Started by testing UIUC’s methodology across eight institutions in eight countries. Speaker didn’t elaborate, but went on to describe the survey they used and examples of survey responses. Interesting, but hard to convey relevance in this format, particularly since it’s so dependent on individual institutions. (On the up side, she has amusing anecdotes.) They used the ROI formula suggested by Carol Tenopir, which is slightly different than described above.

Phase three: An IMLS grant for the next three years, headed by Tenopir and Paula Kaufman; ARL and Syracuse will also be involved. They are trying to put a dollar value on things that are hard to quantify, such as student retention and success.

ER&L 2010: We’ve Got Data – Now What Do We Do With It? Applying Standards to Assess Information Resources

Speakers: Mary Feeney, Ping Situ, and Jim Martin

They had a budget cut (surprise, surprise), so they had to assess what to cut using the data they had. Complicating this was a change in organizational structure. In addition, they adopted the BYU project management model. They also had to sort out a common approach to assessment across all of the disciplines/resources.

They used their ILLs to gather stats about print resource use. They contracted with Scholarly Stats to gather their online resource stats, and for publishers/vendors not covered by Scholarly Stats, they gathered data directly from the vendors/publishers. Their process involved creating spreadsheets of resources by type and then dividing up the work of filling in the information. Potential cancellations were then provided to interested parties for feedback.

Quality standards:

  • 60% of monographs need to show at least one use in the last four years – this standard was used to apply cuts to the firm-order book budget, which reduced the flexibility to make one-time purchases with remaining funds; the book money was shifted to serial/subscription lines
  • 95% of individual journal titles need to show use in the last three years (both in-house and full-text downloads) – LJUR data was used to add to the data collected about print titles
  • dual format subscriptions required a hybrid approach, and they compared the costs with the online-only model – one might think that switching to online only would be a no-brainer, but licensing issues complicate the matter
  • cost per use of ejournal packages will not exceed twice the cost of ILL articles (a rough sketch of applying this threshold follows the list)
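Here is a minimal sketch of what applying that last threshold might look like; the ILL cost, package prices, and use counts are all invented for illustration.

```python
# Flag any ejournal package whose cost per use exceeds twice the ILL article cost,
# per the quality standard above. All figures are invented for illustration.

ILL_COST_PER_ARTICLE = 17.50   # assumed average cost of borrowing one article

packages = [
    {"name": "Package A", "annual_cost": 95_000, "full_text_uses": 4_100},
    {"name": "Package B", "annual_cost": 62_000, "full_text_uses": 1_300},
]

for p in packages:
    cost_per_use = p["annual_cost"] / p["full_text_uses"]
    flagged = cost_per_use > 2 * ILL_COST_PER_ARTICLE
    verdict = "review for cancellation" if flagged else "within the standard"
    print(f'{p["name"]}: ${cost_per_use:.2f} per use -> {verdict}')
```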

One problem with their approach was that existing procedures did not capture data about all print journals. They also need to include local document delivery requests in future analysis. And they need to better integrate the assessment of materials in aggregator databases, particularly since users are inherently lazy and will take the easiest route to the content.

Aggregator databases are difficult to compare, and often the ISSN lists are incomplete. It's also difficult to compare based on title-by-title holdings coverage – that's useful for long-term use comparison, but not for this immediate project. Other problems with aggregator databases include duplication, embargoes, and completeness of coverage of a title. They used SerSol's overlap analysis tool to get an idea of duplication, but it's a time-consuming process, so they don't plan to continue with it for all of their resources.

What if you don’t have any data or the data you have doesn’t have a quality standard? They relied on subject specialists and other members of the campus to assess the value of those resources.

ER&L 2010: Usage Statistics for E-resources – is all that data meaningful?

Speaker: Sally R. Krash, vendor

Three options: do it yourself, gather and format to upload to a vendor’s collection database, or have the vendor gather the data and send a report (Harrassowitz e-Stats). Surprisingly, the second solution was actually more time-consuming than the first because the library’s data didn’t always match the vendor’s data. The third is the easiest because it’s coming from their subscription agent.

Evaluation: review cost data; set a cut-off point ($50, $75, $100, ILL/DocDel costs, whatever); generate a list of all resources that exceed that point; use the list to determine cancellations. For citation databases, they want to see upward trends in use, not necessarily cyclical spikes that average out year-to-year (a rough illustration of that check follows).
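As a rough illustration of that trend check, something like the following would separate steady growth from spikes that average out; the titles and counts are invented.

```python
# Illustrative check for the citation-database criterion above: treat a resource
# as healthy only if use grows year over year. Titles and counts are invented.

yearly_searches = {
    "Citation DB A": [8_200, 9_100, 10_400],    # steady upward trend
    "Citation DB B": [12_000, 7_500, 11_900],   # spikes that average out
}

for name, counts in yearly_searches.items():
    upward = all(later > earlier for earlier, later in zip(counts, counts[1:]))
    print(f"{name}: {'upward trend in use' if upward else 'no clear growth, review'}")
```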

Future: Need more turnaway reports from publishers, specifically journal publishers. COUNTER JR5 will give more detail about article requests by year of publication. COUNTER JR1 & BR1 combined report – don’t care about format, just want download data. Need to have download information for full-text subscriptions, not just searches/sessions.

Speaker: Benjamin Heet, librarian

He is speaking about the University of Notre Dame's statistics philosophy. They collect JR1 full-text downloads – they're not into database statistics, mostly because federated search messes them up. Impact factors and Eigenfactors are hard to evaluate. He asks, “can you make questionable numbers meaningful by adding even more questionable numbers?”

At first, he was downloading the spreadsheets monthly and making them available on the library website. He started looking for a better way, whether that was to pay someone else to build a tool or do it himself. He went with the DIY route because he wanted to make the numbers more meaningful.

Avoid junk in, junk out: whether downloads count as HTML or PDF depends on the platform setup. Pay attention to outliers to catch spikes that might indicate unusual use by an individual. The reports often contain bad data or duplicate data.
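One simple way to surface those spikes is to compare each month against the title's median; here is a hypothetical sketch with invented numbers.

```python
# Flag months whose downloads are far above the median for that title – a crude
# way to spot the usage spikes mentioned above. All counts are invented.

from statistics import median

monthly_downloads = {
    "Journal Q": [40, 35, 52, 38, 45, 610, 41, 39, 44, 37, 48, 42],   # June spike
    "Journal R": [120, 98, 130, 115, 102, 125, 90, 88, 119, 101, 97, 110],
}

for title, counts in monthly_downloads.items():
    baseline = median(counts)
    spikes = [(month, n) for month, n in enumerate(counts, start=1) if n > 5 * baseline]
    if spikes:
        print(f"{title}: possible unusual use in month(s) {spikes}")
```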

CORAL Usage Statistics – local program gives them a central location to store user names & passwords. He downloads reports quarterly now, and the public interface allows other librarians to view the stats in readable reports.

Speaker: Justin Clarke, vendor

Harvesting reports takes a lot of time and carries some administrative cost. SUSHI is a vehicle for automating the transfer of statistics from one source to another; however, you still need to look at the data. Your subscription agent has a lot more data about the resources than just use, and can combine the two to create a broader picture of resource use.

Harrassowitz starts with acquisitions data and matches the use statistics to that. They also capture things like publisher changes and title changes. Cost per use is not as easy as simple division – packages confuse the matter.
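To illustrate why package cost per use isn't simple division: the answer depends entirely on how you allocate the package price. A generic sketch follows (this is not Harrassowitz's method, and the figures are invented).

```python
# Two ways to look at cost per use for a package – as one lump, or split across
# titles – can give very different answers. Generic sketch with invented figures;
# this is not how any particular agent actually allocates costs.

package_cost = 48_000.00
title_uses = {"Journal A": 3_900, "Journal B": 850, "Journal C": 12}

# View 1: treat the package as a single unit.
total_uses = sum(title_uses.values())
print(f"Package-level: ${package_cost / total_uses:.2f} per use")

# View 2: split the price evenly across titles, then compute per-title cost per use.
per_title_cost = package_cost / len(title_uses)
for title, uses in title_uses.items():
    print(f"{title}: ${per_title_cost / uses:,.2f} per use")
```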

High use could be the result of class assignments or of hackers/hoarders. Low use might reflect politically motivated purchases or support for a new department. You need a reference point of cost. Pricing from publishers seems to have no rhyme or reason, and your price is not necessarily the list price. Multi-year analysis and subject-based analysis look at local trends.

Rather than usage statistics, we need useful statistics.

ER&L 2010: Opening Keynote – Librarians in the Wild: Thinking About Security, Privacy, and Digital Information

Speaker: Lance Hayden, Assistant Instructor, School of Information – University of Texas

He spent six years with the CIA, after that he attended the UT iSchool, which was followed by working with Cisco Systems on computer security issues. The team he works with does “ethical hacking” – companies hire them to break into their systems to find the holes that need to be filled so that the real bad guys can’t get in.

Many of us are not scared enough. We do things online that we wouldn’t do in the real world. We should be more aware of our digital surroundings and security.

In computer security, “the wild” refers to things that happen in the real world (as opposed to the lab). In cyberspace, the wild and civilization are not separate – they are co-located. Civilization is confidentiality, integrity, and availability. We think that our online communities are entirely civilized, but we are too trusting.

The point is, if you’re not careful about keeping your virtual houses secure, then you’re leaving yourself open to anyone coming in through the windows or the basement door you never lock.

Large herds attract big predators. As more people are connected to a network or virtual house, the motivation to hit it goes up. Part of why Macs seem more secure than Windows machines is because there is a higher ROI for attacking Windows due to the higher number of users. Hacking has gone from kids leaving graffiti to organized crime exploiting users.

Structures decay quickly. The online houses we build are made of software that lives on real-world machines, and every day people are finding vulnerabilities they can exploit. Sometimes they tell the manufacturers/vendors, sometimes they don't. We keep adding more things to the infrastructure, which increases the possibility of exposing more. The software or systems that we use are not monolithic entities – they are constructed from millions of lines of code. Trying to find the mistake in a line of code is like trying to find a misplaced semicolon in War and Peace. It's more complex than “XYZ program has a problem.”

Protective spells can backfire. Your protective programs and security systems need to be kept up to date, or they can backfire. Make sure that your magic is tight. Online shopping isn't any less safe than shopping in person, because the vulnerabilities are more about what the vendor has in their system (which can be hacked) than about the connection. Your physical vendor has the same information, often on computer systems that can be hacked.

Knowledge is the best survival trait (or, ignorance can get you eaten). Passwords have been the bane of security professionals since the invention of the computer. When every single person in an institution has a password that is a variation on a template, it's easy to hack. [side note: The Help Desk manager at MPOW recommends using a personalized template and just increasing the number at the end with each required password change. D'oh!] The nature of passwords is that you can't pick one that is completely secure. What you're trying to do is have a password secure enough to dissuade all but the most persistent. Hayden suggests using phrases, replacing some characters with numbers, and making the password longer, because length increases the number of possibilities an attacker has to work through.
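The arithmetic behind the "make it longer" advice is simple exponential growth of the search space; here is a small sketch with approximate character-set sizes.

```python
# Rough search-space arithmetic behind the "make it longer" advice: each added
# character multiplies the possibilities. Assumes characters are chosen randomly,
# which a real phrase isn't, so these are optimistic upper bounds.

import math

def search_space_bits(alphabet_size, length):
    """Bits of entropy for a random password over the given alphabet and length."""
    return length * math.log2(alphabet_size)

print(f"8 chars, lowercase only:       {search_space_bits(26, 8):.0f} bits")
print(f"8 chars, mixed case + digits:  {search_space_bits(62, 8):.0f} bits")
print(f"20-char phrase with symbols:   {search_space_bits(70, 20):.0f} bits")
```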

Zuckerberg says that people don't care about privacy anymore, so don't blame Facebook – but to a certain extent, Facebook is responsible for changing those norms. Do companies like Google have any responsibility to protect your information? Hayden's students think that because Google gives them things for free, they don't care about the privacy of their information, and in fact expect that Google will use it for whatever it wants.
