Moving Up to the Cloud, a panel lecture hosted by the VCU Libraries

[Photo: “Sky symphony” by Kevin Dooley]

“Educational Utility Computing: Perspectives on .edu and the Cloud”
Mark Ryland, Chief Solutions Architect at Amazon Web Services

AWS has been a part of revolutionizing the start-up industry (e.g. Instagram, Pinterest) because start-ups no longer bear the cost of building server infrastructure in-house. Cloud computing in the AWS sense is utility computing: pay for what you use, scale up and down easily, and keep local control of how your products work. In the traditional world, you have to pay for enough capacity to meet your peak demand, but in the cloud computing world, you can scale capacity up and down based on what is needed at that moment.
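To make the peak-versus-average point concrete, here is a back-of-the-envelope sketch with entirely made-up numbers (these are not figures from the talk); the ratio between the two totals, not the dollar amounts, is the point.

```typescript
// Back-of-the-envelope illustration of "pay for peak" vs. "pay for what you use".
// All numbers are hypothetical; only the comparison matters.
const hoursPerMonth = 730;
const peakServers = 20;          // capacity sized for the busiest hour of the month
const averageServers = 4;        // what the workload actually needs most of the time
const costPerServerHour = 0.10;  // assumed on-demand rate, in dollars

const provisionForPeak = peakServers * hoursPerMonth * costPerServerHour;
const payForUse = averageServers * hoursPerMonth * costPerServerHour;

console.log(`Provisioned for peak: $${provisionForPeak.toFixed(2)} per month`);
console.log(`Pay for what you use: $${payForUse.toFixed(2)} per month (grows only when demand does)`);
```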

Cloud providers gain economies and efficiencies of scale in many ways. Some are obvious: the storage, computing, and networking equipment supply chain; internet connectivity and electric power; and data center siting, redundancy, etc. Less obvious: security and compliance best practices, and internal data center innovations in networking, power, etc.

AWS and .EDU: EdX, Coursera, Texas Digital Library, Berkeley AMP Lab, Harvard Medical, University of Phoenix, and an increasing number of university/school public-facing websites.

Ryland expects that we are heading toward cloud computing utilities that function much like the electric grid: just plug in and use it.


“Libraries in Transition”
Marshall Breeding, library systems expert

We’ve already seen the shift from print to electronic in academic journals, and we’re heading that way with books. Our users’ expectations of how they interact with libraries are changing, and the library as a space is evolving to meet that, along with library systems.

Web-based computing is better than client/server computing. We expect social computing to be integrated into the core infrastructure of a service, rather than add-ons and afterthoughts. Systems need to be flexible for all kinds of devices, not just particular types of desktops. Metadata needs to evolve from record-by-record creation to bulk management wherever possible. MARC is going to die, and die soon.

How are we going to help our researchers manage data? We need the infrastructure to help us with that as well. Semantic web — what systems will support it?

Cooperation and consolidation of library consortia; state-wide implementations of SaaS library systems. Our current legacy ILSs are holding libraries back from moving forward and providing the services our users want and need.

A true cloud computing system has web-based interfaces, is externally hosted, offers subscription or utility pricing, uses a highly abstracted computing model, is provisioned on demand, scales according to variable needs, and is elastic.


“Moving Up to the Cloud”
Mark Triest, President of Ex Libris North America

Currently, libraries are working with several different systems (ILS, ERMS, DRs, etc.), duplicating data and workflows, and not always very accurately or efficiently; until now, that has been the only way to handle different kinds of data and needs. Ex Libris set out in 2007 to change this, beginning with conversations with librarians. Their solution is a single system with unified data and workflows.

They are working to lower the total cost of ownership by reducing IT needs, minimizing administration time, and adding new services to increase productivity. Right now, 120+ institutions worldwide have gone live with Alma or are in the process of doing so.

Automated workflows allow staff to focus on the exceptions and reduce the steps involved.

Descriptive analytics are built into the system, with plans for predictive analytics to be incorporated in the future.

Future: collaborative collection development tools, like joint licensing and consortial ebook programs; infrastructure for ad-hoc collaboration


“Cloud Computing and Academic Libraries: Promise and Risk”
John Ulmschneider, Dean of Libraries at VCU

When they first looked at Alma, they had two motivations and two concerns. They were not planning or thinking about it until they were approached to join the early adopters. All academic libraries today are seeking to discover and exploit new efficiencies. The growth of cloud-resident systems and data requires academic libraries to reinvigorate their focus on core mission. Cloud-resident systems are creating massive change throughout our institutions, and managing and exploiting pervasive change is a serious challenge. We also need to deal with the security and durability of data.

Cloud solutions shift resources from supporting infrastructure to supporting innovation.

Efficiencies are not just nice things; they are absolutely necessary for academic libraries. We are obligated to upend long-held practice if, in doing so, we gain assets for practice essential to our mission. We must focus recovered assets on the core library mission.

Agility is the new stability.

Libraries must push technology forward in areas that advance their core mission. Infuse technology evolution for libraries with the values and needs of libraries. Libraries must invest assets as developers, development partners, and early adopters. Insist on discovery and management tools that are agnostic regarding data sources.

Managing the change process is daunting... but we’re already well down the road. It’s not entirely new, but it does involve a change in culture to create a pervasive institutional agility for all staff.

NASIG 2012: Managing E-Publishing — Perfect Harmony for Serialists

Presenters: Char Simser (Kansas State University) & Wendy Robertson (University of Iowa)

Iowa looks at e-publishing as an extension of the central mission of the library. This covers not only text, but also multimedia content. After many years of ad-hoc work, they formed a department to be more comprehensive and intentional.

Kansas really didn’t do much with this until they had a strategic plan that included establishing an open access press (New Prairie Press). This also involved reorganizing personnel to create a new department to manage the process, which includes the institutional repository. The press includes not only their own publications, but also hosts publications from a few other sources.

Iowa went with BEPress’ Digital Commons to provide both the repository and the journal hosting. Part of why they went this route for their journals was that they already had it for their repository, and they approach it more as a hosting platform than as a press/publisher. This means they did not need to add staff to support it, although they did add responsibilities to existing staff on top of their other work.

Kansas is using Open Journal Systems hosted on a commercial server due to internal politics that prevented it from being hosted on the university server. All of their publications are Gold OA, and the university/library is paying all of the costs (~$1700/year, not including the .6 FTE staff hours).

Day in the life of New Prairie Press — most of the routine stuff at Kansas involves processing DOI information for articles and works-cited, and working with DOAJ for article metadata. The rest is less routine, usually involving journal setups, training, consultation, meetings, documentation, troubleshooting, etc.

The admin back-end of OJS allows Char to view the site as if she were different types of users (editor, author, etc.) so she can troubleshoot issues for them. Rather than maintaining a test site, they have a “hidden” journal on the live site that they use to test functions.

A big part of her daily work is submitting DOIs to CrossRef and going through the backfile of previously published content to identify and add DOIs to the works-cited. The process is very manual, and the error rate is high enough that automation would be challenging.
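As an illustration of what a partially automated lookup might look like (not the presenters’ actual workflow), here is a minimal sketch that queries the public CrossRef REST API for a candidate DOI matching a works-cited entry; a person still has to confirm each match, which is why the process stays largely manual.

```typescript
// Minimal sketch: ask the CrossRef REST API for the best bibliographic match
// to a citation string and return its DOI, if any. Matches still need human review.
async function findCandidateDoi(citation: string): Promise<string | undefined> {
  const url =
    'https://api.crossref.org/works?rows=1&query.bibliographic=' +
    encodeURIComponent(citation);
  const response = await fetch(url);
  if (!response.ok) return undefined;
  const data = await response.json();
  const best = data.message?.items?.[0];
  return best?.DOI; // e.g. "10.1234/example" -- or undefined if nothing came back
}

// Hypothetical usage with a made-up works-cited entry:
findCandidateDoi('Smith, J. (2005). Serials workflows in transition. Journal of Library Administration.')
  .then((doi) => console.log(doi ?? 'no candidate found'));
```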

Iowa does have some subscription-based titles, so part of the management involves keeping up with a subscriber list and IP addresses. All of the titles eventually fall into open access.

Most of the work at Iowa has been with retrospective content — taking past print publications and digitizing them. They are also concerned with making sure the content follows current standards that are used by both library systems and Google Scholar.

There is more. I couldn’t take notes and keep time towards the end.

ER&L 2010: ERMS Success – Harvard’s experience implementing and using an ERM system

Speaker: Abigail Bordeaux

Harvard has over 70 libraries, and they are very decentralized. This implementation is for the central office that provides library systems services for all of the libraries. Ex Libris is their primary vendor for library systems, including the ERMS, Verde. They try to go with vended products and only develop in-house solutions if nothing else is available.

Success was defined as migrating data from the old system to the new one, improving workflows and efficiency, providing more transparency for users, and working around any problems they encountered. They did not expect to have an ideal system; there were bugs in both the system and their local data. There is no magic bullet. They identified the high-priority areas and worked towards their goals.

Phase I involved a lot of project planning with clearly defined goals/tasks and assessment of the results. The team included the primary users of the system, the project manager (Bordeaux), and a programmer. A key part of planning includes scoping the project (Bordeaux provided a handout of the questions they considered in this process). They had a very detailed project plan using Microsoft Project, and at the very least, the listing out of the details made the interdependencies more clear.

The next stage of the project involved data review and clean-up. Bordeaux thinks that data clean-up is essential for any ERM implementation or migration. They also had to think about the ways the old ERM was used and if that is desirable for the new system.

The local system they created was very close to the DLF recommended fields, but even so, they still had several failed attempts to map the fields between the two systems. As a result, they had a cycle of extracting a small set of records, loading them into Verde, reviewing the data, and then deleting the test records out of Verde. They did this several times with small data sets (10 or so records), and when they were comfortable with that, they increased the number of records.
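A purely illustrative sketch of that extract-load-review-delete cycle is below; every helper function is a hypothetical stand-in, not real Harvard or Ex Libris tooling.

```typescript
// Hypothetical helpers standing in for the real migration steps.
declare function extractRecords(count: number): Promise<object[]>;    // sample from the legacy ERM
declare function loadIntoVerde(records: object[]): Promise<void>;     // apply the field mapping and load
declare function reviewLoad(records: object[]): Promise<string[]>;    // fields that mapped incorrectly
declare function deleteTestRecords(records: object[]): Promise<void>; // clean Verde before the next pass

async function testMigrationCycle(batchSizes: number[]): Promise<void> {
  for (const size of batchSizes) {
    const records = await extractRecords(size);
    await loadIntoVerde(records);
    const problems = await reviewLoad(records);
    await deleteTestRecords(records);
    if (problems.length > 0) {
      console.log(`Fix the mapping for ${problems.length} field(s) before scaling up`);
      return; // adjust the mapping, then start the cycle again
    }
  }
  console.log('Mapping looks clean; ready to migrate larger batches');
}

// Start with ~10 records and grow the batch only after a clean pass.
testMigrationCycle([10, 50, 250]);
```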

They also did a lot of manual data entry. They were able to transfer a lot, but they couldn’t do everything, and some bits of data were not migrated because of the work involved compared to the value of the data. In some cases, though, they did want to keep the data, so they entered it manually. To help visualize the mapping process, they created screenshots with notes that showed the field connections.

Prior to this project, they were not using Aleph to manage acquisitions. So, they created order records for the resources they wanted to track. The acquisitions workflow had to be reorganized from the ground up. Oddly enough, by having everything paid out of one system, the individual libraries have much more flexibility in spending and reporting. However, it took some public relations work to get the libraries to see the benefits.

As a result of looking at the data in this project, they got a better idea of gaps and other projects regarding their resources.

Phase two began this past fall, incorporating the data from the libraries that did not participate in phase one. They now have a small group with folks representing those libraries. This group is coming up with best practices for license agreements and for entering data into the fields.

NASIG 2008: Next Generation Library Automation – Its Impact on the Serials Community

Speaker: Marshall Breeding

Check & update your library’s record on lib-web-cats — Breeding uses this data to track the ILS and ERMS systems used by libraries world-wide.

The automation industry is consolidating, with several library products dropped or no longer supported. External financial investors are increasingly controlling the direction of the industry. And, the OPAC sucks. Libraries and users are continually frustrated with the products they are forced to use and are turning to open source solutions.

The innovation presented by automation companies falls below the expectations of libraries (not so sure about users). Conventional ILS need to be updated to incorporate the modern blend of digital and print collections.

We need to be more thoughtful in our incorporation of social tools into traditional library systems and infrastructures. Integrate those Web 2.0 tools into existing delivery options. The next generation of automation tools should have collaborative features built into them.

Open source software isn’t free — it’s just a different model (pay for maintenance and setup v. pay for software). We need more robust open source software for libraries. Alternatively, systems need to open up so that data can be moved in and out easily. Systems need APIs that allow local coders to enhance systems to meet the needs of local users. Open source ERMS knowledge bases haven’t been seriously developed, although there is a need.

The drive towards open source solutions has often been motivated by disillusionment with current vendors. However, we need to be cautious, since open source isn’t necessarily the golden key that will unlock the door to paradise. (For example, Koha still needs to add serials and acquisitions modules, as well as EDI capabilities.)

The open source movement motivates the vendors to make their systems more open for us. This is a good thing. In the end, we’ll have a better set of options.

Open Source ILS options: Koha (commercial support from LibLime), used mostly by small to medium libraries; Evergreen (commercial support from Equinox Software), tested and proven for small to medium libraries in a consortial setting; and OPALS (commercial support from Media Flex), used mostly by K-12 schools.

In making the case for open source ILS, you need to compare the total cost of ownership, the features and functionality, and the technology platform and conceptual models. Are they next-generation systems or open source versions of legacy models?

Evaluate your RFPs for new systems. Are you asking for the things you really need or are you stuck in a rut of requiring technology that was developed in the 70s and may no longer be relevant?

Current open source ILS products lack serials and acquisitions modules. The initial wave of open source ILS commitments happened in the public library arena, but the recent activity has been in academic libraries (the WALDO consortium going from Voyager to Koha, the University of Prince Edward Island going from Unicorn to Evergreen in about a month). Do the current open source ILS products provide a new model of automation, or an open source version of what we already have?

Looking forward to the day when there is a standard XML format for all ILSs that will allow libraries to manipulate their data in any way they need to.

We are working towards a new model of library automation where monolithic legacy architectures are replaced by the fabric of service oriented architecture applications with comprehensive management.

The traditional ILS is diminishing in importance in libraries. Electronic content management is being done outside of core ILS functions. Library systems are becoming less integrated because the traditional ILS isn’t keeping up with our needs, so we find work-around products. Non-integrated automation is not sustainable.

ERMS — isn’t this what the acquisitions module is supposed to do? Instead of enhancing that to incorporate the needs of electronic resources, we had to get another module or work-around that may or may not be integrated with the rest of the ILS.

We are moving beyond metadata searching to searching the actual items themselves. Users want to be able to search across all products and packages. NextGen federated searching will harvest and index subscribed content so that it can be searched and retrieved more quickly and seamlessly.

Opportunities for serials specialists:

  • Be aware of the current trends
  • Be prepared for accelerated change cycles
  • Help build systems based on modern business process automation principles. What is your ideal serials system?
  • Provide input
  • Ensure that new systems provide better support than legacy systems
  • Help drive current vendors towards open systems

How will we deliver serials content through discovery layers?

Reference:

  • “It’s Time to Break the Mold of the Original ILS,” Computers in Libraries, Nov/Dec 2007.

CiL 2008: Catalog Effectiveness

Speaker: Rebekah Kilzer

The Ohio State University Libraries have used Google Analytics for assessing the use of the OPAC. It’s free for sites with up to five million page views per month; OSU has 1-2 million page views per month. Libraries would want to use this because most integrated library systems offer little in the way of use statistics, and what they do have isn’t very… useful. You will need to add some tracking code that loads on all OPAC pages.
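For illustration, here is roughly what that tracking code looks like with Google’s current gtag.js library (OSU’s 2008 implementation would have used the older ga.js/urchin snippet); the G-XXXXXXX measurement ID is a placeholder, and the gtag.js loader script is assumed to be in the OPAC page template.

```typescript
// Minimal sketch of a per-page Google Analytics hit for an OPAC template.
// Assumes the standard loader is already in the page <head>:
//   <script async src="https://www.googletagmanager.com/gtag/js?id=G-XXXXXXX"></script>
// "G-XXXXXXX" is a placeholder, not a real measurement ID.
declare function gtag(...args: unknown[]): void;

gtag('js', new Date());
gtag('config', 'G-XXXXXXX', {
  // Record the catalog URL actually viewed so search pages and record pages can be told apart.
  page_path: window.location.pathname + window.location.search,
});
```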

Getting details about how users interact with your catalog can help with making decisions about enhancements. For example, knowing how many dial-up users interact with the site could determine whether or not you want to develop style sheets specifically for them. You can also track which links are being followed, which can contribute to discussions on link real estate.

There are several libraries that are mashing up Google Analytics information with other Google tools.


Speakers: Cathy Weng and Jia Mi

The OPAC is a data-centered, card-catalog retrieval system that is good for finding known items, but not so good as an information discovery tool. It’s designed for librarians, not users. Librarians’ perceptions of users (forgetful, impatient) prevent them from recognizing changes in user behavior and ineffective OPAC design.

In order to see how current academic libraries represent and use OPAC systems, they studied 123 ARL libraries’ public interfaces and search capabilities as well as their bibliographic displays. In the search options, two-thirds of libraries use “keyword” as the default and the other third use “title.” The study also looked at whether or not the keyword search was a true keyword search with an automatic “and” or if the search was treated as a phrase. Few libraries used relevancy ranking as the default search results sorting.

There are some great disparities in OPAC quality. Search terms and search boxes are not retained on the results page, post-search limit functions are not always readily available, item status is not shown on the search results page, and the search keywords are not highlighted. These are things that the most popular non-library search engines do, which is what our users expect the library OPAC to do.

Display labels are mapped straight from MARC and are not intuitive. Some labels are suitable for certain types of materials but not all (e.g., proper name labels for items that are “authored” by conferences). They are potentially confusing (LCSH & MeSH) and occasionally inaccurate. The study found varying levels of effort put into making the labels more user-friendly and free of library jargon.

In addition to label displays, OPACs also suffer from the way records are displayed. The order of bibliographic elements affects how users find relevant information to determine whether or not the item found is what they need.

Three factors contribute to the problem of the OPAC: system limitations, libraries not exploiting the full functionality of the ILS, and MARC standards that are not well suited to online bibliographic display. We want a system that doesn’t need to be taught and that trusts users as co-developers, and we want to maximize and creatively utilize the system’s functionality.

The presentation gave great examples of why the OPAC sucks, but few concrete examples of solutions beyond the lipstick-on-a-pig catalog overlay products. I would have liked to have a list of suggestions for label names, record display, etc., since we were given examples of what doesn’t work or is confusing.

CiL 2008: Woepac to Wowpac

Moderator: Karen G. Schneider – “You’re going to go un-suck your OPACs, right?”


Speaker: Roy Tennant

Tennant spent the last ten years trying to kill off the term OPAC.

The ILS is your back-end system, which is different from the discovery system (which doesn’t replace the ILS). Both of these systems can be locally configured or hosted elsewhere. WorldCat Local is a particular kind of discovery system that Tennant will talk about if he has time.

Traditionally, users would search the ILS to locate items, but now the discovery system searches the ILS and other sources and presents the results to the user in a less “card catalog” way. Things to consider: Do you want to replace your ILS or just your public interface? Can you consider open source options (Koha, Evergreen, VuFind, LibraryFind, etc.)? Do you have the technical expertise to set it up and maintain it? Are you willing to regularly harvest data from your catalog to power a separate user interface?


Speaker: Kate Sheehan

Speaking from her experience of being at the first library to implement LibraryThing for Libraries.

The OPAC sucks, so we look for something else, like LibraryThing. The users of LibraryThing want to be catalogers, which Sheehan finds amusing (and so did the audience) because so few librarians want to be catalogers. “It’s a bunch of really excited curators.”

LibraryThing for Libraries takes the information available in LibraryThing (images, tags, etc.) and drops it into the OPAC (platform independent). The display includes other editions of books owned by the library, recommendations based on what people actually read, and a tag cloud. The tag cloud links to a tag browser that opens up on top of the catalog and allows users to explore other resources in the catalog based on natural-language tags rather than just subject headings. Using a Greasemonkey script in your browser, you can also incorporate user reviews pulled from LibraryThing. Statistics show that the library is averaging around 30 tag clicks and 18 recommendations per day, which is pretty good for a library that size.

“Arson is fantastic. It keeps your libraries fresh.” — Sheehan joking about an unusual form of collection weeding (Danbury was burnt to the ground a few years ago)

Data doesn’t grow on trees. Getting a bunch of useful information dropped into the catalog saves staff time and energy. LibraryThing for Libraries didn’t ask for a lot from patrons, and it gave them a lot in return.


Speaker: Cindi Trainor

Are we there yet? No. We can buy products or use open source programs, but they still are not the solution.

Today’s websites consist of content, community (interaction with other users), interactivity (single-user customization), and interoperability (mashups). RSS feeds are the intersection of interactivity and content. A few websites sit in the sweet spot in the middle of all of these: Amazon (26/32)*, Flickr (26/32), Pandora (20/32), and Wikipedia (21/32) are examples.

Where are the next generation catalog enhancements? Each product has a varying degree of each element. Using a scoring system with 8 points for each of the four elements, these products were ranked: Encore (10/32), LibraryFind (12/32), Scriblio (14/32), and WorldCat Local (16/32). Trainor looked at whether the content lived in the system or elsewhere and the degree to which it pulled information from sources not in the catalog. Library products still have a long way to go – Voyager scored a 2/32.

*Trainor’s scoring system as described in paragraph three.


Speaker: John Blyberg

When we talk about OPACs, we tend to fetishize them. In theory, it’s not hard to create a Wowpac. The difficulty is in creating the system that lives behind it. We have lost touch with the ability to empower ourselves to fix the problems we have with integrated library systems and our online public access catalogs.

The OPAC is a reflection of the health of the system. The OPAC should be spilling out onto our website and beyond, mashing it up with other sites. The only way that can happen is with a rich API, which we don’t have.

The title of systems librarian is becoming redundant because we all have a responsibility and role in maintaining the health of library systems. In today’s information ecology, there is no destination — we’re online experiencing information everywhere.

There is no way to predict how the information ecology will change, so we need systems that will be flexible and can grow and change over time. (SOPAC 2.0 will be released later this year for libraries that want to do something different with their OPACs.) Containers will fail. Containers are temporary. We cannot hang our hat on one specific format; we need systems that permit portability of data.

Nobody in libraries talks about “the enterprise” like they do in the corporate world. Design and development of the enterprise cannot be done by a committee, unless they are simply advisors.

The 21st century library remains un-designed – so let’s get going on it.
