#11

I’m Off Then: Losing and Finding Myself on the Camino de Santiago by Hape Kerkeling

I’ll bet you thought I forgot about this whole 50 books thing. No, it’s just that once again, my intentions are much more noble than reality. I have also been rather poor at reporting on the books I’ve read this year, but most of the time, I assume I’m the only one who really cares about all this, anyway.

#11 is I’m Off Then: Losing and Finding Myself on the Camino de Santiago by Hape Kerkeling, translated by Shelley Frisch. This one landed on my doorstep the other week as the latest in a slow trickle of review books coming in from Library Journal. (You can search for my recent reviews, if you’re so inclined.) A little uncertain about it at first, I quickly found myself lost in the story and read it cover to cover in one sitting.

Kerkeling is a German comedy performer of some renown. Not being up on my European comedians (aside from nearly memorizing all of Eddie Izzard’s routines on YouTube), I hadn’t heard of the fellow before this book. I tried to track down a recording of one of his performances in English or with subtitles, but eventually gave up; considering that my German linguistic skills are virtually nil, it’s no surprise he was new to me. (If you are interested, Amazon has a short interview with him in English.)

The book is essentially the diary he wrote while hiking the Camino de Santiago in 2001. It’s not strictly a record of the events and people of the pilgrimage; the stories he tells about his background and prior experiences add weight to what happens to him on the trail. By the end of the story, I felt as though Kerkeling was a long-lost friend with whom I had recently reunited over a cup of coffee. In many ways, this book reminded me of Kelly Winters’ Walking Home, and that is a good thing.

Pandora Town Hall (Richmond, VA)

Open question/answer forum with Tim Westergren, the founder of the Music Genome Project and Pandora Internet Radio.

June 29, 2009
approx 100 attending
free t-shirts! free burritos from Chipotle!

Tim Westergren, founder of Pandora

His original plan was to get in a car and drive across the country to find local music to add to Pandora, but it wasn’t quite as romantic as he thought it would be. On the way home, he planned a meetup on the fly using the Pandora blog, and since then, whenever he visits a new city, he organizes get-togethers like this one.

Tim is a Stanford graduate and a musician, although he didn’t study music formally. He spent most of his 20s playing in bands and touring around the country, but not as a huge commercial success. It’s hard to get on the radio, and radio is the key to professional longevity. Eventually, he shifted to film score composition, which required him to analyze music and break it down into components that represent what is happening on the screen. This generated the idea of a musical genome.

The Music Genome Project was launched in 2000 with some seed money that lasted about a year. Eventually, they ran out of money and couldn’t pay their 45 employees. They tried several different ways to raise money, but nothing worked until some venture investors put money into it in 2004. At that point, they took the genome and repurposed it into an internet radio service (Pandora), which launched in 2005.

They have never advertised — it has all been word of mouth. They now add about 65,000 new listeners per day! They can see profitability on the horizon. Pandora is mainly advertising supported. The Amazon commissions provide a little income, but not as much as you might think they would.

There are about 75,000 artists on the site, and about 70% are not on a major label. The song selection is not based on popularity, like most radio, but rather on the elements of the songs and how they relate to what the user has selected.

Playlists are initially built from the musical proximity of the seed song or artist, and are then refined as the user thumbs songs up or down. Your thumbs up and down affect only the station you are listening to, changing whatever the rest of that playlist was going to be. They use the overall audience feedback to adjust across the site, but that is not as immediate or personalized.
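
(Pandora’s actual algorithm isn’t public, so this is just my own illustrative sketch of the general idea: score candidates by proximity to the station seed, then nudge by that station’s thumbs. All of the names, traits, and weights below are made up.)

```python
# Illustrative sketch only -- not Pandora's actual algorithm.
# Each song carries genome "traits" scored 0.0-1.0; thumbs are per-station,
# per-artist feedback (+1 for each thumb up, -1 for each thumb down).

def score(candidate, seed, thumbs, feedback_weight=0.3):
    """Rank a candidate by similarity to the station seed, adjusted by thumbs."""
    shared = set(candidate["traits"]) & set(seed["traits"])
    similarity = sum(
        1 - abs(candidate["traits"][t] - seed["traits"][t]) for t in shared
    ) / max(len(shared), 1)
    # Thumbs affect only this station's playlist, not the whole site.
    return similarity + feedback_weight * thumbs.get(candidate["artist"], 0)

seed = {"traits": {"tempo": 0.7, "minor_key": 0.2, "vocal_grit": 0.8}}
candidates = [
    {"title": "Song A", "artist": "Band X",
     "traits": {"tempo": 0.6, "minor_key": 0.3, "vocal_grit": 0.9}},
    {"title": "Song B", "artist": "Band Y",
     "traits": {"tempo": 0.2, "minor_key": 0.9, "vocal_grit": 0.1}},
]
station_thumbs = {"Band Y": -1}  # this listener thumbed down Band Y on this station
playlist = sorted(candidates, key=lambda s: score(s, seed, station_thumbs), reverse=True)
print([s["title"] for s in playlist])  # ['Song A', 'Song B']
```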

They have had some trouble with royalties. They pay both publishing and performer royalties per song. They operate under the DMCA, including its royalty structure. Every five years, a committee determines what the rates will be going forward. In July 2007, the committee decided to triple the rates and made the increase retroactive. It essentially bankrupted the company.

Pandora called upon its listeners to help by contacting their congressional representatives to voice opposition to the decision. Congress received 400,000 faxes in three days, overwhelming the fax infrastructure on the Hill for a week! Their phones were ringing all day long! Eventually, Congress contacted Pandora to make it stop. They are now finishing up what needs to be done to bring the royalty rates back to something more reasonable. (Virtually all the staffers on Capitol Hill are Pandora users, which made it easy to get appointments with members of Congress.)

Music comes to Pandora from a variety of sources. They get a pile of physical and virtual submissions from artists. They also pay attention to searches that don’t result in anything in their catalog, as well as explicit suggestions from listeners.

They have a plan to offer musicians incentives to participate. For example, if someone thumbs up a song, a pop-up could suggest checking out a similar (or the same) band that is playing locally. Most of the room would opt into emails letting them know when bands they like are coming to town. Musicians could see which songs are being thumbed up or down and where the listeners are located.

Listener suggestion: on the similar artists pages, provide more immediate sampling of recommendations.

What is the cataloging backlog? It takes about 8-10 weeks, and only about 30% of what is submitted makes it in. They select based on quality: for what a song is trying to do, does it do it well? They know when they’ve made a wrong decision if they don’t include something and a bunch of people search for it.

Pandora is not legal outside of the US, but many international users fake US zip codes. However, in order to avoid lawsuits, they started blocking by IP. As soon as they implemented IP blocking, they received a flood of messages, including one from a town that would have “Pandora night” at a local club. (The Department of Defense called up and asked them to block military IP ranges because Pandora was hogging the bandwidth!)

Why are some songs quieter than others? Tell them. They should be correcting for that.

The music genome is used by a lot of scorers and concert promoters to find artists and songs that are similar to the ones they want.

Could the users be allowed more granular ratings rather than thumbing up or down whole songs? About a third of the room would be interested in that.

Mobile device users are seeing fewer advertisements, and one listener is concerned that this will impact revenue. Between the iPhone, the Blackberry, and the Palm Pre, they have about 45,000 listeners on mobile devices. This is important to them, because these devices will be how Pandora gets into listeners’ cars. And, in actuality, mobile listeners interact with advertisements four times as much as web listeners.

Tim thinks that eventually Pandora will host local radio. I’m not so sure how that would work.

Subscription Pandora is 192kbps, which sounds pretty good (and it comes with a desktop application). It’s not likely to get to audiophile level until the pipes are big enough to handle the bandwidth.

Variety and repetition are the areas where they get the most feedback from listeners. The best way to get variety is to add different artists. If you thumb down an artist three times, they should be removed from the station.

They stream about 1/3 of the data that YouTube streams daily, with around 100 servers. Tim is not intimately familiar with the tech that goes into making Pandora work.

[The questions kept coming, but I couldn’t stay any longer, unfortunately. If you have a chance to attend a Pandora Town Hall, do it!]

Kindle 2 is kind of cool, actually

My library (as in, the library where I work) has the good fortune of being blessed with both funds and leadership that allow us to experiment with some emerging technologies. When Amazon released the first version of the Kindle, we purchased one to experiment with. It was simply the latest in a long history of ebook readers that we had hoped to be able to incorporate into the library’s function on campus.

I took a turn at using the Kindle, and I was mightily unimpressed. The interface seemed very clunky, to the point of preventing me from getting into the book I tried to read. When the Kindle 2 was released and we received permission to purchase one, I was skeptical that it would be any better, but I still signed up for my turn at using it.

Last week, I was given the Kindle 2, and since it already had a book on it that I was half-way through reading, I figured I would start there. However, I was not highly motivated to make the time for it. Yesterday afternoon, I took the train up to DC, returning this morning. Four hours round trip, plus the extra time spent waiting at each station, gave me plenty of time to finish my book, so I brought the Kindle 2 with me.

I’m not going to gush about how I fell in love with the device, because I didn’t. However, I finished the book with ease before I arrived in DC, and out of sheer boredom I pulled down a copy of another book that had already been purchased on our library account. I was pleasantly surprised by how easy it was to go from one book to another without having to lug along several selections from my library “just in case” I ran out of something to read.

Right now, I’m at least a third of the way in on the second book, and I plan to finish reading it on the Kindle 2.

I don’t think I’ll end up buying one anytime soon, particularly since I’ve put a stop to buying new books until I’ve read more of the ones I own. However, I have a better understanding of those Kindle enthusiasts who rave about having their entire library (and more) at their fingertips. It’s pretty handy if you’re someone who often has time to kill away from your library.

where I spend my time online

While I was at the reference desk this quiet afternoon, I attempted to catch up on scanning through Lifehacker. Their article about the Geek Chart app caught my eye. Microblogging, or at the very least in-the-moment stream-of-consciousness sharing, has taken over a good portion of my online presence, leaving this venue for slightly more substantial (and infrequent) commentary. So, I decided to fill out the details needed to build my Geek Chart.


Anna’s Geek Chart

Looks like those of you who want a more regular dose of Anna will need to be following my Twitter and Flickr feeds (with some Delicious thrown in). For the rest of you, enjoy the lighter load on your feed reader.

NASIG 2009: What Color Is Your Paratext?

Presenter: Geoffrey Bilder, CrossRef

The title is a reference to a book geared towards people preparing to look for a new job or change careers, which is relevant to what the serials world is facing in terms of both personnel and content. Paratext is added content that prepares the audience/reader for the meat of the document. We are very good at controlling and evaluating credibility, which is important when conveying information via paratext.

The internet is fraught with false information, which undermines credibility. The publisher’s value is being questioned because so much of their work can be done online at little or no cost, and what can’t be done cheaply is being questioned as well. Branding is increasingly hidden by layers like Google, which provide content without indicating the source. The librarian’s problem is similar to the publisher’s: our value is being questioned when the digital world is capable of managing some of our work through distributed organizational structures.

“Internet Trust Anti-Pattern” — a system starts out as being a self-selected core of users with an understanding of trust, but as it grows, that can break down unless there is a structure or pervasive culture that maintains the trust and authority.

Local trust is that which is achieved through personal acquaintance and is sometimes transitive. Global trust extends through proxy, which transitively extends trust to “strangers.” Local is limited and hard to expand, and global increases systemic risk.

Horizontal trust occurs among equals with little possibility of coercion. Vertical trust occurs within a hierarchy, and coercion can be used to enforce behavior, which could lead to abuse.

Internet trust is in the local and horizontal quadrant. Scholarly trust falls in the vertical and global quadrant. It’s no wonder we’re having trouble figuring out how to do scholarship online!

Researchers have more to read and less time to read it, and the amount is increasing rapidly. We need to remember that authors and readers are the same people. The amazing ways that technology has opened up communication are also causing the overload. We need something to help identify credible information.

Dorothea Salo has written that, for people who place so much credibility in authoritative information, we don’t do a very good job of identifying it. She blames librarians, but publishers have a responsibility, too. Heuristics are important for knowing who the intended audience is.

If you find a book at a bargain store, the implication is that it is going to be substantially less authoritative than a book from a grand, old library. (There are commercial entities selling leather bound books by the yard for buyers to use to add gravitas to their offices and personal libraries.) Scholarly journals are dull and magazines are flashy & bright. Books are traditionally organized with all sorts of content that tells academics whether or not they need to read them (table of contents, index, blurbs, preface, bibliography, etc.).

If you were to black out the text of a scholarly document, you would still be able to identify the parts displayed. You can’t do that very well with a webpage.

When we evaluate online content, we look at things like the structure of the URL and where it is linked from. In the print world, citations and footnotes were essential clues to following conversations between scholars. Linking can do that now, but the convention is still more formal. Logos can also tell us whether or not to put trust in content.

Back in the day, authors were linked to printers, but that led to credibility problems, so publishers stepped in. Authors and readers could trust that the content was accurate and properly presented. Now it’s not just publishers: titles have become brands. A journal’s reputation is almost more important than who is publishing it.

How do we help people learn and understand the heuristics for identifying scholarly information? The processes for putting out credible information are partially hidden; the reader or librarian doesn’t know or see the steps involved. We used to not want to know, but now we do, particularly since it allows us to differentiate between the good players and the bad players.

The idea of the final version of a document needs to be buried. Even in the print world (with errata and addenda) we were deluding ourselves in thinking that any document was truly finished.

Why don’t we have a peer reviewed logo? Why don’t we have something that assures the reader that the document is credible? Peer review isn’t necessarily perfect or the only way.

How about a Version of Record record? Show us what was done to a document to get it to where it is now. For example, look at Creative Commons. They have a logo that indicates something about the process of creating the document, and the logo leads to machine-readable coding. How about a CrossMark that indicates what a publisher has done with a document, much like what a CC logo leads to?
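
(For illustration only: this is not the real CrossMark or Creative Commons metadata format, just a hypothetical sketch of what a machine-readable “what was done to this document” record might contain.)

```python
# Purely hypothetical provenance record -- not the real CrossMark or CC schema.
version_of_record = {
    "doi": "10.9999/example.12345",   # made-up identifier
    "peer_reviewed": True,
    "review_type": "double-blind",
    "copyedited": True,
    "corrections": ["2009-05-01: figure 2 replaced"],
    "current_version": "1.1",
}
```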

Knowmore.org created a Firefox plugin that monitors content and provides icons flagging companies and websites for different reasons. Oncode is a way of identifying organizations that have signed a code of conduct. We could do this for scholarly content.

Tim Berners-Lee is actively advocating for ways to overlay trust measures on the internet. It was originally designed by academics who didn’t need them, but, as in the internet trust anti-pattern, the “unwashed masses” have corrupted that trust.

What can librarians and publishers do to recreate the heuristics that have been effective in print? We are still making facsimiles of print in electronic format. How are we going to create the tools that will help people evaluate digital information?

NASIG 2009: Registration Ruminations

Presenters: Kristina Krusmark and Mary Throumoulos

More than 60% of all content purchased has an electronic component, and that share is continually increasing, which means more and more resources need to be registered.

Last summer, Ebsco commissioned a study to identify challenges in online content purchases. There were about 455 participants, mostly from North America, and they identified registration and activation as the primary issue. The survey found that the process is too complicated: there isn’t a standard model, and the instructions/information are often incomplete. Another challenge the survey identified was a lack of sufficient staffing to properly manage the process. This results in delays in access, or in titles not being registered at all.

If users don’t have access to content, then they won’t use it, even if it has been paid for. When librarians look at usage to make collection development decisions, a lack of or delay in activation could have a huge impact on whether or not to retain the subscription. And, as one audience member noted, after having bad or frustrating experiences with registering for access, librarians might be hesitant to subscribe to online journals that are difficult to “turn on.”

Recently, Throumoulos’s library decided to convert as much as possible to online-only. They canceled print journals that were also available through aggregators like Project Muse, and made decisions about whether to retain print-only titles. Then they began the long process of activating those online subscriptions.

For online-only titles, the license process usually results in access without registration. For print+online titles, the registration process can be more complicated, sometimes involving information from mailing labels, which may or may not be retained during processing.

Agents would like to be able to register on behalf of libraries, and most do so when they are able to. However, many publishers want the customer, not the agent, to register access. When agents can’t register for the customer, they do try to provide as much information about the process (links, instructions, customer numbers, basic license terms, etc.).

Opportunities for improvement: standardization of registration models, greater efficiencies between agents and publishers, and industry initiatives like SERU.

NASIG 2009: Informing Licensing Stakeholders

Towards a More Effective Negotiation

Presenters: Lisa Sibert, Micheline Westfall, Selden Lamoreux, Clint Chamberlain (moderator), Vida Damijonaitis, and Brett Rubinstein

Licensing as a process has not been improving very much. Some publishers are willing to negotiate changes, but some are still resistant. It often takes months to a year to receive fully signed licenses from publishers, which can tie up access or institutional processes. Negotiation time is, of course, a factor, but it should not affect the time it takes for both parties to sign and distribute copies once the language is agreed upon. One panelist noted that larger publishers are often less willing to negotiate than smaller ones. Damijonaitis stated that licenses are touched at fourteen different points in the process on their end, which plays into the length of time.

Publishers are concerned with the way the content is being used and making sure that it is not abused (without consequences). Is it necessary to put copyright violation language in licenses or can it live on purchase orders? Springer has not had any copyright violations that needed to be enforced in the past five or six years. They work with their customers to solve any problems as they come up, and libraries have been quick to deal with the situation. On the library side, some legal departments are not willing to allow libraries to participate in SERU.

Deal breakers: not allowing walk-ins, adjunct faculty, interlibrary loan, governing law, and basic fair use provisions. Usage statistics and uptime guarantees are important and sometimes difficult to negotiate. LibLicense is useful for getting effective language that publishers have agreed to in the past.

It’s not the libraries who tend to be the abusers of license terms or copyright, it’s the users. Libraries are willing to work with publishers, but if the technology has grown to the point where it is too difficult for the library to police use, then some other approach is needed. When we work with publishers that don’t require licenses or use just purchase orders, there is less paperwork, but it also doesn’t indemnify the institution, which is critical in some cases.

Bob Boissy notes that no sales person gets any benefit from long negotiations. They want a sale. They want an invoice. Libraries are interested in getting the content as quickly as possible. I think we all are coming at this with the same desired outcome.

NASIG 2009: ERMS Integration Strategies – Opportunity, Challenge, or Promise?

Speakers: Bob McQuillan (moderator), Karl Maria Fattig, Christine Stamison, and Rebecca Kemp

Many people have an ERM, and some are implementing one, but few (in the room) are at a point they would consider finished. ERMS present new opportunities and challenges with workflow and staffing, and the presenters intend to provide some insight for those in attendance.

At Fattig’s library, their budget for electronic is increasing as print is decreasing, and they are also running out of space for their physical collections. Their institution’s administration is not supportive of increasing space for materials, so they need to start thinking about how to stall or shrink their physical collection. In addition, they have had reductions in technical services staffing. Sound familiar?

At Kemp’s library, she notes that about 40% of her time is spent on access setup and troubleshooting, which is an indication of how much of their resources are allocated to electronic resources. Is it worth it? They know that many of their online resources are heavily used. Consortial “buying clubs” make big deals possible, opening up access to more resources than they could afford on their own. Electronic is a good alternative to adding more volumes to already overloaded shelves.

Stamison (SWETS) notes that they have seen a dramatic shift from print to electronic. At least two-thirds of the subscriptions they handle have an electronic component, and most libraries are going e-only when possible. Libraries tell them that they want their shelf space back. Also, many libraries are going direct to publishers for the big deals, with agents getting involved only for EDI invoicing (cutting into the agents’ income). Agents are now investing in new technologies to assist libraries in managing e-collections, including implementing access.

Kemp’s library had a team of three to implement Innovative’s ERM. It took a change in workflow and incorporating additional tasks with existing positions, but everyone pulled through. Like libraries, Stamison notes that agents have had to change their workflow to handle electronic media, including extensive training. And, as libraries have more people working with all formats of serials, agents now have many different contacts within both libraries and publishers.

Fattig’s library also reorganized some positions. The systems librarian, acquisitions librarian, and serials & electronic resources coordinator all work with the ERMS, pulling from the Serials Solutions knowledge base. They have also contracted with someone in Oregon to manage their EZproxy database and WebBridge coverage load. Fattig notes that it takes a village to maintain an ERMS.

Agents with electronic gateway systems are working to become COUNTER compliant, and are heavily involved with developing SUSHI. Some are also providing services to gather those statistics for libraries.

Fattig comments that usage statistics are serials in themselves. At his library, they maintained a homegrown system for collecting usage statistics from 2000 to 2007, then tried Serials Solutions’ 360 Counter for a year, but now use an ERM/homegrown hybrid. They created their own script to clean up the files, because, as we all know, COUNTER compliance means something different to each publisher. Fattig thinks that database searches are their most important statistic for evaluating platforms. They use their federated search statistics to weight the statistics from those resources (which will be broken out under COUNTER 3 compliance).
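
(Fattig didn’t share his script, but the cleanup step he describes might look something like this hypothetical sketch; the column aliases and input files are invented, since every publisher’s “COUNTER-compliant” report looks a little different.)

```python
# Hypothetical sketch of normalizing per-publisher usage reports into one layout.
# The column aliases and file names below are invented for illustration.
import csv

COLUMN_ALIASES = {
    "Journal": "title", "Journal Title": "title", "Title": "title",
    "Full-Text Requests": "fulltext", "FT Total": "fulltext", "Total": "fulltext",
}

def normalize(path):
    """Read one publisher's CSV report and return rows with canonical column names."""
    rows = []
    with open(path, newline="", encoding="utf-8") as f:
        for raw in csv.DictReader(f):
            row = {COLUMN_ALIASES.get(k.strip()): v for k, v in raw.items()}
            row.pop(None, None)  # drop any columns we don't recognize
            if row.get("title"):
                row["fulltext"] = int(row.get("fulltext") or 0)
                rows.append(row)
    return rows

# Combine several publishers' reports into one list for loading or analysis.
usage = normalize("publisher_a_jr1.csv") + normalize("publisher_b_jr1.csv")
```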

Kemp has not been able to import their use stats into the ERM. One of their staff members goes in every month to download stats, and the rest come from ScholarlyStats. They are learning to make XML files out of their Excel files and hope to use the cost-per-use functionality in the future.
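
(Kemp didn’t describe the exact transformation, but going from a spreadsheet export to an XML file an ERM can load is conceptually simple. A rough sketch, with invented element and column names:)

```python
# Hypothetical sketch: turn a spreadsheet export of usage rows into XML that an
# ERM could load. Element names, column names, and file names are invented.
import csv
import xml.etree.ElementTree as ET

root = ET.Element("usageReport", year="2009")
with open("usage_export.csv", newline="", encoding="utf-8") as f:
    # expects columns: issn, title, fulltext_requests
    for row in csv.DictReader(f):
        record = ET.SubElement(root, "title")
        ET.SubElement(record, "issn").text = row["issn"]
        ET.SubElement(record, "name").text = row["title"]
        ET.SubElement(record, "fullTextRequests").text = row["fulltext_requests"]

ET.ElementTree(root).write("usage_report.xml", encoding="utf-8", xml_declaration=True)
```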

Fattig: “We haven’t gotten SUSHI to work in some of the places it’s supposed to.” Todd Carpenter from NISO notes that SUSHI compliance is a requirement of COUNTER 3.

For the next 12-18 months, Fattig expects that they will complete the creation of license and contact records, import all usage data, and implement SUSHI where they can. They will continue to work with their consortial tool, implement a discovery layer, and document everything. He plans to create a “cancellation ray gun and singalong blog”: a tool that takes criteria and generates suggested cancellation reports.

Like Fattig, Kemp plans to finish loading all of the data about license and contacts, also the coverage data. Looking forward to eliminating a legacy spreadsheet. Then, they hope to import COUNTER stats and run cost/use reports.

Agents are working with ONIX-PL to assist libraries in populating their ERMS with license terms. They are also working with CORE to assist libraries with populating acquisitions data. Stamison notes that agents are working to continue to be liaisons between publishers, libraries, and system vendors.

Dan Tonkery notes that he’s been listening to these conversations for years. No one is serving libraries very well. Libraries are working harder to get these things implemented, while also maintaining legacy systems and workarounds. “It’s too much work for something that should be simple.” Char Simser notes that we need to convince our administrations to move more staff into managing eresources as our budgets are shifting more towards them.

Another audience member notes that his main frustration is the lack of cooperation between vendors/products. We need a shared knowledge base like we have a shared repository for our catalog records. This gets tricky with different package holdings and license terms.

Audience question: When will the ERM become integrated into the ILS? Response: System vendors are listening, and the development cycle is dependent on customer input. Every library approaches their record keeping in different ways.

NASIG 2009: Managing Electronic Resource Statistics

Presenter: Nancy Beals

We have the tools and the data, now we need to use them to the best advantage. Statistics, along with other data, can create a picture of how our online resources are being used.

Traditionally, we have gathered stats through re-shelving counts, ILL, gate counts, circulation, etc. Do these things really tell us anything? Stats from eresources can tell us much more, in conjunction with information about the paths we create to them.

Even with standards, we can run into issues with collecting data. Data can be “unclean” or incorrectly reported (or reported late). And not all publishers are using the standards (e.g., COUNTER).

After looking at existing performance indicators and applying them to electronic resources, we can look at trends in how our electronic resources are used. This can help us determine the return on investment in these resources.

Keep a master list of stats in order to plan out how and when to gather them. Keep the data in a shared location. Be prepared to supply data in a timely fashion for collection development decision-making.

When you are comparing resources, it’s up to individual institutions to determine what is considered low or high use. Look at how the resources stack up within the over-all collection.

When assessing the value of a resource, Beals and her colleagues are looking at 2-3 years of use data, 10% cost inflation, and the cost of ILL. In addition, they make use of overlap analysis tools to determine where they have multiple formats or sources that could be eliminated based on which platforms are being used.
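
(The numbers below are invented, not Beals’s, but they illustrate the kind of comparison she describes: projected cost per use versus what the same demand would cost through ILL.)

```python
# Invented numbers, purely to illustrate the comparison described above:
# projected cost per use vs. filling the same demand through ILL.
uses_per_year = [412, 389, 450]      # last three years of full-text requests
current_cost = 2500.00               # this year's subscription price
inflation = 0.10                     # assume roughly 10% annual cost inflation
ill_cost_per_request = 17.50         # assumed average cost to borrow one article

avg_use = sum(uses_per_year) / len(uses_per_year)
projected_cost = current_cost * (1 + inflation)
cost_per_use = projected_cost / avg_use
ill_alternative = avg_use * ill_cost_per_request

print(f"Projected cost per use: ${cost_per_use:.2f}")
print(f"Cost to fill the same demand via ILL: ${ill_alternative:.2f}")
# If ILL comes out cheaper and use is low, the title is a cancellation candidate.
```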

Providing readily accessible data in a user-friendly format empowers selectors to do analysis and make decisions.

NASIG 2009: Moving Mountains of Cost Data

Standards for ILS to ERMS to Vendors and Back

Presenter: Dani Roach

Acronyms you need to know for this presentation: National Information Standards Organization (NISO), Cost of Resource Exchange (CORE), and Draft Standard For Trial Use (DSFTU).

CORE was started by Ed Riding from SirsiDynix, Jeff Aipperspach from Serials Solutions, and Ted Koppel from Ex Libris (and now Auto-Graphics). They saw a need to be able to transfer acquisitions data between systems, so they began working on it. After talking with various related parties, they approached NISO in 2008. Once they realized the scope, it went from being just an ILS-to-ERMS transfer to also including data from vendors, agents, consortia, etc., but without duplicating existing standards.

Library input is critical in defining the use cases and the data exchange scenarios. There was also a need for a data dictionary and XML schema in order to make sure everyone involved understood each other. The end result is the NISO CORE DSFTU Z39.95-200x.

CORE could be awesome, but in the meantime, we need a solution. Roach has a few suggestions for what we can do.

Your ILS has a pile of data fields. Your ERMS has a pile of data fields. They don’t exactly overlap. Roach focused on only eight of the elements: title, match point (code), record order number, vendor, fund, what was paid for, amount paid, and something else she can’t remember right now.

She developed Access tables with output from her ILS and templates from her ERMS. She then ran a query to match them up and uploaded the acquisitions data to her ERMS.

For the database record match, she chose the Serials Solutions three-letter database code, which was then put into an unused variable MARC field. For the journals, she used the SSID from the MARC records Serials Solutions supplies to them.
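
(Roach did this in Access, but the underlying idea, joining the ILS payment export to the ERMS template on the agreed match point and writing out a load file, can be sketched like this; the file layouts and column names are invented.)

```python
# Hypothetical sketch of the match step: join ILS payment data to ERMS resource
# records on a shared match code (e.g., a three-letter database code stored in
# an otherwise unused MARC field). File layouts and column names are invented.
import csv

with open("ils_payments.csv", newline="", encoding="utf-8") as f:
    payments = {row["match_code"]: row for row in csv.DictReader(f)}

matched = []
with open("erms_template.csv", newline="", encoding="utf-8") as f:
    for resource in csv.DictReader(f):
        paid = payments.get(resource["match_code"])
        if paid:  # only keep resources with a payment this fiscal year
            matched.append({
                "resource_id": resource["resource_id"],
                "title": resource["title"],
                "order_number": paid["order_number"],
                "fund": paid["fund"],
                "amount_paid": paid["amount_paid"],
            })

fieldnames = ["resource_id", "title", "order_number", "fund", "amount_paid"]
with open("erms_cost_load.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(matched)
```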

Things that you need to decide in advance: How do you handle multiple payments in a single fiscal year (what are you doing currently, and do you need to continue doing it)? What about resources that share costs? How will you handle one-time vs. ongoing purchases? How will you maintain the integrity of the match point you’ve chosen?

The main thing to keep in mind is that you need to document your decisions and processes, particularly for when systems change and CORE or some other standard becomes a reality.
