Charleston 2012: Curating a New World of Publishing

“Looking through spy glass” by Arild Nybø

Hypothesis: Rapid publishing output and a wide disparity of publishing sources and formats have made finding the right content at the right time harder for librarians.

Speaker: Mark Coker, founder of Smashwords

The old model of publishing was based on scarcity, with publishers as mediators for everything. Publishers aren’t in the business of publishing books; they are in the business of selling books, so they focus on the books they think readers want to read. Ebook self-publishing overcomes many of the limitations of traditional publishing.

Users want flexibility. Authors want readers. Libraries want books accessible to anyone, and they deliver readership.

The tools for self publishing are now free and available to anyone around the world. The printing press is now in the cloud. Smashwords will release about 100,000 new books in 2012, and they are hitting best seller lists at major retailers and the New York Times.

How do you curate this flood? Get involved at the beginning. Libraries need to also promote a culture of authorship. Connect local writers with local readers. Give users the option to publish to the library. Emulate the best practices of the major retailers. Readers are the new curators, not publishers.

Smashwords Library Direct is a new service they are offering.

Speaker: Eric Hellman, from Unglue.it

[Missed the first part as I sought a more comfortable seat.]

They look for zero-margin distribution solutions by connecting publishers and libraries. They do it by running a crowd-funded pledge drive for every book offer, much like Kickstarter. They’ve been around since May 2012.

For example, Oral Literature in Africa was published by Oxford UP in 1970, and it’s now out of print with the rights reverted to the author. The rights holder set a target amount needed to make the ebook available free to anyone. The successful book is published with a Creative Commons license and made available to anyone via archive.org.
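The pledge-drive mechanics described above can be sketched in a few lines. This is only an illustration of the threshold idea, not Unglue.it's actual system: the `Campaign` class, its field names, and the target figure are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Campaign:
    """Hypothetical model of one pledge drive for one book."""
    title: str
    target: float  # amount set by the rights holder (illustrative value below)
    pledges: dict[str, float] = field(default_factory=dict)

    def pledge(self, supporter: str, amount: float) -> None:
        """Record a pledge; repeat pledges from one supporter accumulate."""
        if amount <= 0:
            raise ValueError("pledge must be positive")
        self.pledges[supporter] = self.pledges.get(supporter, 0.0) + amount

    @property
    def total(self) -> float:
        return sum(self.pledges.values())

    @property
    def funded(self) -> bool:
        """True once pledges meet the target, i.e. the book can be released."""
        return self.total >= self.target


# The target here is a made-up number, not the real campaign's figure.
campaign = Campaign("Oral Literature in Africa", target=1000.0)
campaign.pledge("library-a", 600.0)
campaign.pledge("reader-b", 400.0)
print(campaign.funded)
```

The point of the design is that nothing is released until the whole target is met, at which point the book becomes free for everyone rather than only for those who pledged.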

Unglue.it verifies that the rights holder really has the rights and that they can create an ebook. The rights holder retains copyright, and the ebook format is neutral. Books are distributed globally, and distribution rights are not restricted to anyone. No DRM is allowed, so the library ebook vendors are having trouble adopting these books.

It will take a lot of work to make this happen; if we just sit and watch, it won’t. Get involved.

Speaker: Rush Miller, library director at University of Pittsburgh

Why would a library want to become a publisher? It incentivizes the open access model. It provides services that scholars need and value. It builds collaborations with partners around the world. It improves efficiencies and encourages innovation in scholarly communications.

They began by collaborating with the university press, but the press focuses more on books and monographs than journals. The library manages several self-archiving repositories, and they got into journal publishing because the OJS platform looked like something they could handle.

They targeted journals with diminishing circulation that the university was already invested in (authors, researchers, etc.) and helped them get online to increase their circulation. They did not charge the editors/publishers of the journals to do it, and encouraged them to move to open access.

Pandora Town Hall (Richmond, VA)

Open question/answer forum with Tim Westergren, the founder of the Music Genome Project and Pandora Internet Radio.

June 29, 2009
approx 100 attending
free t-shirts! free burritos from Chipotle!

Tim Westergren, founder of Pandora

His original plan was to get in a car & drive across country to find local music to add to Pandora, but it wasn’t quite as romantic as he thought it would be. On the way home, he planned a meetup on the fly using the Pandora blog, and since then, whenever he visits a new city, he organizes a get-together like this one.

Tim is a Stanford graduate and a musician, although he didn’t study it specifically. He spent most of his 20s playing in bands, touring around the country, but not necessarily as a huge commercial success. It’s hard to get on the radio, and radio is the key to professional longevity. Eventually, he shifted to film score composition, which required him to analyze music and break it down into components that represent what is happening on the screen. This generated the idea of a musical genome.

The Music Genome Project was launched in 2000 with some seed money that lasted about a year. Eventually, they ran out of money and couldn’t pay their 45 employees. They tried several different ways to raise money, but nothing worked until some venture investors put money into it in 2004. At that point, they took the genome and repurposed it into a radio (Pandora) in 2005.

They have never advertised — it has all been word of mouth. They now add about 65,000 new listeners per day! They can see profitability on the horizon. Pandora is mainly advertising supported. The Amazon commissions provide a little income, but not as much as you might think they would.

There are about 75,000 artists on the site, and about 70% are not on a major label. The song selection is not based on popularity, like most radio, but rather on the elements of the songs and how they relate to what the user has selected.

Playlists are initially built from songs that are musically close to the chosen song or artist, and are then refined as the user thumbs songs up or down. Your thumbs up and down affect only the station you are listening to, and they affect whatever the rest of the playlist was going to be. They use the overall audience feedback to adjust across the site, but it’s not as immediate or personalized.
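To make the per-station behavior concrete, here is a toy sketch. It is not Pandora's algorithm: the two-number "trait" vectors, the `Station` class, and the use of Euclidean distance are all assumptions chosen for illustration.

```python
import math


def distance(a, b):
    """Euclidean distance between two trait vectors (an assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class Station:
    """Hypothetical station: a seed plus this station's own thumbs."""

    def __init__(self, seed_traits):
        self.seed = seed_traits
        self.liked = []          # trait vectors of thumbed-up songs
        self.disliked = set()    # titles of thumbed-down songs

    def thumb_up(self, traits):
        self.liked.append(traits)

    def thumb_down(self, title):
        self.disliked.add(title)

    def playlist(self, catalog):
        """Order remaining songs by closeness to the seed or any liked song."""
        anchors = [self.seed] + self.liked
        candidates = {t: v for t, v in catalog.items() if t not in self.disliked}
        return sorted(
            candidates,
            key=lambda t: min(distance(candidates[t], a) for a in anchors),
        )


# Made-up catalog: title -> trait vector.
catalog = {"A": (0.1, 0.1), "B": (0.9, 0.9), "C": (0.5, 0.5)}
mine, yours = Station((0.0, 0.0)), Station((0.0, 0.0))
mine.thumb_down("A")
mine.thumb_up(catalog["B"])
print(mine.playlist(catalog))   # reordered by this station's feedback
print(yours.playlist(catalog))  # untouched by the other station's thumbs
```

Each `Station` keeps its own feedback, which mirrors the point that a thumb affects only the station you are listening to.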

They have had some trouble with royalties. They pay both publishing and performer royalties per song. They operate under the DMCA, including the royalty structure. Every five years, a committee determines what the rate will be for the next five years. In July 2007, the committee decided to triple the rates and made it retroactive. It essentially bankrupted the company.

Pandora called upon the listeners to help them by contacting their congressional representative to voice opposition to the decision. Congress received 400,000 faxes in three days, breaking the structure on the Hill for a week! Their phones were ringing all day long! Eventually, they contacted Pandora to make it stop. They are now finishing up what needs to be done to bring the royalty back to something more reasonable. (Virtually all the staffers on Capitol Hill are Pandora users — made it easy to get appointments with congress members.)

Music comes to Pandora from a variety of sources. They get a pile of physical and virtual submissions from artists. They also pay attention to searches that don’t result in anything in their catalog, as well as explicit suggestions from listeners.

They have a plan to offer musicians incentives to participate. For example, if someone thumbs up something, there would be a pop-up that suggests checking out a similar (or the same) band that is playing locally. Most of the room would opt into emails that let them know when bands they like are coming to town. Musicians could see what songs are being thumbed up or down and where the listeners are located.

Listener suggestion: on the similar artists pages, provide more immediate sampling of recommendations.

What is the cataloging backlog? It takes about 8-10 weeks, and only about 30% of what is submitted makes it in. They select based on quality: for what a song is trying to do, does it do it well? They know when they’ve made a wrong decision if they don’t include something and a bunch of people search for it.

Pandora is not legal outside of the US, but many international users fake US zip codes. However, in order to avoid lawsuits, they started blocking by IP. As soon as they implemented IP blocking, they received a flood of messages, including one from a town that would have “Pandora night” at a local club. (The Department of Defense called up and asked them to block military IP ranges because Pandora was hogging the bandwidth!)

Why are some songs quieter than others? Tell them. They should be correcting for that.

The music genome is used by a lot of scorers and concert promoters to find artists and songs that are similar to the ones they want.

Could the users be allowed more granular ratings rather than thumbing up or down whole songs? About a third of the room would be interested in that.

Mobile device users are seeing fewer advertisements, and one listener is concerned that this will impact revenue. Between the iPhone, the BlackBerry, and the Palm Pre, they have about 45,000 listeners on mobile devices. This is important to them, because these devices will be how Pandora will get into listeners’ cars. And, in actuality, mobile listeners interact with advertisements four times as much as web listeners.

Tim thinks that eventually Pandora will host local radio. I’m not so sure how that would work.

Subscription Pandora is 192kbps, which sounds pretty good (and it comes with a desktop application). It’s not likely to get to audiophile level until the pipes are big enough to handle the bandwidth.

Variety and repetition are the areas where they get the most feedback from listeners. The best way to get variety is to add different artists. If you thumb down an artist three times, they should be removed from the station.
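The three-strikes rule is easy to picture as a counter per station. The sketch below is an assumption about how such a rule could work, not a description of Pandora's code; `StationFilter` and `BAN_THRESHOLD` are hypothetical names.

```python
from collections import Counter

BAN_THRESHOLD = 3  # thumbs down on one artist before removal, per the talk


class StationFilter:
    """Hypothetical per-station tally of thumbs down by artist."""

    def __init__(self):
        self.artist_downs = Counter()

    def thumb_down(self, artist):
        self.artist_downs[artist] += 1

    def is_removed(self, artist):
        return self.artist_downs[artist] >= BAN_THRESHOLD

    def filter(self, queue):
        """Drop queued (artist, song) pairs whose artist has been removed."""
        return [(a, s) for a, s in queue if not self.is_removed(a)]


station = StationFilter()
for _ in range(3):
    station.thumb_down("Artist X")
queue = [("Artist X", "Song 1"), ("Artist Y", "Song 2")]
print(station.filter(queue))
```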

They stream about 1/3 of the data that YouTube streams daily, with around 100 servers. Tim is not intimately familiar with the tech that goes into making Pandora work.

[The questions kept coming, but I couldn’t stay any longer, unfortunately. If you have a chance to attend a Pandora Town Hall, do it!]

LITA 2008: What is "Social Cataloging" and Why Should You Care?

“Having games in the library strikes me as being like having bocce in the frat house.”

Speaker: Tim Spalding, Founder of LibraryThing

“I have no practical advice for you, but I have inspiration and screen shots.” Such as, images from Dr. Horrible’s Sing-Along Blog and book pile photo submissions.

Social cataloging does not need to be defined. LibraryThing is a good example of social cataloging, but it’s not the only resource out there like that. (LibraryThing is now larger than the Library of Congress.) Goodreads focuses more on the social aspects, and Shelfari is being revived by Amazon. There are other sites like CiteULike and Last.fm that do social cataloging of things other than books.

Social cataloging explores the socialization. LibraryThing embraces the social and the digital because there is no physical aspect (except for what you have in your own collection).

Social cataloging ladder:

  • personal cataloging – your stuff
  • exhibitionism, voyeurism – about you and your stuff
  • self expression – book pile photos, reviews
  • implicit social cataloging – tag clouds on books that incorporate data from all owners, recommendations, connect with other owners of more obscure books
  • social networking – “friends” lists, users who share your books, groups
  • sharing – book covers of different editions, author photos
  • explicit social cataloging – work-level records (any title you would agree on at a cocktail party) for both books and authors, series data
  • collaborative cataloging – building the catalogs of famous dead people, developing an open-source alternative to Dewey
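The "implicit social cataloging" rung, where a book's tag cloud is built from every owner's tags, might look roughly like the following. This is a guess at the general idea, not LibraryThing's implementation; `tag_cloud` and the sample data are hypothetical.

```python
from collections import Counter


def tag_cloud(tags_by_owner, top=None):
    """Count how many owners applied each tag to one book.

    Tags are lowercased and trimmed, and each owner counts once per tag,
    so one enthusiastic tagger cannot inflate a tag's weight.
    """
    counts = Counter()
    for tags in tags_by_owner.values():
        counts.update({t.strip().lower() for t in tags})
    return counts.most_common(top)


# Made-up owners and tags for a single book.
owners = {
    "alice": ["Fantasy", "dragons"],
    "bob": ["fantasy"],
    "carol": ["fantasy", "Dragons", "YA"],
}
print(tag_cloud(owners))
```

The counts can then drive font size in the cloud. Note that simple normalization like this only merges spelling variants that differ by case or whitespace; distinct tags stay distinct.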

Regarding why Spalding felt it necessary to pull data from libraries and not just Amazon, he says, “Once you are over the age of 30 and you are not a Philistine, you have books that Amazon is not currently selling.”

Interesting factoid about how things are tagged on LibraryThing: LGBT and GLBT tags have two completely different lists of books.

Traditional cataloging is based on the physical form of cataloging with cards. It was too difficult to change subjects or to add weight to particular subjects because you couldn’t do that with physical cards. We need to get away from this now that we have all the flexibility of digital cataloging. Digital cataloging is social cataloging.

LibraryThing users are doing about 1,000 work combinations per day! Voluntarily! Experts on book topics are the ones pulling the data together, not experts on cataloging.

LibraryThing members figured out what books are on Dr. Horrible’s shelf based on a fuzzy still from the video. And then the guy who lives in the apartment where it was filmed corrected the editions listed.

There are many non-librarians who are passionate about books and classification. People care about libraries and library data.

On the other hand, we suck. Our catalogs are fundamentally not open to the web because our pages are often session-specific and not friendly to index spiders. Worldcat.org is getting fewer visitors, whereas Dogster.com is getting more.

Library 2.0 is in danger. Libraries are concentrating on what they can do, not what they can do best. We don’t need to have blogs or pages on Facebook. “Having games in the library strikes me as being like having bocce in the frat house.”

Do not pay anyone for Library 2.0 stuff. Do it yourself. OCLC is not yourself.

Or, pay Spalding for his 2.0 enhancements (LibraryThing for Libraries).

Social cataloging is about the catalog, about what you can do right now, about passion, and about giving (not taking).