Charleston 2012: Curating a New World of Publishing

“Looking through spy glass” by Arild Nybø

Hypothesis: Rapid publishing output and a wide diversity of publishing sources and formats have made finding the right content at the right time harder for librarians.

Speaker: Mark Coker, founder of Smashwords

The old model of publishing was based on scarcity, with publishers acting as mediators for everything. Publishers aren’t in the business of publishing books; they are in the business of selling books, so they focus on the books they think readers want to read. Ebook self-publishing overcomes many of the limitations of traditional publishing.

Users want flexibility. Authors want readers. Libraries want books accessible to anyone, and they deliver readership.

The tools for self-publishing are now free and available to anyone around the world. The printing press is now in the cloud. Smashwords will release about 100,000 new books in 2012, and they are hitting bestseller lists at major retailers and at the New York Times.

How do you curate this flood? Get involved at the beginning. Libraries need to also promote a culture of authorship. Connect local writers with local readers. Give users the option to publish to the library. Emulate the best practices of the major retailers. Readers are the new curators, not publishers.

Smashwords Library Direct is a new service they are offering.

Speaker: Eric Hellman, from Unglue.it

[Missed the first part as I sought a more comfortable seat.]

They look for zero-margin distribution solutions by connecting publishers and libraries. They do it by running a crowd-funded pledge drive for each book offered, much like Kickstarter. They’ve been around since May 2012.

For example, Oral Literature in Africa was published by Oxford UP in 1970, and it’s now out of print with the rights reverted to the author. The rights holder set a target amount needed to make the ebook available free to anyone. Once funded, the book is published with a Creative Commons license and made available to anyone via archive.org.

Unglue.it verifies that the rights holder really has the rights and that they can create an ebook. The rights holder retains copyright, and the ebook is format-neutral. Books are distributed globally, with no restrictions on who can distribute them. No DRM is allowed, so the library ebook vendors are having trouble adopting these books.

This is going to take a lot of work to make happen; if we just sit and watch, it won’t. Get involved.

Speaker: Rush Miller, library director at University of Pittsburgh

Why would a library want to become a publisher? It incentivizes the open access model. It provides services that scholars need and value. It builds collaborations with partners around the world. It improves efficiencies and encourages innovation in scholarly communications.

The library began by collaborating with the university press, but the press focuses more on books and monographs than journals. The library manages several self-archiving repositories, and they got into journal publishing because the OJS platform looked like something they could handle.

They targeted journals with diminishing circulation that the university was already invested in (authors, researchers, etc.) and helped them get online to increase their circulation. They did not charge the journals’ editors/publishers for the service, and encouraged them to move to open access.

NASIG 2010: Publishing 2.0: How the Internet Changes Publications in Society

Presenter: Kent Anderson, JBJS, Inc

Medicine 0.1: in dealing with the influenza outbreak of 1837, a physician administered leeches to the chest, James’s powder, and mucilaginous drinks, and it worked (much like “take two aspirin and call me in the morning”). All of this was written up in a medical journal as a way to share information with peers. Journals have been the primary vehicle for communicating scholarship, but what a journal is has become more abstract with the addition of non-text content and metadata. Add in indexes and other portals to access the information, and readers have changed the way they access and share information in journals. “Non-linear” access to information is increasing exponentially.

Even as technology made publishing easier and more widespread, it was still producers delivering content to consumers. But, with the advent of Web 2.0 tools, consumers now have tools that in many cases are more nimble and accessible than the communication tools that producers are using.

Web 1.0 was a destination. Documents simply moved to a new home, and “going online” was a process separate from anything else you did. However, as broadband access increases, the web becomes more pervasive and less a destination. The web becomes a platform that brings people, not documents, online to share information, consume information, and use it like any other tool.

Heterarchy: a system of organization replete with overlap, multiplicity, mixed ascendancy, and/or divergent but coexistent patterns of relation

Apomediation: mediation by agents not interposed between users and resources, who stand by to guide a consumer to high-quality information without playing a role in the acquisition of the resources (e.g., Amazon product reviewers)

NEJM uses terms entered by users to add related searches to article search results. They also bump popular articles up in the results as more people click on them. These tools improved their search results and reputation, all by using the people power of experts. In addition, they created a series of “results in” publications that highlight the popular articles.
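
A minimal sketch of the click-popularity idea described above; the blending weight, function names, and data are invented for illustration and this is not NEJM’s actual algorithm:

```python
# Toy illustration of click-weighted reranking: each result's text-relevance
# score is blended with a popularity signal derived from reader clicks.

def rerank(results, click_counts, weight=0.3):
    """results: list of (article_id, relevance); click_counts: article_id -> clicks."""
    max_clicks = max(click_counts.values(), default=0) or 1
    def blended(item):
        article_id, relevance = item
        popularity = click_counts.get(article_id, 0) / max_clicks  # normalize to 0..1
        return (1 - weight) * relevance + weight * popularity
    return sorted(results, key=blended, reverse=True)

# Example: the heavily clicked article rises above a slightly more "relevant" one.
print(rerank([("a1", 0.82), ("a2", 0.80)], {"a1": 40, "a2": 950}))
```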

It took a little over a year to reach a million Twitter authors, and about 600 years to reach the same number of book authors. And these are literate, savvy users. Twitter and Facebook account for 1.45 million views of the New York Times (and this is a number from several years ago); imagine what they can do for your scholarly publication. Oh, and the NYT has a social media editor now.

Blogs are growing four times as fast as traditional media. The top ten media sites include blogs, and traditional media sources now use blogs as well. Blogs can be diverse or narrow, their coverage varies (and does not have to be immediate), they are verifiably accurate, and they are interactive. Blogs level the media playing field, in part by watching the watchdogs. Blogs tend to investigate more than the mainstream media does.

It took AOL five times as long to reach twenty million users as it took the iPhone. Consumers are increasingly adding “toys” to their collection of ways to get to digital/online content. When the NEJM went on the Kindle, more than just physicians subscribed. Getting content into easy-to-access places and onto the “toys” that consumers use will increase your reach.

Print digests are struggling because they teeter on the brink of the daily divide. Why wait for the news to get stale, collected, and delivered a week/month/quarter/year later? People’s habits are changing. Our audiences don’t think of information as analogue, delayed, isolated, tethered, etc. It has to evolve into something digital, immediate, integrated, and mobile.

From the Q&A session:

The article container will be here for a long time. Academics use the HTML version of the article, but the PDF (static) version is their security blanket and archival copy.

Where does the library fit as a source of funds when the focus is more on the end users? Publishers are looking for other sources of income as library budgets decrease (e.g., Kindle editions, product differentiation), and they are looking to other purchasing centers at institutions.

How do publishers establish the cost of these 2.0 products? It’s essentially what the market will bear, with some adjustments. Sustainability is a grim perspective. Flourishing is much more positive, and not necessarily any less realistic. Equity is not a concept that comes into pricing.

The people who bring the tremendous flow of information under control (i.e. offer filters) will be successful. One of our tasks is to make filters to help our users manage the flow of information.

IL2009: Mashups for Library Data

Speaker: Nicole Engard

Mashups are easy ways to provide better services for our patrons. They add value to our websites and catalogs. They promote our services in the places our patrons frequent. And, it’s a learning experience.

We need to ask our vendors for APIs. We’re putting data into our systems, so we should be able to get it out. Take that data and mash it up with popular web services using RSS feeds.

Yahoo Pipes allows you to pull in many sources of data and mix them up to create something new, with a clean, flowchart-like interface. Don’t give up after your first try. Jody Fagan wrote an article in Computers in Libraries that inspired Engard to go back and try again.

Reading Radar takes the NYT bestseller lists and merges them with data from Amazon to display more than just sales information (ratings, summaries, etc.). You could do the same, but instead of sending users off to buy the book, link to your library catalog. The New York Times has opened up a tremendous amount of content via APIs.
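
For instance, the NYT Books API (which requires a free API key) can feed a list of current bestsellers that links each title to a catalog search instead of a bookstore; the catalog URL below is a hypothetical placeholder for your own OPAC:

```python
# Sketch: pull the current NYT hardcover-fiction bestsellers and link each title
# to a library catalog search. Requires a free NYT Books API key; the catalog
# search URL is a placeholder.
import urllib.parse
import requests  # pip install requests

NYT_KEY = "YOUR_NYT_BOOKS_API_KEY"
CATALOG_SEARCH = "https://catalog.example.edu/search?q="  # hypothetical OPAC search URL

url = "https://api.nytimes.com/svc/books/v3/lists/current/hardcover-fiction.json"
resp = requests.get(url, params={"api-key": NYT_KEY})
resp.raise_for_status()

for book in resp.json()["results"]["books"]:
    title, author = book["title"], book["author"]
    link = CATALOG_SEARCH + urllib.parse.quote_plus(f"{title} {author}")
    print(f"{title} by {author}\n  {link}")
```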

Bike Tours in CA is a mashup of Google Maps and ride data. Trulia, Zillow, and HousingMaps use a variety of sources to map real estate information. This We Know pulls in all sorts of government data about a location. Find more mashups at ProgrammableWeb.

What mashups should libraries be doing? First off, if you have multiple branches, create a Google Maps mashup of library locations. Share images of your collection on Flickr and pull them into your website (see Access Ceramics), letting Flickr do the heavy lifting of resizing the images and pulling content out via machine tags. Delicious provides many options for creating dynamically updating lists, with code snippets to embed them in your website.
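
As a rough sketch of the Flickr piece, the flickr.photos.search API method can pull back photos by machine tag so your site never has to host or resize the images; the API key and machine tag below are placeholders:

```python
# Sketch: let Flickr host and resize the images, then pull photos tagged for your
# collection back into your site via the Flickr API. The machine tag
# ("sitelibrary:collection=ceramics") and API key are placeholders.
import requests  # pip install requests

FLICKR_KEY = "YOUR_FLICKR_API_KEY"

resp = requests.get(
    "https://api.flickr.com/services/rest/",
    params={
        "method": "flickr.photos.search",
        "api_key": FLICKR_KEY,
        "machine_tags": "sitelibrary:collection=ceramics",  # hypothetical machine tag
        "extras": "url_m",          # ask Flickr to include a medium-sized image URL
        "format": "json",
        "nojsoncallback": 1,
    },
)
resp.raise_for_status()

for photo in resp.json()["photos"]["photo"]:
    print(photo["title"], photo.get("url_m"))
```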

OPAC mashups require APIs, preferably ones that can generate JavaScript; if you can’t get the information out in a form you can easily use, you’ll need a programmer. LexisNexis Academic, WorldCat, and LibraryThing all have APIs you can use.
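
As one small example (assuming the service still behaves this way), LibraryThing’s thingISBN API takes an ISBN and returns other ISBNs for the same work, which a catalog could use to offer “other editions” links:

```python
# Sketch: LibraryThing's thingISBN returns other ISBNs that represent the same
# work. Assumes the simple <idlist><isbn>...</isbn></idlist> XML response and
# no API key; check current terms of use before relying on it.
import xml.etree.ElementTree as ET
import requests  # pip install requests

def other_editions(isbn):
    resp = requests.get(f"https://www.librarything.com/api/thingISBN/{isbn}")
    resp.raise_for_status()
    root = ET.fromstring(resp.text)
    return [el.text for el in root.findall("isbn")]

print(other_editions("0441172717"))  # example ISBN; prints related editions
```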

Ideas from librarians: mash up circulation data with various travel sources to provide better patron services. Grab MARC location data to plot information on a map. Pull data about the media collection and combine it with IMDb and other resources. Build subject RSS feeds from all resources for current articles (you could do that already with a collection of journals’ RSS feeds and Yahoo Pipes, or with a short script like the sketch below).
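
A sketch of that last idea without Yahoo Pipes: merge several journals’ RSS feeds, keep only items matching a subject keyword, and sort them by date. The feed URLs and keyword are placeholders; feedparser is a common Python library for this.

```python
# Sketch: a homemade "subject RSS feed" -- combine multiple journal feeds,
# filter by keyword, and list the newest matches first.
import feedparser  # pip install feedparser

FEEDS = [
    "https://journal-one.example.org/rss",
    "https://journal-two.example.org/rss",
]
KEYWORD = "information literacy"

items = []
for url in FEEDS:
    for entry in feedparser.parse(url).entries:
        text = (entry.get("title", "") + " " + entry.get("summary", "")).lower()
        if KEYWORD in text:
            items.append(entry)

# Newest first; published_parsed may be missing, so fall back to a minimal tuple.
items.sort(key=lambda e: e.get("published_parsed") or (0,), reverse=True)
for entry in items[:20]:
    print(entry.get("published", "n.d."), "-", entry.get("title"))
```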

Links and more at her book website.

press readers

What kind of reader are you?

This came in my email today:

  1. The Wall Street Journal is read by the people who run the country.
  2. The Washington Post is read by people who think they run the country.
  3. The New York Times is read by people who think they should run the country and who are very good at crossword puzzles.
  4. USA Today is read by people who think they ought to run the country but don’t really understand The New York Times. They do, however, like their statistics shown in pie charts.
  5. The Los Angeles Times is read by people who wouldn’t mind running the country — if they weren’t on a freeway, or playing beach ball, or at a Botox appointment or an audition — and if they didn’t have to leave Southern California to do it.
  6. The Boston Globe is read by people whose parents used to run the country and did a far superior job of it, thank you very much.
  7. The New York Daily News is read by people who aren’t too sure who’s running the country and don’t really care as long as they can get a seat on the train.
  8. The New York Post is read by people who don’t care who’s running the country as long as they do something really scandalous, preferably while intoxicated.
  9. The Miami Herald is read by people who are running another country but need the baseball scores.
  10. The San Francisco Chronicle is read by people who aren’t sure there is a country or that anyone is running it; but if so, they oppose all that they stand for. There are occasional exceptions if the leaders are handicapped minority feminist atheist dwarfs who also happen to be illegal aliens from any other country or galaxy provided, of course, that they are not Republicans.
  11. The National Enquirer is read by people trapped in line at the grocery store.

overloading the ‘net

Will RSS feeds overload the ‘net?

Wired News has a short article about RSS feed readers and the potential they have for increasing web traffic. I knew about this article because it was listed in the RSS feed that I get from Wired. Go figure. Anyway, the author and others are concerned that because aggregators are becoming more and more popular among those who like to read regularly published electronic content, eventually a large chunk of web traffic will consist of desktop aggregators regularly downloading that data throughout the day.

The trouble is, aggregators are greedy. They constantly check websites that use RSS, always searching for new content. Whereas a human reader may scan headlines on The New York Times website once a day, aggregators check the site hourly or even more frequently.

If all RSS fans used a central server to gather their feeds (such as Bloglines or Shrook), then there wouldn’t be as much traffic, because these services check feeds once per hour at most, regardless of the number of subscribers. So, if you have 100 people subscribed to your feed, rather than getting 100 hits every hour (or some other frequency), you would only get one. The article notes two difficulties with this scenario. First, a lot of RSS fans prefer their desktop aggregators to a web-based aggregator such as Bloglines. Second, the Shrook aggregator is not free, and its competitors will probably adopt the same model.

I don’t completely agree with the premise that having a central server distribute content to feed subscribers will reduce traffic on the ’net any more than it is reduced now. Whether my aggregator checks my feeds once an hour or Bloglines does it for me, I still use up bandwidth when I log in and read the content on the Bloglines site. For some feeds, if I want to read the whole entry or article, I still have to click through to the site. Frankly, I think the problem has more to do with aggregators that “are not complying with specifications that reduce how often large files are requested.”

Readers are supposed to check if the RSS file has been updated since the last visit. If there has been no update, the website returns a very small “no” message to the reader.

But Murphy says the programs often don’t remember when they last checked, or use the local computer’s clock instead of the website’s clock, causing the reader to download entries over and over.
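
For the curious, here is roughly what that well-behaved check looks like in practice, using HTTP conditional requests (Last-Modified / If-Modified-Since); the feed URL is a placeholder:

```python
# Sketch of the "very small 'no' message": a polite reader sends the server's
# Last-Modified value back as If-Modified-Since, and the server answers
# 304 Not Modified (no body) when nothing has changed.
import requests  # pip install requests

FEED_URL = "https://example.org/feed.xml"  # placeholder feed

# First fetch: full download, remember the Last-Modified header.
first = requests.get(FEED_URL)
last_modified = first.headers.get("Last-Modified")

# Later polls: ask only for content newer than what we already have.
headers = {"If-Modified-Since": last_modified} if last_modified else {}
later = requests.get(FEED_URL, headers=headers)

if later.status_code == 304:
    print("Feed unchanged; nothing downloaded.")
else:
    print(f"Feed updated; downloaded {len(later.content)} bytes.")
```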

Perhaps the best thing for us to do is educate ourselves about the RSS aggregator we use and how it may affect the bandwidth of the sites whose feeds we download through it.