CIL 2010: Library Engagement Through Open Data

Speakers: Oleg Kreymer & Dan Lipcan

Library data is meaningless in and of itself – you need to interpret it to give it meaning. Piotr Adamczyk did much of the work for the presentation, but was not able to attend today due to a schedule conflict.

They created the visual dashboard for many reasons, including a desire to expose the large quantities of data they have collected and stored in a way that is interesting and explanatory. It’s also a handy PR tool for promoting the library to benefactors, and to administrators who are often unaware of where and how the library is being effective or what the trends are. Finally, the data can be targeted to the general public in ways that catch their attention.

The dashboard should also address assessment goals within the library. Data visualization allows us to identify and act upon anomalies. Some visualizations are complex, so be sensitive to how you present them.

The ILS is a great source of circulation/collections data. Other statistics can come from the data collected by various library departments, often in spreadsheet format. Google Analytics can capture search terms in catalog searches as well as site traffic data. Download/search statistics from eresources vendors can be massaged and turned into data visualizations.

The free tools they used included the IMA Dashboard (local software, a Drupal profile) along with IBM Many Eyes and Google Charts (cloud software). The IMA Dashboard takes snapshots of data and publishes them. It’s more of a PR tool.

Many Eyes is a hosted collection of data sets with visualization options. One thing I liked was that they used Google Analytics to gather the search terms used on the website and presented them as a word cloud. You could probably do the same with the titles of the pages in a page hit report.
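If you want to try that yourself, here is a minimal sketch (Python, standard library only) that aggregates catalog search terms exported from Google Analytics into a frequency list you could paste into Many Eyes or another word-cloud tool. The CSV column names are assumptions; adjust them to match your actual export.

```python
# Aggregate search terms from a Google Analytics CSV export into a
# "term<TAB>count" list suitable for pasting into a word-cloud tool.
import csv
from collections import Counter

counts = Counter()
with open("search_terms.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        term = row["Search Term"].strip().lower()          # assumed column name
        counts[term] += int(row["Total Unique Searches"])  # assumed column name

for term, n in counts.most_common():
    print(f"{term}\t{n}")
```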

Google Chart Tools is a set of visualizations created by Google and others, and it uses Google Spreadsheets to store and retrieve the data. The motion charts are great for showing how data changes over time.

Lessons learned… Get administrative support. Identify your target audience(s). Identify the stories you want to tell. Be prepared to spend a lot of time manipulating the data (make sure it’s worth the time). Use a shared repository for the data documents. Pull from data your colleagues are already harvesting. Try, try, and try again.

CIL 2010: Google Wave

Presenters: Rebecca Jones & Bob Keith

Jones was excited to have something that combined chat with cloud applications like Google Docs. Wave is a beginning, but still needs work. Google is not risk-averse, so they put it out and let us bang on it to shape it into something useful.

More people joined Google Wave and then abandoned it than stuck with it (fewer than 10% of the room). We needed something that would push us over to incorporating it into our workflows, and we didn’t see that happen.

The presenters created a public wave, which you can find by searching “with:public tag:cil2010”. Ironically, they had to close Wave in order to have enough virtual memory to play the video about Wave.

Imagine that! Google Wave works better in Google Chrome than in other browsers (including Firefox with the Gears extension).

Gadgets add functionality to waves. [note: I’ve also seen waves that get bogged down with too many gadgets, so use them sparingly.] There are also robots that can do tasks, but it seems to be more like text-based games, which have some retro-chic, but no real workflow application.

Wave is good for managing a group to-do list or worklog, planning events, taking and sharing meeting notes, and managing projects. However, all participants need to be Wave users. And, it’s next to impossible to print or otherwise archive a Wave.

The thing to keep in mind with Wave is that it’s not a finished product and probably shouldn’t be out for public consumption yet.

The presentation (available at the CIL website and on the wave) also includes links to a pile of resources for Wave.

ER&L 2010: Where are we headed? Tools & Technologies for the future

Speakers: Ross Singer & Andrew Nagy

Software as a service saves the institution time and money because the infrastructure is hosted and maintained by someone else. Computing has gone from centralized, mainframe processing to an even mix of personal computers on a networked enterprise, and back again to a very centralized environment with cloud applications and thin clients.

Library resource discovery is, to a certain extent, already in the cloud. We use online databases and open web search, WorldCat, and next gen catalog interfaces. The next gen catalog places the focus on the institution’s resources, but it’s not the complete solution. (People see a search box and they want to run queries on it – doesn’t matter where it is or what it is.) The next gen catalog only provides access to local resources, and while it looks like a modern interface, the back end is still old-school library indexing that doesn’t work well with keyword searching.

Web-scale discovery is a one-stop shop that provides increased access, enhances research, and increases ROI for the library. Our users don’t use Google because it’s Google; they use it because it’s simple, easy, and fast.

How do we make our data relevant when administration doesn’t think what we do is as important anymore? Linked data might be one solution. Unfortunately, we don’t do that very well. We are really good at identifying things but bad at linking them.

If every component of a record is given identifiers, it’s possible to generate all sorts of combinations and displays and search results via linking the identifiers together. RDF provides a framework for this.

Also, once we start using common identifiers, then we can pull in data from other sources to increase the richness of our metadata. Mashups FTW!
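As a rough illustration (not anything the speakers showed), here is a small Python sketch using the rdflib library: give the work, the author, and the subject their own URIs, then state the relationships between them as RDF triples. Every identifier below is a placeholder.

```python
# Link the components of a record together as RDF triples with rdflib.
from rdflib import Graph, URIRef, Literal
from rdflib.namespace import DC, FOAF

g = Graph()
book = URIRef("http://example.org/record/12345")            # placeholder record URI
author = URIRef("http://example.org/authority/person/678")  # placeholder authority URI
subject = URIRef("http://example.org/authority/subject/libraries")  # placeholder

g.add((book, DC.title, Literal("Example Title")))
g.add((book, DC.creator, author))
g.add((book, DC.subject, subject))
g.add((author, FOAF.name, Literal("Example, Author")))

# rdflib 6+ returns a string here
print(g.serialize(format="turtle"))
```

Once the author and subject have shared identifiers, the same URIs can be used to pull in richer data from outside sources.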

ER&L 2010: We’ve Got Issues! Discovering the right tool for the job

Speaker: Erin Thomas

The speaker is from a digital repository, so the workflow and needs may be different from your situation. Their collections are very old and spread out among several libraries, but are still highly relevant to current research. They have around 15 people involved in the process of maintaining the digital collection, and email got to be too inefficient to handle all of the problems.

The member libraries created the repository because they have content that needed to be shared. They started with the physical collections and broke up the work of scanning among the holding libraries, attempting to eliminate duplication. Even so, they had some duplication, so they run de-duplication algorithms that check the citations. The Internet Archive is actually responsible for doing the scanning, once the library has determined whether the quality of the original document is appropriate.

The low-cost model they are using does not produce preservation-level scans; they’re focusing on access. The user interface for a digital collection can be more difficult to browse than the physical collection, so libraries have to do more and different kinds of training and support.

This is great, but it caused more workflow problems than they expected. So, they looked at issue-tracking systems. Their development staff already had access to Gemini, so they went with that.

The issues they receive can be assigned types and specific components for each problem. Some types already existed, and they were able to add more. The components were entirely customized. Tasks are tracked from beginning to end, and they can add notes, have multiple user responses, and look back at the history of related issues.

But, they needed a more flexible system that would let them drill down to sub-issues, choose between email and no email, and offer a better user interface. There were many other options out there, so they did a needs assessment and an environmental scan. They developed a survey to ask the users (library staff) what they wanted, and hosted demos of the options. And, in the end, Gemini was the best system available for what they needed.

ER&L 2010: Adventures at the Article Level

Speaker: Jamene Brooks-Kieffer

Article level, for those familiar with link resolvers, means the best link type to give to users. The article is the object of pursuit, and the library and the user collaborate on identifying it, locating it, and acquiring it.

In 1980, the only good article-level identification was the Medline ID. Users would need to go through a qualified Medline search to track down relevant articles, and the library would need the article-level identifier to make a fast request from another library. Today, the user can search Medline on their own; use OpenURL linking to get to the full text, print, or ILL request; and obtain the article from the source or via ILL. Unlike in 1980, the user no longer needs to find the journal first to get to the article. And the librarian’s role is now more about maintaining relevant metadata so that users have the tools to locate articles themselves.

In thirty years, the library has moved from being a partner with the user in pursuit of the article to being the magician behind the curtain. Our magic is made possible by the technology we know but that our users do not know.

Unique identifiers solve the problem of making sure that you are retrieving the correct article. CrossRef can link to specific instances of items, but not necessarily the one the user has access to. The link resolver will use that DOI to find other instances of the article available to users of the library. Easy user authentication at the point of need is the final key to implementing article-level services.
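As a hedged sketch of what the link resolver is actually being handed, here is a bit of Python that builds an OpenURL 1.0 (Z39.88) query from a DOI and basic citation metadata. The resolver base URL and all of the citation values are placeholders.

```python
# Build an OpenURL 1.0 (KEV) link that passes a DOI and citation
# metadata to an institutional link resolver.
from urllib.parse import urlencode

RESOLVER_BASE = "https://resolver.example.edu/openurl"  # placeholder

def openurl_for_article(doi, atitle, jtitle, volume, issue, spage):
    params = {
        "ctx_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
        "rft_id": f"info:doi/{doi}",
        "rft.genre": "article",
        "rft.atitle": atitle,
        "rft.jtitle": jtitle,
        "rft.volume": volume,
        "rft.issue": issue,
        "rft.spage": spage,
    }
    return f"{RESOLVER_BASE}?{urlencode(params)}"

print(openurl_for_article("10.1000/example", "An Example Article",
                          "Journal of Examples", "12", "3", "45"))
```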

One of the library’s biggest roles is facilitating access. It’s not as simple as setting up a link resolver – it must be maintained or the system will break down. Also, document delivery service provides an opportunity to generate goodwill between libraries and users. The next step is supporting the user’s preferred interface, through tools like LibX, Papers, Google Scholar link resolver integration, and mobile devices. The last is the most difficult because much of the content comes from outside service providers, and institutional support for developing applications or web interfaces is often lacking.

We also need to consider how we deliver the articles users need. We need to evolve our acquisitions process. We need to be ready for article-level usage data, so we need to stop thinking about it as a single-institutional data problem. Aggregated data will help spot trends. Perhaps we could look at the ebook pay-as-you-use model for article-level acquisitions as well?

PIRUS & PIRUS 2 are projects to develop COUNTER-compliant article usage data for all article-hosting entities (both traditional publishers and institutional repositories). Projects like MESUR will inform these kinds of ventures.

Libraries need to be working on recommendation services. Amazon and Netflix are not flukes. Demand, adopt, and promote recommendation tools like bX or LibraryThing for Libraries.

Users are going beyond locating and acquiring the article to storing, discussing, and synthesizing the information. The library could facilitate that. We need something that lets the user connect with others, store articles, and review recommendations that the system provides. We have the technology (magic) to make it available right now: data storage, cloud applications, targeted recommendations, social networks, and pay-per-download.

How do we get there? Cover the basics of identify>locate>acquire. Demand tools that offer services beyond that, or sponsor the creation of desired tools and services. We also need to stay informed of relevant standards and recommendations.

Publishers will need to be a part of this conversation as well, of course. They need to develop models that allow us to retain access to purchased articles. If we are buying on the article level, what incentive is there to have a journal in the first place?

For tenure and promotion purposes, we need to start looking more at the impact factor of the article, not so much the journal-level impact. PLOS provides individual article metrics.

IL2009: Technology: The Engine Driving Pop Culture-Savvy Libraries or Source of Overload?

Speaker: Elizabeth Burns

Technology and pop culture drive each other. Librarians sometimes assume that people using technology like smart phones in libraries are wasting time, both theirs and ours, but we really don’t know how they are using tech. Librarians need to learn how to use the tech that their user community employs, so don’t hinder your staff by limiting what tech they can use while in the workplace.

Libraries also have the responsibility to inform users of the services and technology available to them. Get the tools, learn how to use them, and then get to work building things with them.

Your library’s tech trendspotting group needs more than just the techie people. Get the folks who aren’t as excited about the shiny to participate and ask questions. Don’t let the fear of Betamax stop you – explore new devices and delivery methods now rather than waiting to find out if they have longevity. You never know what’s going to stick.

Speaker: Sarah Houghton-Jan

"Information overload is the Devil"

Some people think that it didn’t exist before mobile phones and home computers, but the potential has always existed. Think about the piles of books you’ve acquired but haven’t read yet. Information overload is all of the piles of things you want to learn but haven’t yet.

"We have become far more proficient in generating information than we are in managing it…"

Librarians are better equipped to handle information overload than most. Manage your personal information consumption with the same kinds of tools and skills you use in your professional life.

Some of the barriers to dealing with information overload are: lack of time (or a perceived lack of time), lack of interest or motivation, not being encouraged/threatened by management, not knowing where to start, and frustration with past attempts. Become like the automatic towel dispensers that have the towels already dispensed and ready to be torn off as needed.

Inventory your inputs and devices. Think before you send/subscribe. Schedule yourself, including unscheduled work and tasks. Use downtime (bring tech that helps you do it). Stay neat. Keep a master waiting list of things that other people "owe" you, and then periodically follow-up on them. Weed, weed, and weed again. Teach others communication etiquette (and stick to it). Schedule unplugged times, and unplug at will.

RSS/Twitter overload: Limit your feeds and following, and regularly evaluate them. Use lists to organize feeds and Twitter friends. Use RSS when applicable, and use it to send you reminders.

Interruptive technology (phone, IM, texts, Twitter, etc): Use them only when they are appropriate for you. Check it when you want to, and don’t interrupt yourself. Use your status message. Lobby for IM or Twitter at your workplace (as an alternative to phone or email, for the status message function & immediacy). Keep your phone number private. Let it ring if you are busy. Remember that work is at work and home is at home, and don’t mix the two.

Email: Stop "doing email" — start scheduling email scanning time, use it when appropriate, and deal with it by subject. Keep your inbox nearly empty and filter your messages. Limit listservs. Follow good email etiquette. Delete and archive, and keep work and personal email separate.

Physical items: Just because you can touch it, doesn’t mean you should keep it. Cancel, cancel, cancel (catalogchoice.org). Weed what you have.

Multimedia: Choose entertainment thoughtfully. Limit television viewing and schedule your entertainment time. Use your commute to your benefit.

Social networking: Schedule time on your networks. Pick a primary network and point other sites towards it. Limit your in-network IM.

Time & stress management: Use your calendar. Take breaks. Eliminate stressful interruptions. Look for software help. Balance your life and work to your own liking, not your boss’s or your spouse’s.

[Read Lifehacker!]

IL2009: Mashups for Library Data

Speaker: Nicole Engard

Mashups are easy ways to provide better services for our patrons. They add value to our websites and catalogs. They promote our services in the places our patrons frequent. And, it’s a learning experience.

We need to ask our vendors for APIs. We’re putting data into our systems, so we should be able to get it out. Take that data and mash it up with popular web services using RSS feeds.

Yahoo Pipes allows you to pull in many sources of data and mix them up to create something new with a clean, flow-chart-like interface. Don’t give up after your first try. Jody Fagan wrote an article in Computers in Libraries that inspired Engard to go back and try again.
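For a sense of what a Pipes-style mashup does under the hood, here is a rough Python sketch using the feedparser library: pull several RSS feeds, keep only the items matching a keyword, and merge them by date. The feed URLs and the keyword are placeholders.

```python
# Merge and filter multiple RSS feeds, newest items first.
import feedparser
from time import mktime

FEEDS = [
    "https://example.org/library-news.rss",  # placeholder
    "https://example.org/new-titles.rss",    # placeholder
]
KEYWORD = "open access"

items = []
for url in FEEDS:
    for entry in feedparser.parse(url).entries:
        text = f"{entry.get('title', '')} {entry.get('summary', '')}"
        if KEYWORD.lower() in text.lower():
            items.append(entry)

# Entries without a parsed date sort last
items.sort(key=lambda e: mktime(e.published_parsed) if e.get("published_parsed") else 0,
           reverse=True)

for e in items:
    print(e.get("published", "n.d."), "-", e.get("title", "untitled"))
```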

Reading Radar takes the NYT Bestseller lists and merges them with data from Amazon to display more than just sales information (ratings, summaries, etc.). You could do the same thing, but instead of having users go buy the book, link it to your library catalog. The New York Times has opened up a tremendous amount of content via APIs.
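A hedged sketch of that idea in Python: pull a best-seller list from the NYT Books API and point each title at your own catalog search instead of a bookstore. The API key, the exact response field names, and the catalog URL pattern are assumptions to check against your own systems.

```python
# Turn a NYT best-seller list into links against a library catalog.
import requests
from urllib.parse import quote_plus

API_KEY = "YOUR_NYT_API_KEY"                        # placeholder
CATALOG = "https://catalog.example.edu/search?q="   # placeholder catalog search URL

url = ("https://api.nytimes.com/svc/books/v3/lists/current/"
       f"hardcover-fiction.json?api-key={API_KEY}")
data = requests.get(url, timeout=30).json()

for book in data.get("results", {}).get("books", []):   # assumed response shape
    title = book.get("title", "").title()
    isbn = book.get("primary_isbn13", "")
    print(title, "->", CATALOG + quote_plus(isbn or title))
```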

Bike Tours in CA is a mashup of Google Maps and ride data. Trulia, Zillow, and HousingMaps use a variety of sources to map real estate information. This We Know pulls in all sorts of government data about a location. Find more mashups at ProgrammableWeb.

What mashups should libraries be doing? First off, if you have multiple branches, create a Google Maps mashup of library locations. Share images of your collection on Flickr and pull that into your website (see Access Ceramics), letting Flickr do the heavy lifting of resizing the images and pulling content out via machine tags. Delicious provides many options for creating dynamically updating lists with code snippets to embed them in your website.
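For the Flickr piece, here is a hedged Python sketch that calls the Flickr API’s flickr.photos.search method to pull photos carrying a particular machine tag and builds thumbnail URLs you could embed on the library site. The API key and the machine-tag namespace are placeholders.

```python
# Fetch photos by machine tag from Flickr and print thumbnail URLs.
import requests

API_KEY = "YOUR_FLICKR_API_KEY"   # placeholder
params = {
    "method": "flickr.photos.search",
    "api_key": API_KEY,
    "machine_tags": "mylibrary:collection=ceramics",  # hypothetical namespace
    "format": "json",
    "nojsoncallback": 1,
    "per_page": 20,
}
resp = requests.get("https://api.flickr.com/services/rest/", params=params, timeout=30)

for p in resp.json()["photos"]["photo"]:
    # Standard Flickr photo-source URL pattern ("m" = small thumbnail)
    print(f"https://live.staticflickr.com/{p['server']}/{p['id']}_{p['secret']}_m.jpg")
```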

OPAC mashups require APIs, preferably ones that can generate JavaScript, and you’ll need a programmer if you can’t get the information out in a form you can easily use. LexisNexis Academic, WorldCat, and LibraryThing all have APIs you can use.

Ideas from librarians: Mash up travel data from circulation records and various travel sources to provide better patron services. Grab MARC location data to plot information on a map. Pull data about the media collection and combine it with IMDb and other resources. Build subject RSS feeds from all resources for current articles (you could do that already with a collection of journals with RSS feeds and Yahoo Pipes).

Links and more at her book website.

IL2009: Mobile Content Delivery in the Enterprise

Speaker: Britt Mueller

Often, there are more librarians whose organizations lend ebook readers to their users than librarians who own or use ebook readers themselves. Devices are driving all of the changes in the content, and we need to pay attention to that.

General Mills launched their ebook reader lending program in the fall of 2008 with six Kindles pre-loaded with content and attached to a purchasing card registered with each device. They’ve had over 120 loans over the past year with a wait list (two-week loan periods).

Qualcomm launched a similar program at around the same time, but they went with four types of ereaders (Kindle, Sony 505, Bookeen Cybook, and iRex iLiad). They’ve had over 500 loans over the past year with a wait list, and they’ve updated the devices with newer models as they were released.

One of the downsides to the devices is that there is no enterprise model. Users have to go through the vendor to get content, rather than getting the content directly from the library. Users liked the devices but wanted them to be customized to their individual preferences and yet still shareable, much as they borrow other devices like laptops and netbooks from the library.

There is a uniform concern among publishers and vendors about how to track and control usage in order to pay royalties, which makes untethering the content problematic. There is a lack of standardization in formats, which makes converting content to display on a wide range of devices problematic as well. And finally, the biggest stumbling block for libraries is the lack of an enterprise model for acquiring and sharing content on these devices.

Implications for the future: integration into the ILS, staff time to manage the program, cost, and eventually, moving away from lending devices and moving towards lending the content.

IL2009: Collaboration in the Clouds

Presenter: Tom Ipri

How will cloud computing impact the library as a space? Will we be able to provide the infrastructure to support collaborative computing within our buildings or resource networks?

Virtual computing labs allow students to access their software, settings, and files from any computer on campus. However, there are concerns about reliability, privacy, and the security of data. If you are sending your students to services outside of the university, what impacts are there on the policies of the university?

Who needs libraries when everything is in the cloud? The library can become both a warehouse and a gathering place.

IL2009: Cloud Computing in Practice: Creating Digital Services & Collections

Speakers: Amy Buckland, Kendra K. Levine, & Laura Harris (icanhaz.com/cloudylibs)

Cloud computing is a slightly complicated concept, and everyone approaches defining it from a different perspective. It’s about data and storage. For the purposes of this session, they mean any service that offers on-demand self-service, broad network access, resource pooling, rapid elasticity, and measured service.

Cloud computing frees people to collaborate in many ways. Infrastructure is messy, so let someone else take care of that so you can focus on what you really need to do. USB sticks can do a lot of that, but they’re easy to lose, and data in the cloud will hopefully be migrated to new formats.

The downside of cloud computing is that it is so dependent upon constant connection and uptime. If your cloud computing source or network goes down, you’re SOL until it gets fixed. Privacy can also be a legitimate concern, and the data could be vulnerable to hacking or leaks. Nothing lasts forever; today, for example, Geocities is closing.

Libraries are already in the cloud. We often store our ILS data, ILL, citation management, resource guides, institutional repositories, and electronic resource management tools on servers and services that do not live in the library. Should we be concerned about our vendors making money from us on a "recurring, perpetual basis" (Cory Doctorow)? Should we be concerned about losing the "face" of the library in all of these cloud services? Should we be concerned about the reliability of the services we are paying for?

Libraries can use the cloud for data storage (e.g. DuraSpace, Dropbox). They could also replace OS services & programs, allowing patron-access computers to be run using cloud applications.

Presentation slides are available at icanhaz.com/cloudylibs.

Speaker: Jason Clark

His library is using four applications to serve video from the library, and one of them is TerraPod, which is for students to create, upload, and distribute videos. They outsourced the player to Blip.tv. This way, they don’t have to encode files or develop a player.

The way you can do mashups of cloud applications and locally developed applications is through the APIs that define the rules for talking to the remote server. The cloud becomes the infrastructure that enables web-scaling of projects. Request the data, receive it in some sort of structured format, and then parse it out into whatever you want to do with it.
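A minimal, standard-library Python sketch of that request/parse pattern, with a placeholder endpoint and an assumed response shape:

```python
# Request data from a remote API, receive structured JSON, parse out
# the pieces you need.
import json
from urllib.request import urlopen

ENDPOINT = "https://api.example.org/videos.json"  # placeholder

with urlopen(ENDPOINT, timeout=30) as resp:
    data = json.load(resp)

for item in data.get("items", []):   # assumed response shape
    print(item.get("title"), item.get("url"))
```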

Best practices for cloud computing: let the cloud architecture do the heavy lifting (file conversion, storage, distribution, etc.), archive locally if you must, and outsource conversion. Don’t be afraid. This is the future.

Presentation slides will be available later on his website.
