CIL 2010: Library Engagement Through Open Data

Speakers: Oleg Kreymer & Dan Lipcan

Library data is meaningless in and of itself – you need to interpret it to give it meaning. Piotr Adamczyk did much of the work for the presentation, but was not able to attend today due to a schedule conflict.

They created the visual dashboard for several reasons, including a desire to expose the large quantities of data they have collected and stored in a way that is interesting and explanatory. It’s also a handy PR tool for promoting the library to benefactors and to administrators, who are often unaware of where and how the library is being effective or of the trends in its use. Finally, the data can be targeted to the general public in ways that catch their attention.

The dashboard should also address assessment goals within the library. Data visualization allows us to identify and act upon anomalies. Some visualizations are complex, and you should be sensitive to how you present them.

The ILS is a great source of circulation and collections data. Other statistics can come from the data collected by various library departments, often in spreadsheet format. Google Analytics can capture the search terms used in catalog searches as well as site traffic data. Download and search statistics from e-resource vendors can be massaged and turned into data visualizations.

The free tools they used included IMA Dashboard (local software, Drupal Profile) and IBM Many Eyes and Google Charts (cloud software). The IMA Dashboard takes snapshots of data and publishes them. It’s more of a PR tool.

Many Eyes is a hosted collection of data sets with visualization options. One thing I liked was that they used Google Analytics to gather the search terms used on the website and presented them as a word cloud. You could probably do the same with the titles of the pages in a page hit report.
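A word cloud is really just a picture of word counts, so the data preparation is simple. As a rough sketch (not the speakers' method), this Python snippet counts words across a list of search queries. The sample queries, the stopword list, and the function name are my own assumptions; in practice the queries would come from an analytics export.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not a standard one).
STOPWORDS = {"the", "a", "an", "of", "and", "in", "for", "to"}

def term_frequencies(queries):
    """Count how often each non-stopword appears across the queries."""
    counts = Counter()
    for query in queries:
        # Lowercase so "Art" and "art" are counted together.
        for word in re.findall(r"[a-z0-9']+", query.lower()):
            if word not in STOPWORDS:
                counts[word] += 1
    return counts

# Made-up sample queries standing in for an analytics export.
sample_queries = ["history of art", "Art History", "renaissance art", "the art of war"]
freqs = term_frequencies(sample_queries)

# Print "word<TAB>count", most frequent first.
for word, count in freqs.most_common():
    print(f"{word}\t{count}")
```

The same function would work on page titles from a page hit report; only the input list changes.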

Google Chart Tools are visualizations created by Google and others, and they use Google Spreadsheets to store and retrieve the data. The motion charts are great for showing how data changes over time.

Lessons learned… Get administrative support. Identify your target audience(s). Identify the stories you want to tell. Be prepared for spending a lot of time manipulating the data (make sure it’s worth the time). Use a shared repository for the data documents. Pull from data your colleagues are already harvesting. Try, try, and try again.