NASIG 2013: Knowledge and Dignity in the Era of Big Data

CC BY 2.0 2013-06-10
“Big Data” by JD Hancock

Speaker: Siva Vaidhyanathan

Don’t try to write a book about fast-moving subjects.

He was trying to capture the nature of our relationship to Google. It provides us with services that are easy to use, fairly dependable, and well designed. However, that level of success can breed hubris. He was interested in how this drives the company to its audacious goals.

It strikes him that what Google claims to be doing is what librarians have been doing for hundreds of years already. He found himself turning to the core practices of librarians as a guideline for assessing Google.

Why is Google interested in so much stuff? What is the payoff to organizing the world’s information and making it accessible?

Big data is not a phrase that they use much, but the notion is there. More and faster equals better. Google is in the prediction/advertising business. The Google Books project is their attempt to reverse engineer the sentence. Knowing how sentences work, they can simulate how to interpret and create sentences, which would be a simulation of artificial intelligence.

The NSA’s deals that give them a backdoor to our data services create data insecurity, because if they can get in, so can the bad guys. Google keeps data about us (and has to turn it over when asked) because it benefits their business model, unlike libraries, which don’t keep patron records in order to protect patrons’ privacy.

Big data means more than a lot of data. It means that we have so many instruments to gather data: cheap/ubiquitous cameras and microphones, GPS devices that we carry with us, credit card records, and more. All of these ways of creating data feed into huge servers that can store it, with powerful algorithms that can analyze it. Despite all of this, there is no policy surrounding it, nor are there conversations about the best ways to manage it in light of the impact on personal privacy. There is no incentive to curb big data activities.

Scientists are generally trained to understand that correlation is not causation. We seem to be happy enough to draw pictures with correlation and move on to the next one. With big data, it is far too easy to stop at correlation. This is a potentially dangerous way of understanding human phenomena. We are autonomous people.

The panopticon was supposed to keep prisoners from misbehaving because they assumed they were always being watched. Foucault described the modern state in the 1970s as the panopticon. However, at this point, it doesn’t quite match. We have a cryptopticon, because we aren’t allowed to know when we are being watched. It wants us to be on our worst behavior. How can we inject transparency and objectivism into this cryptopticon?

Those who can manipulate the system will, but those who don’t know how or that it is happening will be negatively impacted. If bad credit can get you on the no-fly list, what else may be happening to people who make poor choices in one aspect of their lives that they don’t know will impact other aspects? There is no longer anonymity in our stupidity. Everything we do, or nearly so, is online. Mistakes of teenagers will have an impact on their adult lives in ways we’ve never experienced before. Our inability to forget renders us incapable of looking at things in context.

Mo Data, Mo Problems

NASIG 2013: Libraries and Mobile Technologies in the Age of the Visible College

“This morning’s audience, seen from the lectern.” by Bryan Alexander

Speaker: Bryan Alexander

NITLE does a lot of research for liberal arts undergraduate type schools. One of the things that he does is publish a monthly newsletter covering trends in higher education, which may be worth paying some attention to (Future Trends). He is not a librarian, but he is a library fanboy.

What is mobile computing doing to the world, and what will it do in the future?

Things have changed rapidly in recent years. We’ve gone from needing telephone rooms at hotels to having phones in every pocket. The icon for computing has gone from desktop to laptop to anything/nothing — computing is all around us in many forms now. The PC is still a useful tool, but there are now so many other devices to do so many other things.

Smartphones are everywhere now, in many forms. We use them for content delivery and capture, and to interact with others through social tools. Over half of Americans now have a smartphone, with less than 10% remaining who have no cell phone, according to Pew. The mobile phone is now the primary communication device for the world. Think about this when you are developing publishing platforms.

The success of the Kindle laid the groundwork for the iPad. Netbooks/laptops now range in size and function.

Clickers are used extensively in the classroom, with great success. They can be used for feedback as well as prompting discussion. They are slowly shifting to using phones instead of separate devices.

Smartpens capture written content digitally as you write it, and you can record audio at the same time. One professor annotates notes on scripts while his students perform, and then provides them with the audio.

Marker-based augmented reality fumbled for a while in the US, but is starting to pick up in popularity. Now that more people have smartphones, QR codes are more prevalent.

The mouse and keyboard have been around since the 1960s, and they are being dramatically impacted by recent changes in technology: touch screens (e.g. iPad), handhelds (e.g. Wii), and nothing at all (e.g. Kinect).

If the federal government is using it, it is no longer bleeding edge. Ebooks have been around for a long time, in all sorts of formats. Some of the advantages of ebooks include ease of correcting errors, flexible presentation (e.g. font size), and a faster publication cycle. Some disadvantages include DRM, cost, and distribution by libraries.

Gaming has had a huge impact in the past few years. The median age of gamers is 35 or so. The industry size is comparable to music, and has impacts on hardware, software, interfaces, and other industries. There is a large and growing diversity of platforms, topics, genres, niches, and players.

Mobile devices let us make more microcontent (photo, video clip, text file), which leads to the problem of archiving all this stuff. These devices allow us to cover the world with a secondary layer of information. We love connecting with people, and rather than separating us, technology has allowed us to do that even more (except when we focus on our devices more than the people in front of us).

We’re now in a world of information on demand, although it’s not universal. Coverage is spreading, and the gaps are getting smaller.

When it comes to technology, Americans are either utopian or dystopian in our reactions. We’re not living in a middle ground very often. There are some things we don’t understand about our devices, such as multitasking and how that impacts our brain. There is also a generational divide, with our children being more immersed in technology than we are, and having different norms about using devices in social and professional settings.

The ARIS engine allows academics to build games with learning outcomes.

Augmented reality takes data and pins it down to the real world. It’s the inverse of virtual reality. Libraries are going to be the AR engine of the future. Some examples of AR include museum tours, GPS navigators, and location services (Yelp, Foursquare). Beyond that, there are applications that provide data overlaying images of what you point your phone at, such as real estate information and annotations. Google Goggles tries to provide information about objects based on images taken by a mobile device. You could have a virtual art gallery physically tied to a spot, but only displayed when viewed with an app on your phone.

Imagine what the world will be like transformed by the technology he’s been talking about.

I. Phantom Learning: Schools are rare and less needed. The number of people physically enrolled in schools has gone down. Learning on demand is now the thing. Institutions exist to supplement content (adjuncts), and libraries are the media production sites. Students are used to online classes, and un-augmented locations are weird.

II. Open World: Open content is the norm and is very web-centric. Global conversations increase, with more access and more creativity. Print publishers are nearly gone, authorship is mysterious, tons of malware, and privacy is fictitious. The internet has always been open and has never been about money. Identities have always been fictional.

III. Silo World: Most information is experienced in vertical stacks. Open content is almost like public access TV. Intellectual property intensifies, and campuses reorganize around the silos. Students identify with brands and think of “open” as radical and old-fashioned.

ER&L 2013: E-Resources, E-Realities

“Tools” by Josep Ma. Rosell

Speakers: Jennifer Bazeley (Miami University) & Nancy Beals (Wayne State University)

Despite all the research on what we need/want, no one is building commercial products that meet all our needs and address the impediments of cost and dwindling staff.

Beals says that the ERM is not used for workflow, so they needed other tools, with a priority on project management and Excel proficiency. They use an internal listserv, UKSG Transfer, Trello (project management software), and a blog, to keep track of changes in eresources.

Other tools for professional productivity and collaboration: iPads with Remember the Milk or Evernote, Google spreadsheets (project portfolio management organization-wide), and LibGuides.

Bazeley stepped into the role of organizing eresources information in 2009, with no existing tool or hub, which gave her room to experiment. For documentation, they use PBWiki (good for version tracking, particularly to correct errors) with an embedded departmental Google calendar. For communication, they use LibGuides for internal documents, where you can embed RSS, Google Docs, Yahoo Pipes aggregating RSS feeds, Google forms for eresource access issues, links to Google spreadsheets with usage data, etc. For login information, they use KeePass Password Safe. Rather than claiming in the ILS, they’ve moved to using the claim checker tool from the subscription agent.

Tools covered:

  • Google Calendar
  • Google Docs (includes forms & spreadsheets)
  • PBWiki
  • LibGuides
  • Yahoo Pipes
  • WordPress
  • KeePass Password Safe
  • PDF Creator
  • EBSCOnet

Others listed:

  • Blogger (blog software)
  • Mendeley (ref manager)
  • Vimeo (videos)
  • Jing (screenshot/screencast)
  • GIMP (image editor)
  • MediaWiki (Wiki software)
  • LastPass (password manager)
  • OpenOffice (software suite)
  • PDF Creator (PDF manipulation)
  • Slideshare (presentation manager)
  • Filezilla (ftp software)
  • Zoho Creator (database software)
  • Dropbox (cloud storage)
  • Github (software management)
  • Subscription agent software (SwetsWise, EBSCOnet)
  • Microsoft Excel / Access
  • Course Management Software (Moodle, Sakai, Blackboard)
  • Open Source ERMS: ERMes (University of Wisconsin-La Crosse) & CORAL (University of Notre Dame)

Moving Up to the Cloud, a panel lecture hosted by the VCU Libraries

“Sky symphony” by Kevin Dooley

“Educational Utility Computing: Perspectives on .edu and the Cloud”
Mark Ryland, Chief Solutions Architect at Amazon Web Services

AWS has been a part of revolutionizing the start-up industries (e.g. Instagram, Pinterest) because they don’t have the cost of building server infrastructures in-house. Cloud computing in the AWS sense is utility computing: pay for what you use, easy to scale up and down, and local control of how your products work. In the traditional world, you have to pay for the capacity to meet your peak demand, but in the cloud computing world, you can scale up and down based on what is needed at that moment.
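To make the peak-versus-utility point concrete, here is a minimal sketch of the arithmetic. The demand figures, the rate, and the function names are all made up for illustration and were not part of the talk. It also assumes the same unit price in both models, which real pricing would not guarantee.

```python
def peak_provisioned_cost(hourly_demand, rate_per_unit_hour):
    """Traditional model: capacity is fixed at the peak for every hour."""
    return max(hourly_demand) * len(hourly_demand) * rate_per_unit_hour


def utility_cost(hourly_demand, rate_per_unit_hour):
    """Utility model: pay only for the units actually used each hour."""
    return sum(hourly_demand) * rate_per_unit_hour


# Hypothetical demand: servers needed in each of six hours, with one spike.
demand = [2, 2, 3, 10, 4, 2]
rate = 1.0  # hypothetical cost per server-hour

fixed = peak_provisioned_cost(demand, rate)
elastic = utility_cost(demand, rate)
print(f"provisioned for peak: {fixed}, pay for use: {elastic}")
```

The spikier the demand, the bigger the gap between the two numbers; with perfectly flat demand they would be equal.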

There are economies and efficiencies of scale in many ways. Some are obvious: storage, computing, and networking equipment supply chain; internet connectivity and electric power; and data center siting, redundancy, etc. Less obvious: security and compliance best practices; datacenter internal innovations in networking, power, etc.

AWS and .EDU: EdX, Coursera, Texas Digital Library, Berkeley AMP Lab, Harvard Medical, University of Phoenix, and an increasing number of university/school public-facing websites.

Expects that we are heading toward cloud computing utilities to function much like the electric grid — just plug in and use it.


“Libraries in Transition”
Marshall Breeding, library systems expert

We’ve already seen the shift of print to electronic in academic journals, and we’re heading that way with books. Our users are changing in the way they expect interactions with libraries to be, and the library as space is evolving to meet that, along with library systems.

Web-based computing is better than client/server computing. We expect social computing to be integrated into the core infrastructure of a service, rather than add-ons and afterthoughts. Systems need to be flexible for all kinds of devices, not just particular types of desktops. Metadata needs to evolve from record-by-record creation to bulk management wherever possible. MARC is going to die, and die soon.

How are we going to help our researchers manage data? We need the infrastructure to help us with that as well. Semantic web — what systems will support it?

Cooperation and consolidation of library consortia; state-wide implementations of SaaS library systems. Our current legacy ILS are holding libraries back from being able to move forward and provide the services our users want and need.

A true cloud computing system comes with web-based interfaces, externally hosted, subscription OR utility pricing, highly abstracted computing model, provisioned on demand, scaled according to variable needs, elastic.


“Moving Up to the Cloud”
Mark Triest, President of Ex Libris North America

Currently, libraries are working with several different systems (ILS, ERMS, DRs, etc.), duplicating data and workflows, and not always very accurately or efficiently, but it was the only solution for handling different kinds of data and needs. Ex Libris started in 2007 to change this, beginning with conversations with librarians. Their solution is a single system with unified data and workflows.

They are working to lower the total cost of ownership by reducing IT needs, minimizing administration time, and adding new services to increase productivity. Right now there are 120+ institutions world-wide that are in the process of going live or have gone live with Alma.

Automated workflows allow staff to focus on the exceptions and reduce the steps involved.

Descriptive analytics are built into the system, with plans for predictive analytics to be incorporated in the future.

Future: collaborative collection development tools, like joint licensing and consortial ebook programs; infrastructure for ad-hoc collaboration


“Cloud Computing and Academic Libraries: Promise and Risk”
John Ulmschneider, Dean of Libraries at VCU

When they first looked at Alma, they had two motivations and two concerns. They were not planning or thinking about it until they were approached to join the early adopters. All academic libraries today are seeking to discover and exploit new efficiencies. The growth of cloud-resident systems and data requires academic libraries to reinvigorate their focus on core mission. Cloud-resident systems are creating massive change throughout our institutions. Managing and exploiting pervasive change is a serious challenge. Also, we need to deal with security and durability of data.

Cloud solutions shift resources from supporting infrastructure to supporting innovation.

Efficiencies are not just nice things, they are absolutely necessary for academic libraries. We are obligated to upend long-held practice, if in doing so we gain assets for practice essential to our mission. We must focus recovered assets on the core library mission.

Agility is the new stability.

Libraries must push technology forward in areas that advance their core mission. Infuse technology evolution for libraries with the values and needs of libraries. Libraries must invest assets as developers, development partners, and early adopters. Insist on discovery and management tools that are agnostic regarding data sources.

Managing the change process is daunting, but we’re already well down the road. It’s not entirely new, but it does involve a change in culture to create a pervasive institutional agility for all staff.

food blogging & making things so labor intensive I don’t do them

derby pie

I started a food blog on Tumblr last January. Here’s the about statement:

I started this project because after a year of taking photos of myself every day, I wanted to document something else. Over the summer and fall, I had developed a routine of trying new recipes on the weekends and some weeknights. This blog is where I share photos of the results, talk about what went right or wrong, and link to the recipes.

And sometime in May/June, I stopped. I got busy. I remembered to take some pictures, but they sat on my desktop waiting to be blogged for so long that I felt guilty and overwhelmed, so I eventually deleted them.

It wasn’t like it would take all that much time to write up something. And add a link. And format it the same as the previous posts. But it seemed like a big deal at the time.

Also, I stopped cooking/baking as much in the summer.

I have this tendency to make things that should be simple and routine into complex, detailed processes that become burdensome. Is this just some freak aspect of my desire for control and order, or is it simply human nature?

IL 2012: The Next Big Thing

“Moving on” by Craig Allen

Speakers: Dave Hesse & Brian Pichman

They used a Lazer Tag-like system to set up “Hunger Games” nights in the library. They also used a bunch of interactive tech toys for different kinds of game nights.

They’re mounting tablets as shelf labels that show the range in sleep mode, but when activated will display reviews and other information about books in the range, as well as other interactive multimedia.

Speaker: Sarah Houghton

Cutting stuff. Cutting lots of things out of the budget, services, etc. All of these things we learn about take time and money, and we can’t do all of them. She’s making everyone in her library earn their pet program. It has to show some sort of ROI (not specifically financial). Make business decisions about what we do and why.

Q: What did you cut that you didn’t want to?
A: Magnatune deal: really wanted to do it, but didn’t have the staff time and had a negative amount of money to dedicate to anything.

Speaker: Ben Bizzle

We are doing a really poor job of marketing ourselves to our communities, and we’re wasting money on old methods and tools to do it. There are more cost-effective ways to do this, particularly for public libraries. Facebook is a really cost-effective way to market to your community over and over again, and running ads to get people in your community to like your Facebook page has been shown to be very effective. Be part of the stream without being disruptive. Facebook events invitations are disruptive and ineffective.

Next big things from the audience:

  • Would like to have a better way to provide remote authentication for users from anywhere, regardless of the speed of the connection (e.g. 3G mobile phone or a hotel wireless connection).
  • Focusing on programming that brings the Spanish-speaking and English-speaking communities together.
  • Integrating local self-published creators’ content within the rest of the library’s electronic content.
  • Trying to find better metrics to measure success for ROI.
  • Developing community investors from FOL and active volunteers.
  • Giving up paper flyers/posters and moving to digital.
  • Moving social media effort to marketing department.
  • Looking at duplicate efforts and winnowing them down.
  • Learning how to code.
  • Hiring part-time and hiring non-librarians.
  • FRBR. RDA. Say no more.
  • Advocacy. Facetime with politicians and other sources of funding.
  • Would like to hear more from public libraries on ‘bring your own device’ initiatives that could be applied in the academic library setting.
  • Gamification of library resources and services.
  • Wikipedia – we should be creating more content there.
  • Better relationships with publishers.
  • The next level of life-long learning like Coursera and making the library a hub for it.
  • Downloadable database of music by local musicians.
  • Copyright, curations, folksonomies, and other issues of creating communities.
  • Podcasting.
  • Digitization projects that engage specific communities.
  • Keeping my head above water. Migrating to a more self-service model while maintaining a high level of service.
  • Moving to a new ILS. Proprietary or open source?
  • Reaching out to atypical non-users. Running ads in local for-sale magazines.
  • Lock-in gaming nights.

IL 2012: Discovery Systems

“Space Shuttle Discovery Landing At Washington DC” by Glyn Lowe

Speaker: Bob Fernekes

The Gang of Four: Google, Apple, Amazon, & Facebook

Google tends to acquire companies to grow its capabilities. We all know about Apple. Amazon sells more ebooks than print books now. Facebook is… yeah. That.

And then we jump to selecting a discovery service. You would do that in order to make the best use of the licensed content. This guy’s library did a soft launch in the past year of the discovery service they chose, and it’s had an impact on the instruction and tools (e.g. search boxes) he uses.

And I kind of lost track of what he was talking about, in part because he jumped from one thing to the next, without much of a transition or connection. I think there was something about usability studies after they implemented it, although they seemed to focus on more than just the discovery service.

Speaker: Alison Steinberg Gurganus

Why choose a discovery system? You probably already know. Students lack search skills, but they know how to search, so we need to give them something that will help them navigate the proprietary stuff we offer out on the web.

The problem with the discovery systems is that they are very proprietary. They don’t quite play fairly or nicely with competitors’ content yet.

Our users need to be able to evaluate, but they also need to find the stuff in the first place. A great discovery service should be self-explanatory, but we don’t have that yet.

We have students who understand Google, which connects them to all the information and media they want. We need something like that for our library resources.

When they were implementing the discovery tool, they wanted to make incremental changes to the website to direct users to it. They went from two columns, with the left column being text links to categories of library resources and services, to three columns, with the discovery search box in the middle column.

When they were customizing the look of the discovery search results, they changed the titles of items to red (from blue). She notes that users tend to ignore the outside columns because that’s where Google puts advertisements, so they are looking at ways to make that information more visible.

I also get the impression that she doesn’t really understand how a discovery service works or what it’s supposed to do.

Speaker: Athena Hoeppner

Hypothesis: discovery includes sufficient content of high enough quality, with full text, and …. (didn’t type fast enough).

Looked at final papers from a PhD level course (34), specifically the methodology section and bibliography. Searched for each item in the discovery search as well as one general aggregator database and two subject-specific databases. The works cited were predominately articles, with a significant number of web sources that were not available through library resources. She was able to find more citations in the discovery search than in Google Scholar or any of the other library databases.

Clearly the discovery search was sufficient for finding the content they needed. Then they used a satisfaction survey of the same students that covered familiarity and frequency of use for the subject indexes, discovery search, and Google Scholar. Ultimately, it turned out that the students were satisfied and happy with the subject indexes, but there were too few respondents to get a sense of satisfaction with the discovery search or Google Scholar.

Conclusions: Students are unfamiliar with the discovery system, but it could support their research needs. However, we don’t know if they can find the things they are looking for in it (search skills), nor do we know if they will ultimately be happy with it.

NASIG 2012: Mobile Websites and APP’s in Academic Libraries Harmony on a Small Scale

Speaker: Kathryn Johns-Masten, State University of New York Oswego

About half of American adults have smart phones now. Readers of e-books tend to read more frequently than others. They may not be reading more academic material, but they are out there reading.

SUNY Oswego hasn’t implemented a mobile site, but the library really wanted one, so they’ve created their own using the iWebKit from MIT.

Once they began the process of creating the site, they had many conversations about who they were targeting and what they expected to be used in a mobile setting. They were very selective about which resources were included, and considered how functional each tool was in that setting. They ended up with library hours, contact, mobile databases, catalog, ILL article retrieval (ILLiad), ask a librarian, Facebook, and Twitter (in that order).

When developing a mobile site, start small and enhance as you see the need. Test functionality (pull together users of all types of devices at the same time, because one fix might break another), review your usage statistics, and talk to your users. Tell your users that it’s there!

Tools for designing your mobile site: MobiReady, Squeezer, Google Mobile Site Builder, Springshare Mobile Site Builder, Boopsie, Zinadoo, iWebKit, etc.

Other things related to library mobile access… Foursquare! The library has a cheat sheet for answers to the things freshmen are required to find on campus, so maybe they could use Foursquare to help with this. Tula Rosa Public Library used a screen capture of Google Maps to help users find their new location. QR codes could link to ask a librarian, book displays linked to reviews, social media, events, scavenger hunts, etc. Could use them to link sheet music to streaming recordings.

NASIG 2012: Why the Internet is More Attractive Than the Library

Speaker: Dr. Lynn Silipigni Connaway, OCLC

Students, particularly undergraduates, find Google search results to make more sense than library database search results. In the past, these kinds of users had to work around our services, but now we need to make our resources fit their workflow.

Connaway has tried to compare 12 different user behavior studies in the UK and the US to draw some broad conclusions, and this has informed her talk today.

Convenience is number one, and it changes. Context and situation are very important, and we need to remember that when asking questions about our users. Sometimes they just want the answer, not instruction on how to do the research.

Most people power browse these days: scan small chunks of information, view first few pages, no real reading. They combine this with squirreling — short, basic searches and saving the content for later use.

Students prefer keyword searches. This is supported by looking at the kinds of terms used in the search. Experts use broad terms to cover all possible indexing; novices use specific terms. So why do we keep trying to get them to use the “advanced” search in our resources?

Students are confident with information discovery tools. They mainly use their common sense for determining the credibility of a site. If a site appears to have put some time into the presentation, then they are more likely to believe it.

Students are frustrated with navigating library websites and with the inconvenience of communicating with librarians face to face, and they tend to associate libraries only with books, not with other information. They don’t recognize that the library is what provides them with access to online content like JSTOR and the things they find in Google Scholar.

Students and faculty often don’t realize they can ask a question of a librarian in person because we look “busy” staring at our screens at the desk.

Researchers don’t understand copyright, or what they have signed away. They tend to be self-taught in discovery, picking up the same patterns as their graduate professors. Sometimes they rely on the students to tell them about newer ways of finding information.

Researchers get frustrated with the lack of access to electronic backfiles of journals, the difficulty of discovering non-English content, and unavailable content in search results (dead links, access limitations). Humanities researchers feel like there is a lack of good, specialized search engines for them (most are for science). They get frustrated when they go to the library because of poor usability (e.g. signs) and a lack of integration between resources.

Access is more important than discovery. They want a seamless transition from discovery to access, without a bunch of authentication barriers.

We should be improving our OPACs. Take a look at Trove and Westerville Public Library. We need to think more like startups.

tl;dr – everything you’ve heard or read about what our users really do and really need, but we still haven’t addressed in the tools and services we offer to them