NASIG 2013: Knowledge and Dignity in the Era of Big Data

CC BY 2.0 2013-06-10
“Big Data” by JD Hancock

Speaker: Siva Vaidhyanathan

Don’t try to write a book about fast moving subjects.

He was trying to capture the nature of our relationship to Google. It provides us with a services that are easy to use, fairly dependable, and well designed. However, that level of success can breed hubris. He was interested in how this drives the company to its audacious goals.

It strikes him that what Google claims to be doing is what librarians have been doing for hundreds of years already. He found himself turning to the core practices of librarians as a guideline for assessing Google.

Why is Google interested in so much stuff? What is the payoff to organizing the world’s information and making it accessible?

Big data is not a phrase that they use much, but the notion is there. More and faster equals better. Google is in the prediction/advertising business. The Google books project is their attempt to reverse engineer the sentence. Knowing how sentences work, they can simulate how to interpret and create sentences, which would be a simulation of artificial intelligence.

The NSA’s deals that give them a backdoor to our data services creates data insecurity, because if they can get in, so can the bad guys. Google keeps data about us (and has to turn it over when asked) because it benefits their business model, unlike libraries who don’t keep patron records in order to protect their privacy.

Big data means more than a lot of data. It means that we have so many instruments to gather data, cheap/ubiquitous cameras and microphones, GPS devices that we carry with us, credit card records, and more. All of these ways of creating feed into huge servers that can store the data with powerful algorithms that can analyze it. Despite all of this, there is no policy surrounding this, nor conversations about best ways to manage this in light of the impact on personal privacy. There is no incentive to curb big data activities.

Scientists are generally trained to understand that correlation is not causation. We seem to be happy enough to draw pictures with correlation and move on to the next one. With big data, it is far too easy to stop at correlation. This is a potentially dangerous way of understanding human phenomenon. We are autonomous people.

The panopticon was supposed to keep prisoners from misbehaving because they assumed they were always being watched. Foucault described the modern state in the 1970s as the panopticon. However, at this point, it doesn’t quite match. We have a cryptopticon, because we aren’t allowed to know when we are being watched. It wants us to be on our worst behavior. How can we inject transparency and objectivism into this cryptopticon?

Those who can manipulate the system will, but those who don’t know how or that it is happening will be negatively impacted. If bad credit can get you on the no-fly list, what else may be happening to people who make poor choices in one aspect of their lives that they don’t know will impact other aspects? There is no longer anonymity in our stupidity. Everything we do, or nearly so, is online. Mistakes of teenagers will have an impact on their adult lives in ways we’ve never experienced before. Our inability to forget renders us incapable of looking at things in context.

Mo Data, Mo Problems

IL 2012: Sensible Library Website Development

Jakob Lodwick
“Jakob Lodwick” by Zach Klein

Speaker: Amanda Etches

Asked some folks on Twitter why their library has a website. A few of the responses: to link to online resources, to allow access to the catalog, to support research needs, to provide access to resources & services, to teach, to help, to provide access to account function, to post events, to post policies & hours, it’s the primary way our patrons interact with us, and as a two-way communication tool between the library and the community they serve. Audience member noted that marketing your library is missing.

While we are all unique little snowflakes, we aren’t all that unique in our motivations for having a library website. So, how can we learn from each other?

Website planning needs to have a clear understanding of scope. Since most of us have a website, this talk will focus more on redesign than from building from scratch. Most people tend to skip the scoping step when doing a redesign because we assume that it will cover the same stuff we already have.

Sadly, most libraries are like a big, messy junk drawer of stuff. We tend to take a “just in case” approach to designing sites. Less is not more, less is actually less, and that’s a good thing. Consider the signal to noise ratio of your website. What users don’t need is too much noise drowning out the signal. Pay attention to how much you are putting on the site that meets your needs rather than your user’s needs. It’s better for half of your website to be amazing than all of it to be bland.

Think about your website like a pyramid, where the bottom half is the basics, followed by destination information, then participatory components, and finally a community portal. Think of it like Maslow’s hierarchy of needs — the basic stuff has to be good or you can’t get to the participatory level.

Etches and some colleagues created a website experiment that is an entire library site on one page called the One-Pager. Freehold Public Library has taken this and ran with it, if you want to see it working in the real world.

Designing for mobility requires you to pare back to what you consider to be essential functionality, and a great way to help scope your website. If you wouldn’t put it on your mobile version, think about why you should put it on your desktop website. Recommend the book Mobile First as an inspiration for scope.

How do you determine critical tasks of a website? As your users. A simple one-page survey, interviews, focus groups, and heat maps. Asking staff is the least useful way to do it.

Web users don’t read content, they skim/scan it. People don’t want to read your website; they want to find information on it. When writing copy for your website, pare it down, and then pare it down again. Your website should be your FAQ, not your junk drawer. Think about your website as bite-sized chunks of information, not documentation. Adopt the inverted pyramid style for writing copy. If you have a lot of text, bold key concepts to catch skimming eyes. Eye-catching headers work well in conjunction with the inverted pyramid and bolded key concepts.

Treat your website like a conversation between you and your users/audience. Pages not be written by passive voiced writers. Write in the active voice, all of the time, every time. Library = we; User = you

It is not easy to redo the navigation on a website. Bad navigation makes you think, good navigation is virtually invisible. Navigation needs to serve the purposes of telling the user: site name, page name, where they are, whey they can go, and how they can search. Salt Lake City Public Library and Vancouver Public Library do this very well, if you want some real-world examples.

It’s very important to match navigation labels to page names. Also keep in mind that your navigation is not your org chart, so don’t design navigation along that. Do not, ever (and I’m surprised we still have to talk about this 15 years after I learned it), use “click here”. Links should be descriptive.

Why test websites at all? A lack of information is at the root of all bad design decisions. Usability testing runs the gamut from short & easy to long & hard. Watch people use your site. It can take just five minutes to do that.

We are not our patrons, so don’t test librarians and library staff. They are also not your primary user group and not the ones you need to worry about the most. Five testers are usually enough for any given test, more than that and you’ll get repetition. No test is too small; don’t test more than three things at once. Make iterative changes as you go along. Test early and often. The best websites do iterative changes over time based on constant testing.

Have a script when you are testing. You want to ensure that all testers receive the same instructions and makes it a little more comfortable for the test giver. Provide testers with an outline of what they will be doing, and also give them a paper list of tasks they will be doing. Remind them that they aren’t the ones being tested, the website is. Don’t tell them where to go and what to do (i.e. “search a library database for an article on x topic”).

From Q&A section:
All of your navigation items should be in one place and consistent across the site.

What do you do when use and usability says that you should remove a page a librarian is keen to keep? One suggestion is to put it in a LibGuide. Then LibGuides become the junk drawer. One way to keep that from happening is to standardizing the look and feel of LibGuides.

For policies, you could put a summary on the website and then link to the full document.

IL 2012: Discovery Systems

Space Shuttle Discovery Landing At Washington DC
“Space Shuttle Discovery Landing At Washington DC” by Glyn Lowe

Speaker: Bob Fernekes

The Gang of Four: Google, Apple, Amazon, & Facebook

Google tends to acquire companies to grow the capabilities of it. We all know about Apple. Amazon sells more ebooks than print books now. Facebook is… yeah. That.

And then we jump to selecting a discovery service. You would do that in order to make the best use of the licensed content. This guy’s library did a soft launch in the past year of the discovery service they chose, and it’s had an impact on the instruction and tools (i.e. search boxes) he uses.

And I kind of lost track of what he was talking about, in part because he jumped from one thing to the next, without much of a transition or connection. I think there was something about usability studies after they implemented it, although they seemed to focus on more than just the discovery service.

Speaker: Alison Steinberg Gurganus

Why choose a discovery system? You probably already know. Students lack search skills, but they know how to search, so we need to give them something that will help them navigate the proprietary stuff we offer out on the web.

The problem with the discovery systems is that they are very proprietary. They don’t quite play fairly or nicely with competitor’s content yet.

Our users need to be able to evaluate, but they also need to find the stuff in the first place. A great discovery service should be self-explanatory, but we don’t have that yet.

We have students who understand Google, which connects them to all the information and media they want. We need something like that for our library resources.

When they were implementing the discovery tool, they wanted to make incremental changes to the website to direct users to it. They went from two columns, with the left column being text links to categories of library resources and services, to three columns, with the discover search box in the middle column.

When they were customizing the look of the discovery search results, they changed the titles of items to red (from blue). She notes that users tend to ignore the outside columns because that’s where Google puts advertisements, so they are looking at ways to make that information more visible.

I also get the impression that she doesn’t really understand how a discovery service works or what it’s supposed to do.

Speaker: Athena Hoeppner

Hypothesis: discovery includes sufficient content of high enough quality, with full text, and …. (didn’t type fast enough).

Looked at final papers from a PhD level course (34), specifically the methodology section and bibliography. Searched for each item in the discovery search as well as one general aggregator database and two subject-specific databases. The works cited were predominately articles, with a significant number of web sources that were not available through library resources. She was able to find more citations in the discovery search than in Google Scholar or any of the other library databases.

Clearly the discovery search was sufficient for finding the content they needed. Then they used a satisfaction survey of the same students that covered familiarity and frequency of use for the subject indexes, discovery search, and Google Scholar. Ultimately, it came down that the students were satisfied and happy with the subject indexes, and too few respondents to get a sense of satisfaction with the discovery search or Google Scholar.

Conclusions: Students are unfamiliar with the discovery system, but it could support their research needs. However, we don’t know if they can find the things they are looking for in it (search skills), nor do we know if they will ultimately be happy with it.

NASIG 2012: Why the Internet is More Attractive Than the Library

Speaker: Dr. Lynn Silipigni Connaway, OCLC

Students, particularly undergraduates, find Google search results to make more sense than library database search results. In the past, these kinds of users had to work around our services, but now we need to make our resources fit their workflow.

Connaway has tried to compare 12 different user behavior studies in the UK and the US to draw some broad conclusions, and this has informed her talk today.

Convenience is number one, and it changes. Context and situation are very important, and we need to remember that when asking questions about our users. Sometimes they just want the answer, not instruction on how to do the research.

Most people power browse these days: scan small chunks of information, view first few pages, no real reading. They combine this with squirreling — short, basic searches and saving the content for later use.

Students prefer keyword searches. This is supported by looking at the kinds of terms used in the search. Experts use broad terms to cover all possible indexing, novices use specific terms. So why do we keep trying to get them to use the “advance” search in our resources?

Students are confident with information discovery tools. They mainly use their common sense for determining the credibility of a site. If a site appears to have put some time into the presentation, then they are more likely to believe it.

Students are frustrated with navigating library websites, the inconvenience of communicating with librarians face to face, and they tend to associate libraries only with books, not with other information. They don’t recognize that the library is who is providing them with access to online content like JSTOR and the things they find in Google Scholar.

Students and faculty often don’t realize they can ask a question of a librarian in person because we look “busy” staring at our screens at the desk.

Researchers don’t understand copyright, or what they have signed away. They tend to be self-taught in discovery, picking up the same patterns as their graduate professors. Sometimes they rely on the students to tell them about newer ways of finding information.

Researchers get frustrated with the lack of access to electronic backfiles of journals, discovering non-English content, and unavailable content in search results (dead links, access limitation). Humanities researchers feel like there is a lack of good, specialized search engines for them (mostly for science). They get frustrated when they go to the library because of poor usability (i.e. signs) and a lack of integration between resources.

Access is more important than discovery. They want a seamless transition from discovery to access, without a bunch of authentication barriers.

We should be improving our OPACs. Take a look at Trove and Westerville Public Library. We need to think more like startups.

tl;dr – everything you’ve heard or read about what our users really do and really need, but we still haven’t addressed in the tools and services we offer to them

NASIG 2009: Ambient Findability

Libraries, Serials, and the Internet of Things

Presenter: Peter Morville

He’s a librarian that fell in love with the web and moved into working with information architecture. When he first wrote the book Information Architecture, he and his co-author didn’t include a definition of information architecture. With the second edition, they had four definitions: the structural design of shared information environments; the combination of organization, labeling, search, and navigation systems in webs sites and intranet; the art and science of shaping information products and experiences to support usability and finadability; an emerging discipline and community of practice focused on bringing principles of designing and architecture to the digital landscape.

[at this point, my computer crashed, losing all the lovely notes I had taken so far]

Information systems need to use a combination of categories (paying attention to audience and taxonomy), in-text linking, and alphabetical indexes in order to make information findable. We need to start thinking about the information systems of the future. If we examine the trends through findability, we might have a different perspective. What are all the different ways someone might find ____? How do we describe it to make it more findable?

We are drowning in information. We are suffering from information anxiety. Nobel Laureate Economist Herbert Simon said, “A wealth of information creates a poverty of attention.”

Ambient devices are alternate interfaces that bring information to our attention, and Moreville thinks this is a direction that our information systems are moving towards. What can we now do when our devices know where we are? Now that we can do it, how do we want to use it, and in what contexts?

What are our high-value objects, and is it important to make them more findable? RFID can be used to track important but easily hidden physical items, such as wheelchairs in a hospital. What else can we do with it besides inventory books?

In a world where for every object there are thousands of similar objects, how do we describe the uniqueness of each one? Who’s going to do it? Not Microsoft, and not Donald Rumsfeld and his unknown unknown. It’s librarians, of course. Nowadays, metadata is everywhere, turning everyone who creates it into librarians and information architects of sorts.

One of the challenges we have is determine what aspects of our information systems can evolve quickly and what aspects need more time.

In five to ten years from now, we’ll still be starting by entering a keyword or two into a box and hitting “go.” This model is ubiquitous and it works because it acknowledges human psychology of just wanting to get started. Search is not just about the software. It’s a complex, adaptive system that requires us to understand our users so that they not only get started, but also know how to take the next step once they get there.

Some example of best and worse practices for search are on his Flickr. Some user-suggested improvements to search are auto-compete search terms, suggested links or best bets, and for libraries, federated search helps users know where to begin. Faceted navigation goes hand in hand with federated search, which allows users to formulate what in the past would have been very sophisticated Boolean queries. It also helps them to understand the information space they are in by presenting a visual representation of the subset of information.

Morville referenced last year’s presentation by Mike Kuniavsky regarding ubiquitous computing, and he hoped that his presentation has complemented what Kuniavsky had to say.

Libraries are more than just warehouses of materials — they are cathedrals of information that inspire us.

PDF of his slides