NASIG 2015 – Ain’t Nobody’s Business If I Do (Read Serials)

Speaker: Dorothea Salo, Faculty Associate, University of Wisconsin – Madison

Publishers and providers are collecting massive amounts of user data, and Salo is not happy about this. ALA code of ethics is not happy about this, either.

Why does privacy matter?

The gizmos that have ticked along for ages without being connected are now connected to the internet. It can be very handy, like smart thermostats, or a little too snoopy like the smart TV that listens in on your conversations. The FTC is scrutinizing the Internet of Things very closely, because it’s easy to cause some real harm with the data from these devices.

Thermostat data, for example, tells you a lot about when someone is at home or not, which can be useful for thieves, law enforcement, and marketers. And this is information that wasn’t available when the thermostat was offline.

Eresource use is being snooped on, too. Adobe is collecting reader behavior information from Adobe Digital Editions, even when it’s coming from library sources. They got caught because they were transmitting that information unencrypted, which they fixed, but they aren’t not doing it anymore.

Readers cannot trust content providers. Librarians cannot trust content providers. We have to assume you’re behaving like Adobe, until you prove otherwise. It’s easy, then, to lump eresources into the Internet of Things. Back in the day, journals and books weren’t online, but now they are ways to collect data on reader behavior.

Generally speaking, content providers have very little out there in a code of practice for reader privacy, including the relevant associations. Not even the open access publications and associations. Most journal privacy policies do not measure up to library standards, including those that are OA. 16 of the top 20 research journals let ad networks track readers.

There’s no conspiracy theory here. It’s mostly accidental. In the age of print, reader privacy wasn’t an issue. Readers could do whatever they wanted with the content. Content providers need to address this now that they are capable of collecting and using all sorts of data they couldn’t before.

NISO is working on a framework for this, and the NASIG community needs to be engaged.

The ALA code of ethics doesn’t say that you shouldn’t collect data when it’s convenient — there are no exceptions. Same goes for “improving services”.

The question, “Would we do this in a physical space with people around us?” is a useful gague of the creep factor. Physical library users and digital library users should have the same privacy rights.

It’s easy to feel helpless in this. It’s easy to give up and think no user cares about their privacy. Just because it’s easy and convenient to ignore privacy, that doesn’t make it right.

Libraries and content providers need to live up to Article III of the ALA Code of Ethics: “…protect each reader’s right to privacy and confidentiality with respect to information sought or received and resources consulted, borrowed, acquired or transmitted.”

How do we do this? Understand the risks and mitigate them. Risks: personally identifying information (sometimes this is used as a smoke screen to hide what is being collected when this is not), long tail information (uncommon enough to identify individuals, even without PII), and behavior trails (highly specific time stamps, etc.). Libraries deal with this by tracking the stuff instead of the people. Libraries keep proxy server logs only long enough to identify use that violates TOS.

Determine who wants to know and why: data omnivores (NSA, Google, Facebook), data opportunists (academic researchers, usability wonks, assessment experts, readers who want to reuse their own data), and data paparazzi (doxxers, stalkers, politicians). Worry less about the opportunists and omnivores, worry a lot about the paparazzi.

What should we do or not do? No ostriching — heads out of the sand, please. The Library Freedom Project has lots of resources. Industry-level advocacy is needed — those who take the high road on privacy is afraid of being out-competed by those who don’t.

We’re not helpless. Don’t give up. License negotiation time is when we can ask the hard questions — use our Benjamins wisely. Assess mindfully, being aware of data leakage and compromised privacy.

Not even the greediest data omnivore, the most clueless data opportunist, or the most evil data paparazzi can abuse data that isn’t there. Don’t collect reader data unless there is a clear and reasonable reason to do it.

NASIG 2013: Knowledge and Dignity in the Era of Big Data

CC BY 2.0 2013-06-10
“Big Data” by JD Hancock

Speaker: Siva Vaidhyanathan

Don’t try to write a book about fast moving subjects.

He was trying to capture the nature of our relationship to Google. It provides us with a services that are easy to use, fairly dependable, and well designed. However, that level of success can breed hubris. He was interested in how this drives the company to its audacious goals.

It strikes him that what Google claims to be doing is what librarians have been doing for hundreds of years already. He found himself turning to the core practices of librarians as a guideline for assessing Google.

Why is Google interested in so much stuff? What is the payoff to organizing the world’s information and making it accessible?

Big data is not a phrase that they use much, but the notion is there. More and faster equals better. Google is in the prediction/advertising business. The Google books project is their attempt to reverse engineer the sentence. Knowing how sentences work, they can simulate how to interpret and create sentences, which would be a simulation of artificial intelligence.

The NSA’s deals that give them a backdoor to our data services creates data insecurity, because if they can get in, so can the bad guys. Google keeps data about us (and has to turn it over when asked) because it benefits their business model, unlike libraries who don’t keep patron records in order to protect their privacy.

Big data means more than a lot of data. It means that we have so many instruments to gather data, cheap/ubiquitous cameras and microphones, GPS devices that we carry with us, credit card records, and more. All of these ways of creating feed into huge servers that can store the data with powerful algorithms that can analyze it. Despite all of this, there is no policy surrounding this, nor conversations about best ways to manage this in light of the impact on personal privacy. There is no incentive to curb big data activities.

Scientists are generally trained to understand that correlation is not causation. We seem to be happy enough to draw pictures with correlation and move on to the next one. With big data, it is far too easy to stop at correlation. This is a potentially dangerous way of understanding human phenomenon. We are autonomous people.

The panopticon was supposed to keep prisoners from misbehaving because they assumed they were always being watched. Foucault described the modern state in the 1970s as the panopticon. However, at this point, it doesn’t quite match. We have a cryptopticon, because we aren’t allowed to know when we are being watched. It wants us to be on our worst behavior. How can we inject transparency and objectivism into this cryptopticon?

Those who can manipulate the system will, but those who don’t know how or that it is happening will be negatively impacted. If bad credit can get you on the no-fly list, what else may be happening to people who make poor choices in one aspect of their lives that they don’t know will impact other aspects? There is no longer anonymity in our stupidity. Everything we do, or nearly so, is online. Mistakes of teenagers will have an impact on their adult lives in ways we’ve never experienced before. Our inability to forget renders us incapable of looking at things in context.

Mo Data, Mo Problems

rfid blocking kit

Ha! The creative geniuses at ThinkGeek have come up with an RFID Blocking Kit t-shirt. The best part? It’s free.

I’m still giggling over the iZilla.

Take the iZilla with you anywhere. It’s like having an entire home entertainment system in a handy 30 pound white briefcase. The iZilla can be powered by a standard 120VAC wall outlet, or runs off 16 D size batteries (not included).

portable information technology

Yesterday, Open Stacks author Greg Schwartz wrote about smart tags being used for books so that wireless phone users could point their phone at the book and call up information from the OPAC or websites like, and that got me thinking. My library Dean came back from ALA fired up about a new technology in barcoding called RFID (Radio Frequency Identification). Right now it’s a hot topic among consumer advocacy and privacy groups, but the technology has been slowly creeping into libraries through technologies like self-checkout systems and collection inventories.

Personally, I’m divided on the issue. I think that libraries will likely use this technology responsibly by doing things like turning off the tags after they have been legitimately checked out so that they will not be able to track where the book is physically (except for the information in the patron record, of course). I do have some concerns regarding commercial use of the tags. I understand the security issues, but if the tags aren’t automatically turned off when the item is purchased, much like when the ink tag is removed from an item of clothing, then it does pose some questions about consumer privacy.

As for Mr. Schwartz’s wish for smart tags in books that talk to wireless phones, I expect that it shouldn’t be long before someone develops a technology that will facilitate the communication between RFID and smart tags.

dixie chicks on the next geraldo!

It’s so beautiful outside today that I wish my campus had a campus-wide wireless network. That way, I could borrow a laptop and work on the lawn. Ahh… one can dream…

The Specious Report has written a satire of Natalie Maines’ apology. I think it is much more appropriate.

Geraldo Rivera has “volunteered” to leave Iraq after broadcasting the location of the Army troops he was quasi-embedded with, as well as their possible future movements. I thought that the Fox News Channel was the breeding ground for conservative war hawks. I had no idea that they were actually working for Saddam!

Ever since the Patriot Act was passed in Congress, librarians have been discussing what to do about patron privacy. Booksellers have also been concerned, but their situation is somewhat more complex than libraries, since they have a history of using their customer histories to provide more customized service. One bookstore owner in Washington State has decided to not follow many libraries’ leads and is retaining his customer records in full. He briefly explains why he has made this decision, despite privacy concerns surrounding the Patriot Act.