NASIG 2013: Knowledge and Dignity in the Era of Big Data

CC BY 2.0 2013-06-10
“Big Data” by JD Hancock

Speaker: Siva Vaidhyanathan

Don’t try to write a book about fast-moving subjects.

He was trying to capture the nature of our relationship to Google. It provides us with services that are easy to use, fairly dependable, and well designed. However, that level of success can breed hubris. He was interested in how this drives the company to its audacious goals.

It strikes him that what Google claims to be doing is what librarians have been doing for hundreds of years already. He found himself turning to the core practices of librarians as a guideline for assessing Google.

Why is Google interested in so much stuff? What is the payoff to organizing the world’s information and making it accessible?

Big data is not a phrase that they use much, but the notion is there. More and faster equals better. Google is in the prediction/advertising business. The Google Books project is their attempt to reverse engineer the sentence. Knowing how sentences work, they can simulate how to interpret and create sentences, which would be a simulation of artificial intelligence.

The NSA’s deals that give them a backdoor to our data services create data insecurity, because if they can get in, so can the bad guys. Google keeps data about us (and has to turn it over when asked) because it benefits their business model, unlike libraries, which don’t keep patron records in order to protect patrons’ privacy.

Big data means more than a lot of data. It means that we have so many instruments to gather data: cheap/ubiquitous cameras and microphones, GPS devices that we carry with us, credit card records, and more. All of these ways of creating data feed into huge servers that can store it, with powerful algorithms that can analyze it. Despite all of this, there is no policy surrounding it, nor are there conversations about the best ways to manage it in light of the impact on personal privacy. There is no incentive to curb big data activities.

Scientists are generally trained to understand that correlation is not causation. We seem to be happy enough to draw pictures with correlation and move on to the next one. With big data, it is far too easy to stop at correlation. This is a potentially dangerous way of understanding human phenomena. We are autonomous people.

The panopticon was supposed to keep prisoners from misbehaving because they assumed they were always being watched. Foucault described the modern state in the 1970s as the panopticon. However, at this point, it doesn’t quite match. We have a cryptopticon, because we aren’t allowed to know when we are being watched. It wants us to be on our worst behavior. How can we inject transparency and objectivism into this cryptopticon?

Those who can manipulate the system will, but those who don’t know how or that it is happening will be negatively impacted. If bad credit can get you on the no-fly list, what else may be happening to people who make poor choices in one aspect of their lives that they don’t know will impact other aspects? There is no longer anonymity in our stupidity. Everything we do, or nearly so, is online. Mistakes of teenagers will have an impact on their adult lives in ways we’ve never experienced before. Our inability to forget renders us incapable of looking at things in context.

Mo Data, Mo Problems

ER&L 2013: E-Resources, E-Realities

“Tools” by Josep Ma. Rosell

Speakers: Jennifer Bazeley (Miami University) & Nancy Beals (Wayne State University)

Despite all the research on what we need/want, no one is building commercial products that meet all our needs and address the impediments of cost and dwindling staff.

Beals says that the ERM is not used for workflow, so they needed other tools, with a priority on project management and Excel proficiency. They use an internal listserv, UKSG Transfer, Trello (project management software), and a blog, to keep track of changes in eresources.

Other tools for professional productivity and collaboration: iPads with Remember the Milk or Evernote, Google spreadsheets (project portfolio management organization-wide), and LibGuides.

Bazeley stepped into the role of organizing eresources information in 2009, with no existing tool or hub, which gave her room to experiment. For documentation, they use PBWiki (good for version tracking, particularly to correct errors) with an embedded departmental Google Calendar. For communication, they use LibGuides for internal documents, and you can embed RSS, Google Docs, Yahoo Pipes aggregating RSS feeds, Google Forms for eresource access issues, links to Google spreadsheets with usage data, etc. For login information, they use KeePass Password Safe. Rather than claiming in the ILS, they’ve moved to using the claim checker tool from the subscription agent.

Tools covered:

  • Google Calendar
  • Google Docs (includes forms & spreadsheets)
  • PBWiki
  • LibGuides
  • Yahoo Pipes
  • WordPress
  • KeePass Password Safe
  • PDF Creator
  • EBSCOnet

Others listed:

  • Blogger (blog software)
  • Mendeley (ref manager)
  • Vimeo (videos)
  • Jing (screenshot/screencast)
  • GIMP (image editor)
  • MediaWiki (Wiki software)
  • LastPass (password manager)
  • OpenOffice (software suite)
  • PDF Creator (PDF manipulation)
  • Slideshare (presentation manager)
  • Filezilla (ftp software)
  • Zoho Creator (database software)
  • Dropbox (cloud storage)
  • Github (software management)
  • Subscription agent software (SwetsWise, EBSCOnet)
  • Microsoft Excel / Access
  • Course Management Software (Moodle, Sakai, Blackboard)
  • Open Source ERMS: ERMes (University of Wisconsin-La Crosse) & CORAL (University of Notre Dame)

food blogging & making things so labor intensive I don’t do them

derby pie

I started a food blog on Tumblr last January. Here’s the about statement:

I started this project because after a year of taking photos of myself every day, I wanted to document something else. Over the summer and fall, I had developed a routine of trying new recipes on the weekends and some weeknights. This blog is where I share photos of the results, talk about what went right or wrong, and link to the recipes.

And sometime in May/June, I stopped. I got busy. I remembered to take some pictures, but they sat on my desktop waiting to be blogged for so long that I felt guilty and overwhelmed, so I eventually deleted them.

It wasn’t like it would take all that much time to write up something. And add a link. And format it the same as the previous posts. But it seemed like a big deal at the time.

Also, I stopped cooking/baking as much in the summer.

I have this tendency to make things that should be simple and routine into complex, detailed processes that become burdensome. Is this just some freak aspect of my desire for control and order, or is it simply human nature?

my presentation for Internet Librarian 2012

Apologies for the delay. It took longer than I expected to have the file and a stable internet connection at the same time. You’ll find the notes on the SlideShare page.

IL 2010: Adding Value to Your Community

Speaker: Patricia Martin

[I took notes on paper because my netbook power cord was in my checked bag that SFO briefly lost on the way here. This is an edited transfer to electronic.]

She told a story about how a tree in her yard sprang up and quickly produced fruit, due in part to the fertilization that came from some bats living in her garage. The point is that libraries are sitting on hidden assets (i.e. bat shit), but we haven’t packaged them in a way our community will recognize and value (i.e. bat guano fertilizer).

She thinks that the current conditions indicate we are on the cusp of a renaissance generation that will lead to an explosion of creativity. Every advanced civilization gets to a point where there is so much progress made that traditions become less relevant and are shed. We need to keep libraries, or at least their role in society/education, relevant or they will be lost.

Martin says that the indicators of a renaissance are death (recession), a facilitating medium (internet), and an age of enlightenment (aided by the internet). We are seeing massive creativity online, from blog content larger than the volumes in the Library of Congress to Facebook to the increase in epublications over their print counterparts.

Capitalism relies on conformity, but conformity won’t give us the creativity we need. Brands/companies who are succeeding are those who provide a sense of belonging/community for their users, who empower creativity among them, and who manage the human interface.

The old way has the brand at the center, but the new way is to have the user at the center. This sounds easy, until you have to live it. When users are at the center, they want to build a community/tribe together, which creates stickier brands.

Jonathan Harris wants us to move forward towards creating a vibrant culture online that’s not about celebrity tweets. He is studying the things that people yearn for and creating a human interface to explore it. It is projected that 80% of data generated will come from social networks – how will we make sense of it all? Why would the RenGen (renaissance generation) still use libraries if the traditional book is our brand? We need a new story about the future where libraries are present, in whatever form they become.

A president of a cloud computing company is quoted by Martin as saying that in the future, screens will be everywhere. The return on transaction (faster) will replace the return on investment. He saw the cloud storage demand grow 500 times in 2009, and expects that rate will only continue into the future as we generate more and more data.

Story is the new killer app – the ultimate human interface. The new story of the future will be built around precognition.

Libraries can create value by leaving the desk and going into the community to provide neutral information to meet the needs of the community. We add value by putting users at the center, letting them collaborate on the rules, and curating the human interface.