ER&L 2012: Knockdown/Dragout Webscale Discovery Service vs. Niche Databases — Data-Driven Evaluation Methods


Speaker: Anne Prestamo

You will not hear the magic rationale that will allow you to cancel all your A&I databases. The last three years of analysis at her institution have resulted in only two cancelations.

Background: she was a science librarian before becoming an administrator, and has a great appreciation for A&I searching.

Scenario: a subject-specific database with low use had been accessed on a per-search basis, but going forward it would be sole-sourced and subscription-based. Given that, their cost per search was going to increase significantly. They wanted to know if Summon would provide a significant enough overlap to replace the database.
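To see why low use makes a flat subscription hurt, here is a minimal sketch of the arithmetic in Python. All of the figures and names below are hypothetical; the talk did not give actual prices or search counts.

```python
def cost_per_search(annual_cost: float, searches: int) -> float:
    """Return the effective cost of one search for a given annual spend."""
    if searches <= 0:
        raise ValueError("searches must be positive")
    return annual_cost / searches


# Hypothetical figures, not from the talk.
searches_per_year = 400
per_search_fee = 1.50
subscription_price = 3000.00

# Under per-search pricing, the annual spend scales with use.
old_cost = cost_per_search(per_search_fee * searches_per_year, searches_per_year)
# Under a subscription, the same low use is spread over a fixed price.
new_cost = cost_per_search(subscription_price, searches_per_year)

print(f"per-search pricing: ${old_cost:.2f} per search")
print(f"subscription:       ${new_cost:.2f} per search")
```

With a fixed price, the cost per search rises as use falls, which is what puts a low-use database under scrutiny.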

Arguments: it’s key to the discipline, specialized search functionality, unique indexing, etc… but there’s no data to support how these unique features are being used. Subject searches in the catalog were only 5% of what was being done, and most of them came from staff computers. So, are our users actually using the controlled vocabularies of these specialized databases? Finally, librarians think they just need to promote these more, but sadly, that ship’s already sailed.

Beyond usage data, you can also look at overlap with your discovery service, and also identify unique titles. For those, you’ll need to consider local holdings, ILL data, impact factors, language, format, and publication history.

Once they did all of that, they found that 92% of the titles were indexed in their discovery service. The depth of the backfile may be an issue, depending on the subject area. Also, you may need to look at the level of indexing (cover to cover vs. selective). In the end, of the 8% of titles not included, they owned most in print and they were rather old. 15% of the 8% had impact factors, which may or may not be relevant, but it is something to consider. And, most of the titles were non-English. They also found that there were no ILL requests for the non-owned unique titles, and fewer than half were scholarly and currently being published.
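The title-level overlap check comes down to set arithmetic on identifiers. Here is a minimal sketch in Python, assuming you can export ISSN lists from both the niche database and the discovery service. The function names, the ISSN normalization, and the sample values are all hypothetical and not from the talk.

```python
def normalize_issn(issn: str) -> str:
    """Strip hyphens and whitespace and uppercase the check digit ('x' -> 'X')."""
    return issn.replace("-", "").strip().upper()


def overlap_report(database_issns, discovery_issns):
    """Compare a niche database's title list against a discovery service's.

    Returns the percentage of database titles that the discovery service
    also indexes, plus the sorted list of titles unique to the database.
    """
    db = {normalize_issn(i) for i in database_issns}
    disc = {normalize_issn(i) for i in discovery_issns}
    if not db:
        raise ValueError("database title list is empty")
    unique = sorted(db - disc)
    covered_pct = 100.0 * len(db & disc) / len(db)
    return covered_pct, unique


# Made-up sample data, for illustration only.
database_titles = ["1234-5678", "2345-6789", "3456-789X", "4567-8901"]
discovery_titles = ["12345678", "2345-6789", "3456-789x", "9999-0000"]

covered_pct, unique_titles = overlap_report(database_titles, discovery_titles)
print(f"{covered_pct:.0f}% of database titles are in the discovery service")
print("Unique to the database:", unique_titles)
```

The unique list is the one you would then check against local holdings, ILL data, impact factors, language, and so on. Two cautions: a title-level match says nothing about backfile depth or selective indexing, and "15% of the 8%" works out to only about 1.2% of all the titles in the database.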

CiL 2008: Speed Searching

Speaker: Greg Notess

His talk summarizes points from his Computers in Libraries articles on the same topic, so go find them if you want more details than what I provide.

It takes time to find the right query/database, and to determine the best terminology to use in order to find what you are seeking. Keystroke economy makes searching faster, like the old OCLC FirstSearch 3-2-2-1 searching. Web searching relevancy is optimized by using only a few unique words rather than long queries. Do spell checking through a web search and then take that back into a reference database. Search suggestions on major search engines help with the spelling problem, and the suggestions are ranked based on the frequency with which they are searched, but they require you to type slowly to use them effectively and increase your search speed. Copy and paste can be enhanced through browser plugins or bookmarklets that allow for searching based on selected text.

The search terms matter. Depending on the source, searches of average query length that use unique terms perform better than those using common terms or long queries. Use multiple databases because it’s fun, you’re a librarian, and there is a lack of overlap between data sources.

Search switching is not good for quick look-ups, but it can be helpful with hard-to-find answers that require in-depth querying. We have a sense that federated searching should be able to do this, but some resources are better searched in their native interfaces in order to find relevant sources. There are several sites that make it easy to switch between web search engines using the same query, including a nifty site that will allow you to easily switch between the various satellite mapping sources for any location you choose.

I must install the Customize Google Firefox plugin. (It’s also available for IE7, but why would you want to use IE7, anyway?)

CiL 2008: What’s New With Federated Search

Speakers: Frank Cervone & Jeff Wisniewski

Cervone gave a brief overview of federated searching, with Wisniewski giving a demonstration of how it works in the real world (aka University of Pittsburgh library) using WebFeat. UofP library has a basic search front and center on their home page, and then a more advanced searching option under Find Articles. They don’t have a Database A-Z list because users either don’t know what database means in this context or can’t pick from the hundreds available.

Cervone demonstrated the trends in using meta search, which seems to go up and down, but overall is going up. The cyclical aspect due to quarter terms was fascinating to see, and more dramatic than what one might find with semester terms. Searches go up towards mid-terms and finals, then drop back down afterwards.

According to a College & Research Libraries article from November 2007, federated search results were not much different from native database searches. It also found that faculty rated results of federated searching much higher than librarians, which raises the question, “Who are we trying to satisfy: faculty/students or librarians?”

Part of why librarians are still unconvinced is that vendors are shooting themselves in the foot in the way they try to sell their products. Yes, federated search tools cannot search all possible databases, but our users are only concerned that they search the relevant databases that they need. De-duplication is virtually impossible and depends on the quality of the source data. There are other ways that vendors promote their products in ways that can be refuted, but the presenters didn’t spend much time on them.
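To illustrate why de-duplication depends so heavily on the source data, here is a minimal sketch in Python. It assumes a naive match key built from a normalized title plus publication year. This is not how any particular federated search product works, and the records and function names are made up.

```python
import re


def dedup_key(record: dict) -> tuple:
    """Build a crude match key from a normalized title and publication year."""
    title = re.sub(r"[^a-z0-9 ]", "", record.get("title", "").lower())
    title = " ".join(title.split())  # collapse runs of whitespace
    return (title, record.get("year"))


def merge_results(*result_sets):
    """Merge result lists from several sources, keeping the first record per key."""
    seen = {}
    for results in result_sets:
        for record in results:
            seen.setdefault(dedup_key(record), record)
    return list(seen.values())


# Made-up records: the same article as three sources might describe it.
source_a = [{"title": "Federated Search: An Overview", "year": 2008, "source": "A"}]
source_b = [
    # Differs only in case and punctuation, so it is caught as a duplicate.
    {"title": "Federated search - an overview", "year": 2008, "source": "B"},
    # Truncated title and missing year, so it slips through as "new".
    {"title": "Federated Search", "year": None, "source": "B"},
]

merged = merge_results(source_a, source_b)
print(len(merged), "records after de-duplication")
```

Cosmetic differences are easy to normalize away, but a truncated title or missing year leaves nothing reliable to match on, so the duplicate survives.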

The relationships between products and vendors are incestuous, and the options for federated searching are decreasing. There are a few open source options, though: LibraryFind, dbWiz, Masterkey, and Open Translators (provides connectors to databases, but you have to create the interface). Part of why open source options are being developed is that commercial vendors aren’t responding quickly to library needs.

LibraryFind has a two-click find workflow, making it quicker to get to the full-text. It also can index local collections, which would be handy for libraries who are going local.

dbWiz is a part of a larger ERM tool. It has an older, clunkier interface than LibraryFind. It doesn’t merge the results.

Masterkey can search 100 databases at a time, processing and returning hits at the rate of 2000 records per second, de-duped (as much as it can) and ranked by relevance. It can also do faceted browsing by library-defined elements. The interface can be as simple or complicated as you want it to be.

Federated searching as a stand-alone product is becoming passé as new products for interfacing with the OPAC are being developed, which can incorporate other library databases. VuFind, WorldCat Local, Encore, Primo, and AquaBrowser are just a few of the tools available. NextGen library interfaces aim to bring all library content together. However, they don’t integrate article-level information with the items in your catalog and local collections very well.

Side note: Microsoft Enterprise Search is doing a bit more than Google in integrating a wide range of information sources.

Trends: Choices from vendors are rapidly shrinking. Some progress in standards implementation. Visual search (like Grokker) is increasingly being used. Some movement to more holistic content discovery. Commercial products are becoming more affordable, making them available to institutions with budgets of all sizes.

See the Federated Search Blog for vendor-neutral info, if you’re interested.