NASIG 2012: Practical Applications of Do-it-Yourself Citation Analysis

Speaker: Steve Black, College of Saint Rose

Citation analysis is the study of patterns and frequencies of citations. You might want to do this because it is an objective and quantitative way of looking at a journal’s impact (i.e. how often it is cited). It can also be used to determine the impact of authors, institutions, nations, and the “best journals for ____.”

There are many critiques of relying on impact factors, and the literature on this is vast. There can be problems in the citations themselves and in how they make it into the databases. Some citations become famous for being famous. There are many ways of gaming the system. The main critique, though, is the conceptual difference between impact, quality, and importance. Finally, there is the question of global versus local impact. Because of all this, it is important to remember that impact factor is only one of many considerations for collection development, tenure & promotion, and paper submissions.

Doing it yourself allows you to tailor the analysis to specific needs not covered elsewhere. It can be quick & dirty, exhaustive, or something in between. And, of course, it does not incur the same kind of costs as subscriptions to Scopus or Web of Knowledge.

First, select a target population (journals in a sub-discipline, researchers at your institution, or a single faculty member). Then select a sample that represents the target population. Compile the works and sort/count the works cited.

Google Scholar is good for identifying articles on a topic, but not so much on authority control and streamlining citation formats. Zotero is a great citation management tool, but it doesn’t work so well for citation analysis because of how complicated it is to extract data into Excel.

Black did this to look at frequently cited journals in forensic psychology. He used WorldCat to identify the most-held journals in the field, and gathered the citations from recently published volumes. He then used PsycINFO in EBSCOhost to look up the citations and export them to RefWorks, creating folders for each issue’s works cited.

He then exported these to Excel, sorted by title, and corrected the discrepancies in title formats. Once the data was cleaned, he manually counted the number of citations for each journal by selecting the cells with the title name and reading the Count total in the lower information bar of Excel. This information went into a new spreadsheet. (I asked why not use a pivot table, and he said he didn’t know how to use one and wasn’t sure it would account for title variations he may not have caught.) Generally, the groupings of citations fall within the Bradford distribution.
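
The sort-and-count step could also be scripted rather than done by hand. What follows is only a minimal sketch, not the method Black described: the function names, the normalization rules, and the journal titles in the sample are all invented for illustration.

```python
import re
from collections import Counter

def normalize_title(title):
    """Reduce common title variations to one key (hypothetical rule set)."""
    t = title.lower().replace("&", " and ")
    t = re.sub(r"[^a-z0-9 ]", " ", t)   # replace punctuation with spaces
    t = re.sub(r"^\s*the\s+", "", t)    # drop a leading "The"
    return " ".join(t.split())          # collapse repeated whitespace

def count_citations(cited_titles):
    """Count how often each normalized journal title is cited."""
    return Counter(normalize_title(t) for t in cited_titles)

# Invented titles, standing in for one exported column of cited journals.
sample = [
    "Journal of Example Studies",
    "The Journal of Example Studies",
    "Example Research & Practice",
    "Example Research and Practice",
    "journal of example studies.",
]
counts = count_citations(sample)
for title, n in counts.most_common():
    print(n, title)
```

Rules like these only catch mechanical differences (case, punctuation, "&" versus "and", a leading "The"). Abbreviated or misspelled titles would still need to be found and corrected by eye.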

There are two ways to measure the reliability of the rankings you discover. On a macro level, you can look at how well the ranked lists match from month of publication to month of publication. You can test that consistency with Spearman’s rho rank correlation coefficient. And then Black went off into statistical stuff that doesn’t make sense to me just sitting here. One issue of a journal isn’t enough to determine the rankings of the journals in the field, but several volumes of 3-5 journals would do it.
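
For readers who want to see what the rank comparison involves, here is a rough sketch of Spearman's rho computed with only the Python standard library. It assumes you have citation counts for the same journals from two samples; the counts are made up, and the function names are illustrative rather than anything from the talk.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from highest (rank 1) to lowest, averaging tied ranks."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank lists."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)  # undefined (division by zero) if all values tie

# Made-up citation counts for the same five journals in two samples.
sample_a = [120, 95, 60, 40, 12]
sample_b = [110, 70, 80, 35, 15]
rho = spearman_rho(sample_a, sample_b)
print(round(rho, 3))
```

A value near 1 means the two samples rank the journals in nearly the same order. A value near 0 means the rankings are essentially unrelated.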

On the micro level, you use more statistical methods (coefficient of variation). A large coefficient of variation indicates that the ranking of a journal is bouncing around a lot.
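
The coefficient of variation is the standard deviation divided by the mean. The sketch below applies it to one journal's rank across several samples, which is an assumption about how the measure was used in the talk. The choice of the sample standard deviation and the example ranks are also assumptions made for illustration.

```python
from statistics import mean, stdev

def coefficient_of_variation(ranks):
    """Sample standard deviation of the ranks divided by their mean."""
    return stdev(ranks) / mean(ranks)

# Made-up ranks for two journals across four samples.
steady_journal = [3, 3, 4, 3]
volatile_journal = [5, 12, 8, 20]

cv_steady = coefficient_of_variation(steady_journal)
cv_volatile = coefficient_of_variation(volatile_journal)
print(round(cv_steady, 2), round(cv_volatile, 2))
```

The journal whose rank moves more from sample to sample gets the larger value, so its position in the list deserves less trust.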

To have a useful ranked list, you need to have at least 10,000 citations, and closer to 20,000 is better. Even with that many, different samples will yield different ranks, particularly further down the list. So, a journal’s rank must be taken as an approximation of its true rank, and is probably a snapshot in time.

With all the positives of doing this, keep in mind the weaknesses. It is very time consuming to do. You need a lot of citations, and even that isn’t definitive. Works cited may not be readily available. It’s recreating the (very expensive) wheel, and it may be better to use Scopus or Web of Knowledge if you have them.

For collection development, you can use this to assess the impact of specialized journals. Don’t be surprised to find surprises, and don’t judge new titles by this criterion.

Practical applications include identifying top journals on a topic, supporting a new major, figuring out what a department really uses, and potentially publishing for tenure purposes.
