FROM RETRIEVAL TO ANALYSIS: HOW THE ARTICLE SUMMARY INTERFACE CAN GET YOU THERE

This article was written when Thomson Reuters was known as the Institute for Scientific Information (ISI).

Imagine that you have just completed a search and obtained a set of bibliographic information that satisfies your needs precisely. You may have collected a handful of articles—or hundreds. Now, imagine how you will deal with these papers. In many cases, information retrieval is just the first step in a search. The next step—especially if you are facing a large collection of papers—is analysis.

Because information users often need only a specific subset of Thomson Reuters data, Thomson Reuters Research Group accommodates these requests by creating customized, relationally structured datasets to suit specific needs. In 1994, the Research Group added an interface—the Article Summary Interface—to these datasets to help users analyze the data easily. This bibliometric interface is a Windows®-based tool that implements both basic and complex queries.

The Article Summary Interface features a different kind of access and different functionalities than those offered with other Thomson Reuters products. The interface is designed to work with the relational database, allowing you to manipulate any of the papers' features (e.g., author names, citation counts, author addresses). Specific relational database software packages such as Paradox® or Access® are not needed because their capabilities are built into the software. The interface also allows for tabular and graphical presentations of the statistics.


Types of Analyses
The interface is based on a modular component programming model. In time, these components will be interchangeable. For now, each interface comes complete with queries that handle both basic and complex analyses. The basic queries are the citation summaries; the underlying dataset and the needs of the user determine which of these are most useful (see Figure 1). The more complex queries include citation frequency distribution, citation time series, intercitation and collaboration statistics, and a variety of manipulations at the article level. The Highly Cited Articles query, for example, lists the papers in the underlying dataset. For datasets that contain bibliographic information on the citing papers, such as the Personal Citation Report (PCR), summary analyses of the citing papers are included in addition to the other queries. All of the data for the interface can be updated every six months.

Figure 1

Another product that runs on the Article Summary Interface is High Impact Papers (HIPS). It is a bibliographic and citation database of the 300 most-cited papers of each year from 1981 through 1994. Neurosciences, immunology, and molecular biology and genetics are the three topics that are currently offered, but the Research Group can construct a High Impact Papers database for any field or topic.



Applications

The Article Summary Interface acts as a decision support system. Each of the queries available opens different analytical possibilities—each helping to answer specific types of questions. And, with the article-based interface, you can track trends and actually see the papers responsible for the trends.

The Highly Cited Articles query presents a list of all of the papers for the underlying dataset, and the papers are ranked by total number of citations received. By clicking on a specific article of interest, you can view additional information about the article (see Figure 2). The query can be restricted to monitor something specific, such as the number of highly cited articles in a specific journal. This would be very useful to the researcher who is considering where to publish the results of a study.
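
To make the ranking concrete, here is a minimal Python sketch of the kind of computation a Highly Cited Articles listing involves. It is not the interface's own code: the Paper record, the highly_cited function, and the sample data are all hypothetical and exist only to show a ranking by total citations with an optional restriction to one journal.

```python
from typing import NamedTuple, Optional


class Paper(NamedTuple):
    """Hypothetical record; the real dataset's fields are not documented here."""
    title: str
    journal: str
    citations: int


def highly_cited(papers, journal: Optional[str] = None):
    """Rank papers by total citations, optionally restricted to one journal."""
    selected = [p for p in papers if journal is None or p.journal == journal]
    return sorted(selected, key=lambda p: p.citations, reverse=True)


# Made-up records for illustration only.
papers = [
    Paper("Paper A", "Journal X", 12),
    Paper("Paper B", "Journal Y", 40),
    Paper("Paper C", "Journal X", 25),
]

ranked = highly_cited(papers)                        # whole dataset
ranked_x = highly_cited(papers, journal="Journal X")  # one journal only
```

A researcher weighing where to publish could compare the restricted lists for several candidate journals in this way.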

Figure 2

Journal editors themselves might benefit from using the Summary by Journal and Article Type Summary queries. Not only can editors check their own journal to see which types of article (research articles, proceedings papers, meeting abstracts, letters, or notes) garnered the highest number of citations, they can also check on the competition and perhaps adjust their article selection process.

The Summary by Author query is particularly useful for comparative evaluation of individual authors. It summarizes publication, citation, and citations-per-paper statistics for each author of each paper in a dataset. Additional evaluation is possible by accessing the Highly Cited Articles query, which displays total cites as well as expected cites.
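
The three author statistics can be illustrated with a short Python sketch. The function name, the record layout, and the sample data are hypothetical, and the sketch assumes whole counting (every coauthor receives full credit for a paper and its citations), which may differ from how the interface itself attributes credit.

```python
from collections import defaultdict


def summary_by_author(papers):
    """Return {author: (paper_count, total_citations, citations_per_paper)}.

    `papers` is an iterable of (author_list, citation_count) pairs.
    Assumption: whole counting, so each coauthor gets full credit.
    """
    counts = defaultdict(int)
    cites = defaultdict(int)
    for authors, citations in papers:
        for author in set(authors):  # count each author once per paper
            counts[author] += 1
            cites[author] += citations
    return {a: (counts[a], cites[a], cites[a] / counts[a]) for a in counts}


# Made-up records for illustration only.
papers_by_author = [
    (["Author A", "Author B"], 10),
    (["Author A"], 4),
    (["Author B", "Author C"], 6),
]

author_summary = summary_by_author(papers_by_author)
```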

The Time Series query computes the number of papers, number of citations received, and the citations per paper for each year from 1981 through 1994 in the dataset. The statistics can also be divided into 5-year moving windows or a 14-year period. The data, which can be presented in table and graph form, portray performance for a specific period and therefore allow for comparisons between specific periods and for trend analysis.
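
As a rough illustration of the arithmetic behind a time series with moving windows, consider the Python sketch below. The function names and sample data are hypothetical, and the sketch assumes that citations are credited to the publication year of the cited paper; the interface may define its periods differently.

```python
from collections import defaultdict


def time_series(papers, first_year, last_year):
    """Per-year (papers, citations, citations per paper).

    `papers` is an iterable of (publication_year, citation_count) pairs.
    Years with no papers report a citations-per-paper value of 0.0.
    """
    n_papers = defaultdict(int)
    n_cites = defaultdict(int)
    for year, citations in papers:
        n_papers[year] += 1
        n_cites[year] += citations
    return {
        y: (n_papers[y], n_cites[y], n_cites[y] / n_papers[y] if n_papers[y] else 0.0)
        for y in range(first_year, last_year + 1)
    }


def moving_windows(series, width=5):
    """Aggregate a yearly series into overlapping windows of `width` years."""
    years = sorted(series)
    windows = {}
    for i in range(len(years) - width + 1):
        span = years[i:i + width]
        papers = sum(series[y][0] for y in span)
        cites = sum(series[y][1] for y in span)
        windows[(span[0], span[-1])] = (papers, cites, cites / papers if papers else 0.0)
    return windows


# Made-up records for illustration only.
records = [(1981, 10), (1981, 2), (1983, 6), (1985, 4), (1986, 8)]
series = time_series(records, 1981, 1986)
windows = moving_windows(series, width=5)
```

Passing a width equal to the full span of years yields the single aggregate period.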

The Citation Frequency Distribution query shows the number of articles cited at different frequencies, from zero to the maximum number of times a paper was cited in the defined period.
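
A citation frequency distribution of this kind amounts to a simple tally, as the following Python sketch suggests. The function name and the sample counts are hypothetical; the sketch only shows how each frequency from zero to the maximum, including frequencies that no paper reached, can be reported.

```python
from collections import Counter


def citation_frequency_distribution(citation_counts):
    """Return [(times_cited, number_of_papers)] for 0..max, gaps included."""
    tally = Counter(citation_counts)
    top = max(citation_counts, default=0)
    return [(k, tally[k]) for k in range(top + 1)]


# Made-up citation counts for six papers.
distribution = citation_frequency_distribution([0, 0, 1, 3, 3, 3])
```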

The Collaboration by Country and by Organization queries compute the number of papers produced by authors working in two countries or two organizations, respectively, and sort the country or organization pairs either by the number of coauthored papers or alphabetically by the name of the country or organization. These analyses reveal, for a subset of papers, a network of activity and cooperation.
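
The pair counting described above can be sketched in a few lines of Python. The function names and sample data are hypothetical, and the sketch assumes that a paper with addresses in three countries contributes one count to each of its three country pairs; the interface's own counting rules are not documented here.

```python
from collections import Counter
from itertools import combinations


def collaboration_pairs(papers):
    """Count papers shared by each pair of countries (or organizations).

    `papers` is an iterable of lists naming the countries on each paper.
    """
    pairs = Counter()
    for countries in papers:
        # Deduplicate and sort so ("UK", "USA") and ("USA", "UK") are one pair.
        for pair in combinations(sorted(set(countries)), 2):
            pairs[pair] += 1
    return pairs


def sort_pairs(pairs, by="count"):
    """Sort pairs by coauthored-paper count (descending) or alphabetically."""
    if by == "count":
        return sorted(pairs.items(), key=lambda kv: (-kv[1], kv[0]))
    return sorted(pairs.items())


# Made-up address data for illustration only.
papers_by_country = [
    ["USA", "UK"],
    ["UK", "USA", "Japan"],
    ["Japan"],
    ["USA", "USA", "Japan"],
]

pairs = collaboration_pairs(papers_by_country)
by_count = sort_pairs(pairs, by="count")
by_name = sort_pairs(pairs, by="name")
```

The same functions apply unchanged to organization names in place of country names.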


Potential Users
The Article Summary Interface has many benefits for analysts. In general, it serves the needs of people who must monitor and track data on science at an aggregate level, including administrators, planning and development officers, and policy makers. Depending on the dataset, other users may include bench scientists, university provosts, librarians, or research & development managers in industry.

The user has the flexibility of presenting data in tables or graphs. Whether you use the data to gain a keener understanding of information yourself or use it to suggest a course of action to others, the interface provides a new perspective on bibliographic data.


Conclusions
The Article Summary Interface reveals valuable information hidden in a dataset. It makes the data accessible in a variety of new ways.

The interface is both easy to use and capable of answering complex analytical queries. In the future, new analytical options will appear in upgrades to the Article Summary Interface.


Dr. Henry Small
Director, Contract Research
Thomson Reuters

Margaret Ring Gillock
Science Writer