REACTION SIMILARITY AND RETRIEVAL

This article was first published when Thomson Reuters was known as the Institute for Scientific Information. The essay introduced Reaction Citation Index, the first product to combine reaction searching with citation searching and linking. Such searches can now be performed on Web of Science using its chemistry component. The Current Chemical Reactions statistics were updated in January 2006.

That there is no completely adequate indexing system for chemical reactions has been amply noted in the literature. 1, 2, 3 The explanation commonly given is that there is no one attribute or fixed set of attributes on which a satisfactory index can be based. Reactions can be assigned to a number of broad categories describing a specific structural change (e.g., deesterification or cycloaddition). They can be classified by product, by starting material, by catalyst, by experimental conditions, by utility, or by any of a number of distinguishing (and nameable) factors. 3

In recent years, computer-based retrieval systems have gone far in compensating for the lack of an adequate indexing system for reactions. Using Boolean logic to probe information stored electronically, chemists can search for reactions according to the union or intersection of multiple attributes (e.g., asymmetric synthesis AND chloroperoxidase). But the most significant advance has no doubt been graphics-based retrieval systems that enable the chemist to form queries using the international ideographic language of chemistry—the structural diagram. The most sophisticated of these allow for similarity searching as well as exact-match retrieval.
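
The Boolean step can be illustrated with a small sketch. What follows is a toy example in Python, not the mechanism of any particular retrieval system: the reaction identifiers, the index terms, and the function names are all invented for the illustration. Each attribute maps to the set of reactions that carry it, so AND becomes set intersection and OR becomes set union.

```python
# Toy records (hypothetical): reaction id -> set of index terms.
REACTIONS = {
    "rxn1": {"asymmetric synthesis", "chloroperoxidase", "epoxidation"},
    "rxn2": {"asymmetric synthesis", "palladium"},
    "rxn3": {"chloroperoxidase", "halogenation"},
}


def build_inverted_index(records):
    """Map each index term to the set of reaction ids tagged with it."""
    index = {}
    for rxn_id, terms in records.items():
        for term in terms:
            index.setdefault(term, set()).add(rxn_id)
    return index


def search_and(index, *terms):
    """Intersection: reactions that carry every one of the given terms."""
    sets = [index.get(term, set()) for term in terms]
    return set.intersection(*sets) if sets else set()


def search_or(index, *terms):
    """Union: reactions that carry at least one of the given terms."""
    hits = set()
    for term in terms:
        hits |= index.get(term, set())
    return hits


INDEX = build_inverted_index(REACTIONS)
and_hits = search_and(INDEX, "asymmetric synthesis", "chloroperoxidase")
or_hits = search_or(INDEX, "asymmetric synthesis", "chloroperoxidase")
```

With these invented records, the AND query narrows the result to the single reaction tagged with both terms, while the OR query returns every reaction tagged with either.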

The Reaction Citation Index

I've often discussed the advantages of a citation index for retrieving chemical information. 4 Citation-based retrieval frees the researcher from the limitations of nomenclature. It exploits the conceptual links between articles that are established by the authors themselves. W. Todd Wipke and my late close colleague George E. Vladutz studied this approach to reaction retrieval in some detail. They observed that "... best reaction." 3

The Thomson Reuters Reaction Citation Index™ (RCI™) combines a database of reactions, the Current Chemical Reactions® (CCR®) database, with citation data from the Science Citation Index®.

CCR®, with over 645,000 reactions taken from 180,000 source articles or patents, is a sizable file in its own right. But as part of the RCI, CCR functions as a gateway to more than 19 million references to articles published in 350 journals spanning 24 years, from 1981 to the present.

This product uses MDL® ISIS as its search and retrieval platform. The citation data are stored as relational records in an Oracle® file. Together, these software systems interweave two data sources (a reaction file and a citation index) into one powerful yet supple reaction management system.
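
How a reaction file and a relational citation file might be linked can be sketched in a few lines. This is an illustration only, not the RCI schema: it uses Python's built-in sqlite3 module in place of Oracle, and the table names, column names, article identifiers, and rows are all hypothetical. The point is simply that a shared article identifier lets a reaction record lead to the articles that cite its source.

```python
import sqlite3

# In-memory database standing in for the relational citation file (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE reaction (rxn_id TEXT PRIMARY KEY, article_id TEXT, summary TEXT);
    CREATE TABLE citation (citing_id TEXT, cited_id TEXT);
    """
)

# Hypothetical rows: two reactions and the citations among three articles.
conn.executemany(
    "INSERT INTO reaction VALUES (?, ?, ?)",
    [
        ("rxn1", "A1990", "asymmetric coupling"),
        ("rxn2", "D1993", "cycloaddition"),
    ],
)
conn.executemany(
    "INSERT INTO citation VALUES (?, ?)",
    [("D1993", "A1990"), ("E1995", "A1990")],
)


def citing_articles_for_reaction(connection, rxn_id):
    """Return ids of articles that cite the source article of a reaction."""
    rows = connection.execute(
        """
        SELECT c.citing_id
        FROM reaction AS r
        JOIN citation AS c ON c.cited_id = r.article_id
        WHERE r.rxn_id = ?
        ORDER BY c.citing_id
        """,
        (rxn_id,),
    ).fetchall()
    return [row[0] for row in rows]
```

Calling citing_articles_for_reaction(conn, "rxn1") walks from the reaction to its source article and on to the later articles that cite it.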

The Best of Both Worlds

The RCI retains all of the functionality found in today's state-of-the-art reaction retrieval systems. Chemists can search for reactions by structure and reaction fragments, as well as by text such as title words. But then, once an article is retrieved, the researcher gains instant access to the conceptual predecessors and conceptual successors to the retrieved item (see Figure 1).

Figure 1
Tabs simply marked PAST and FUTURE lead to these cited and citing references. REVIEWS and CORRECTION are subsets of FUTURE references.
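
The relationship between the PAST and FUTURE views can be sketched as two lookups over the same citation data. The Python below is a minimal illustration under stated assumptions: the article identifiers and reference lists are invented, and the function names are mine rather than anything in the product. The cited references of an article are stored with it, while its citing references are found by inverting that relation.

```python
# Hypothetical citation data: article id -> list of references it cites.
CITES = {
    "A1990": ["B1985", "C1987"],
    "D1993": ["A1990", "C1987"],
    "E1995": ["A1990"],
}


def past(article, cites):
    """Cited references: the works this article lists in its bibliography."""
    return list(cites.get(article, []))


def future(article, cites):
    """Citing references: later articles whose bibliographies include this one."""
    return sorted(source for source, refs in cites.items() if article in refs)
```

A REVIEWS or CORRECTION view would then be a filter over the result of future(), given a document type for each citing article; that field is not modeled here.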

The RELATED tab leads to conceptual neighbors—articles that are closely related to the retrieved article through the references they share with that article. Through Related Records®, the chemist can identify relevant papers that may not be retrieved by the initial structure- or text-based search.
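
The idea of relatedness through shared references can be made concrete with a short sketch. The Python below assumes the simplest possible measure, a count of references two articles have in common, and ranks candidates by that count. The ranking rule, the article identifiers, and the reference lists are assumptions made for illustration and are not drawn from the product itself.

```python
# Hypothetical bibliographies: article id -> list of cited reference ids.
REFS = {
    "P1": ["r1", "r2", "r3"],
    "P2": ["r2", "r3", "r4"],
    "P3": ["r3", "r9"],
    "P4": ["r7"],
}


def related(article, refs):
    """Rank other articles by how many references they share with `article`.

    Returns (article_id, shared_count) pairs, most shared references first;
    articles with no references in common are left out.
    """
    own = set(refs.get(article, []))
    scored = []
    for other, other_refs in refs.items():
        if other == article:
            continue
        shared = len(own & set(other_refs))
        if shared:
            scored.append((other, shared))
    # Sort by descending overlap, then by id so ties come out in a stable order.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored
```

Here P2 ranks above P3 for the query article P1 because it shares two references rather than one, and P4 is omitted because it shares none. Neither P2 nor P3 needs to share any title word or structure with P1 to be found.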

Hypernavigating the Seas of Synthetic Research

Clicking on a reference under the PAST, FUTURE, REVIEWS, CORRECTION, or RELATED tab takes the user immediately to a different view, which centers on the selected article. For example, clicking on the second of the RELATED references changes the ISIS window.

Now the user can see the past, future, and related references for this article on asymmetric coupling of arylmagnesium bromides with allylic esters (see Figure 2).

Figure 2
As hypertext links, citations set up a dynamic interplay between user and data source. At any given point, the user can make a different paper a new starting point for further exploration and discovery. What's more, the user can move forward and backward in time to take advantage of the progressive nature of chemical research.

This temporal dimension of the RCI is as informational as the individual references and reactions. At a glance, chemists can see the prior research on which a new synthetic method is based, as well as the later developments it spawned.

Eugene Garfield, Ph.D.
Chairman Emeritus


References

1. Willett P. The reaction indexing problem: a historical viewpoint. Modern approaches to chemical reaction searching. Proceedings of a conference organized by the Chemical Structure Association at the University of York, England, 8-11 July 1985. Ed. P. Willett. Aldershot, UK: Gower, 1-17, 1986.

2. Bawden D. Classification of chemical reactions: Potential, possibilities, and continuing relevance. J. Chem. Inf. Comput. Sci. 31:212-6, 1991.

3. Wipke W T, Vladutz G. An alternative view of reaction similarity: Citation analysis. Tetrahedron Computer Methodology 3:83-107, 1990.

4. Garfield E. History of citation indexes for chemistry: a brief review. J. Chem. Inf. Comput. Sci. 25:170-4, 1985.