Bringing historical data into a modern format

In 1864, a group of scientists affiliated with the Zoological Society of London and the British Museum founded Zoological Record as a way to communicate amongst themselves. As others in the field started to use this resource, its content was expanded, and the Zoological Society of London assumed complete responsibility for its publication as of 1886.

In 1980, the Zoological Society and BioSciences Information Service of Biological Abstracts (BIOSIS) joined forces to produce and publish Zoological Record, and today, BIOSIS (now part of Clarivate Analytics) is the sole publisher. BIOSIS introduced computerization and more content, and made issues from 1978 on available in both print and electronic formats.

Over the years, Zoological Record has expanded and adapted to the needs of its users, and today is recognized as the leading and most comprehensive source for biodiversity, systematics, and zoological information.


The purpose:

Building an archival record with unique value for biodiversity and taxonomic researchers

As the oldest continuing bibliographic database in life sciences, Zoological Record offers a unique value: access to nearly 150 years of original descriptions of new animal species, as well as all subsequent nomenclature changes.

Zoological Record covers the entire animal kingdom, including living and fossil species, making it the primary animal names repository.

Keith MacGregor, Executive Vice President, Scientific and Scholarly Research, Clarivate Analytics, explains why current practice within biodiversity and taxonomy makes this archival record essential:

“When a new animal is discovered, the rules within the International Code of Zoological Nomenclature dictate a unique Latin name must be assigned. It must be assured that the name was never used before and the animal was never seen before. The only way to assure this is to thoroughly search all previous similar descriptions, regardless of publication date or origin.” The full Zoological Record collection, both archival and current, provides all this valuable information in one place, digitized and re-indexed with modern terms and thoroughly searchable and linkable.


The challenge:

Creating a searchable archive covering over 100 years of data

The Clarivate Analytics production facility in York, U.K., was responsible for the data manipulations of over 100 years of Zoological Record. Nigel Robinson, Director, TZL Operations & Development, who oversaw the compilation, indexing, and digitization of this extensive archive, knew that “mapping over a century of controlled vocabulary and indexing brings its own unique challenges. Our main goal was to get the print archives as close as possible to our current electronic version, both in format and content. Zoological Record Archive will enable users to seamlessly conduct searches involving newer and older records.”

Bringing historical data into a modern format

The original Zoological Record records were assembled and edited by scientists such as Albert Günther and Charles Darwin, who later became leaders in their fields. These contributors often added invaluable synopses and summaries that lend a more complete understanding of the articles.

However, the format of the print volumes and past editorial practices created a challenge. Unlike today, there were no universally accepted editorial standards at the time these records were originally created. Zoological Record indexers and editors encountered frequent format changes – from year to year and even editor to editor within a single issue. Perceiving the original editors’ intentions and framing the data in a way that is useful to today’s life sciences researchers was a significant challenge.

As Robinson explains: “Each print volume of Zoological Record has 15-25 different sections. And each section is devoted to one or several taxonomic groups: birds, fish, insects, etc. Each of these sections may have had a different editor. And each editor may have followed different formatting rules. In addition, there are two indexes – subject and animal name – that one can search to find the bibliographic record. In the electronic version, all this material – and more – is in one record.” To achieve a consistent output that resembles current standards, highly complex algorithms were needed to extract and reformat XML-tagged text from around 160,000 pages of the original volumes.

Benefiting from past experience

Subject experts at the York facility – many of whom have decades of experience – were responsible for the crucial data manipulation, tagging and indexing that brought this historical data into the digital age. Fortunately, they could draw on recent Clarivate Analytics experience – the Century of Science™ project, which digitized Web of Science™ Core Collection backfiles to 1900. For both projects, OCR (optical character recognition) scanning yielded records that were more easily searchable than searchable PDFs would have been. And the combination of human term mapping and data manipulation with computerized scanning assured a highly accurate, usable data file.

In many ways, digitizing the original Zoological Record files was much easier than the Century of Science project proved to be. Journal selection, location and indexing were not issues, since it was already known which records to include for this project: all of Zoological Record from 1864 to 1977. All volumes were either already in-house or available via an established and valued partner, the Zoological Society of London, which assisted the project by lending over 100 years of Zoological Record on microfilm.


The result:

An essential collection

“Researchers in taxonomy and biodiversity need to refer back to older literature to track changes and review original descriptions of specimens as first described,” states MacGregor. “With Zoological Record Archive, they can get all this essential information in one place, in one search.”


The American Museum of Natural History: A case study

The AMNH Digital Library was launched in 1999 to develop an integrated database of library resources and natural history collections. The American Museum of Natural History created this digital library to enable scientists, scholars, and educators working anywhere in the world to study unique and rare research materials from the Museum’s Library and scientific collections. The first major project of the Digital Library focuses on the Museum’s Congo Expedition collection, which consists of photographs, artifacts, documents, and animals from 1909 – 1915.

As a precursor to the full Zoological Record digitization project, material from Zoological Record was provided to the Museum to support the Congo Expedition Collection. As more collection and library holdings are digitized, it will be possible to create links to the full text of many resources in the Zoological Record Archive.