DATA REPOSITORIES VERSUS DATA SETS

The Data Citation Index captures all available metadata for the data repositories we index. In many cases this metadata is very granular, and the repository is broken into child record types (data studies and data sets). In other cases the content appears only as a single repository record. This happens in two separate scenarios: one that will change over time and one that will remain as indexed at launch.

In some cases a data repository is still working to create a uniform structure that will allow its content to be indexed at the granular level. Through our evaluation and selection process, Thomson Reuters determined that, because the content in the repository is so critical to Web of Knowledge users and because the repository is working with us to implement a more consistent data structure, the data should be made available within the Data Citation Index while the repository formatting work is underway.

In many other instances, however, the data repository will always have only a single record in the Data Citation Index, with no affiliated child records, unless the repository itself makes significant architectural changes and its users in turn change the way they interact with it. These repositories are essentially one large record, and the citable object is the repository itself; data set and data study records do not exist for them. Usually either the repository is one large collection of data (perhaps from a specific project), or users export bespoke data as a report by entering parameters into an internal repository search that pulls together the appropriate data. Such a repository is a large trunk filled with data on a specific subject or topic, as opposed to a neatly organized cabinet with separate drawers and shelves.

COVERAGE OF REGIONAL REPOSITORIES

While data repositories are often affiliated with a specific institution or organization, their content can be global in origin. Researchers from around the world tend to select the data repository that is most applicable to their subject area rather than the one that is geographically closest. For the first phase of the Data Citation Index we have identified data repositories that hold some of the most relevant, widely applicable data and prioritized these for early-stage inclusion.
However, some truly regional content that is important to our customers is likely to be available only within regional repositories. As with all of our products, we will closely monitor usage trends and customer feedback to ensure our content strategy aligns with the market’s needs.

We have aggressive goals for indexing additional data repositories each year as the Data Citation Index develops from an essential database of data sets and studies into a fully integrated web of citation and analytics. Part of this work will include monitoring regional needs and determining the best way to meet researchers’ expectations. As with the regional citation indexes on Web of Science℠, we will look for the most relevant content from reliable, sustainable data repositories to add content and context to the Data Citation Index as the product matures.


Repository | Discipline | Responsible Organization

Archaeological Data Service | Social Sciences | University of York
Array Express | Life Sciences | European Bioinformatics Institute
Association of Religion Data Archives | Arts and Humanities | Pennsylvania State University
Australian Antarctic Data Centre | Climatology | Australian Government; Department of Sustainability, Environment, Water, Population and Communities
Australian Data Archive | Social Sciences | Australian National University
BioMagResBank | Life Sciences | University of Wisconsin
British Antarctic Survey | Physical Sciences | Natural Environment Research Council
British Atmospheric Data Centre | Atmospheric Science | Natural Environment Research Council
British Geological Survey | Physical Sciences | Natural Environment Research Council
British Oceanographic Data Centre | Physical Sciences | Natural Environment Research Council
CaArray | Life Sciences | National Cancer Institute
caNanoLab | Life Sciences | National Cancer Institute
CanGEM | Life Sciences | University of Helsinki
CEBS | Life Sciences | The National Institute of Environmental Health Sciences
Cell Centered Database | Neuroscience | University of California
Centre for Ecology and Hydrology | Physical Sciences | Natural Environment Research Council
Codex Sinaiticus | Arts and Humanities | The British Library/ Leipzig University Library/ St. Catherine's Monastery/ The National Library of Russia
CPLA | Life Sciences | University of Science and Technology of China
Crystallography Open Database | Physical Sciences | Vilnius University*
Disprot | Life Sciences | Indiana University School of Medicine/ Temple University
DrugBank | Life Sciences | University of Alberta*
Dryad | Life Sciences | National Evolutionary Synthesis Center
EcoGene | Life Sciences | University of Miami
eCrystals | Crystallography | University of Southampton
Emage | Life Sciences | Medical Research Council
EMDB | Life Sciences | European Bioinformatics Institute
Esther | Life Sciences | French National Institute for Agricultural Research*
Eurostat | Social Sciences | European Union
Finnish Social Science Data Archive | Social Sciences | University of Tampere
Gene Expression Omnibus | Genetics | National Center for Biotechnology Information
Greengenes | Life Sciences | Lawrence Berkeley National Laboratory
GWAS Central | Life Sciences | University of Leicester*
Infevers | Life Sciences | Institute of Human Genetics*
Inter University Consortium for Political and Social Research | Social Sciences | University of Michigan
IQSS | Social Sciences | Harvard University
Michigan Corpus of Academic Spoken English | Arts and Humanities | University of Michigan
Microkit | Life Sciences | University of Science and Technology of China
miRBase | Life Sciences | University of Manchester
Mouse Phenome Database | Life Sciences | The Jackson Laboratory
National Archives | Social Sciences | U.S. National Archives and Records Administration
National Snow and Ice Data Centre | Environmental Science | University of Colorado, Boulder
NERC Earth Observation Data Centre | Physical Sciences | Natural Environment Research Council
nmrshiftdb2 | Chemistry | Johannes Gutenberg University*
NOAA Paleoclimatology | Physical Sciences | National Oceanic and Atmospheric Administration
Nucleic Acid Database | Life Sciences | Rutgers, The State University of New Jersey
Oak Ridge National Laboratory Distributed Active Archive Center | Multi-discipline | U.S. National Aeronautics and Space Administration
Odum Institute | Social Sciences | Odum Institute, University of North Carolina
Office for National Statistics | Social Sciences | UK Statistics Authority
Old Bailey Proceedings Online | Arts and Humanities | Humanities Research Institute
Pangaea | Earth Sciences | Alfred Wegener Institute for Polar and Marine Research/ Center for Marine Environmental Sciences, University of Bremen
PHI-base | Life Sciences | Rothamsted Research
Protein Data Bank | Life Sciences | Research Collaboratory for Structural Bioinformatics
Pseudobase | Life Sciences | Institute of Theoretical Biology/ Leiden Institute of Chemistry, Leiden University
QTL Archive | Life Sciences | The Jackson Laboratory
Reading Experience Database | Arts and Humanities | The Open University
Refold | Life Sciences | Monash University*
Roper Center | Social Sciences | Roper Center, University of Connecticut
Sloan Digital Sky Survey | Astronomy | Astrophysical Research Consortium
South African Data Archive | Social Sciences | National Research Foundation
Stanford Microarray Database | Genetics | Stanford School of Medicine
Tardis | Physical Sciences | Monash University
The Cell: An Image Library | Life Sciences | The American Society for Cell Biology
The Dataweb | Social Sciences | US Census Bureau
TreeBASE | Life Sciences | National Evolutionary Synthesis Center
U.S. National Oceanographic Data Center | Physical Sciences | United States Department of Commerce
UK Data Archive | Social Sciences | University of Essex
Uniprobe | Life Sciences | Bulyk Laboratory (Department of Medicine at Brigham and Women's Hospital/ Harvard Medical School)
Uniprot | Life Sciences | European Bioinformatics Institute/ Swiss Institute of Bioinformatics/ Protein Information Resource
World Values Survey | Social Sciences | World Values Survey Association