Responding to the NIH Data Management and Sharing Policy

What is this page?

On this page, you will find definitions of key terms related to the NIH Data Management and Sharing Policy. Definitions denoted with an asterisk (*) are taken directly from documentation disseminated by NIH. The rest were written or adapted by Lane Library staff to provide additional context for the policy and its requirements.

If you have questions about any of the terms on this list or if there are terms you would like added, please contact John Borghi.

Key Terms and Definitions

Data Management* - The process of validating, organizing, protecting, maintaining, and processing scientific data to ensure the accessibility, reliability, and quality of the scientific data for its users.

Data Management and Sharing Plan* - A plan describing the data management, preservation, and sharing of scientific data and accompanying metadata.

Data Repository - A system that organizes, stores, and facilitates the discovery and reuse of research data.

Data Sharing* - The act of making scientific data available for use by others (e.g., the larger research community, institutions, the broader public), for example, via an established repository.

FAIR - A set of guiding principles stating that data should be findable, accessible, interoperable, and reusable. The principles address the usability of datasets as well as their availability.

Metadata* - Data that provide additional information intended to make scientific data interpretable and reusable (e.g., date, independent sample and variable construction and description, methodology, data provenance, data transformations, any intermediate or descriptive observational variables).

Persistent Identifier - A unique and long-lasting reference to a document, file, web page, or other object. Persistent identifiers (PIDs) exist for people (ORCID iDs), for resources (RRIDs), and for data (DOIs, database accession numbers, etc.).
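
As a rough illustration of how the ORCID iDs and DOIs mentioned above behave in practice, the sketch below checks the final check character of an ORCID iD (ORCID documents this as ISO 7064 MOD 11-2) and turns a DOI into a resolvable link through the doi.org resolver. The function names are our own, and the DOI shown is a made-up placeholder rather than a real dataset.

```python
def orcid_check_character(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True if a hyphenated 16-character ORCID iD has a correct check character."""
    compact = orcid.replace("-", "")
    if len(compact) != 16 or not compact[:15].isdigit():
        return False
    return orcid_check_character(compact[:15]) == compact[15].upper()


def doi_to_url(doi: str) -> str:
    """Build a resolvable link for a DOI using the doi.org resolver."""
    return "https://doi.org/" + doi


# 0000-0002-1825-0097 is the sample iD ORCID uses in its own documentation.
print(is_valid_orcid("0000-0002-1825-0097"))
# Hypothetical DOI, used here only to show the link format.
print(doi_to_url("10.1234/example-dataset"))
```

A check like this only catches typing errors in an identifier. It does not confirm that the iD or DOI has actually been registered.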

Reproducibility - Often used more precisely (e.g., Goodman et al., 2016), but the term generally describes efforts to establish the credibility, reliability, and validity of scientific research.

Scientific Data* - The recorded factual material commonly accepted in the scientific community as of sufficient quality to validate and replicate research findings, regardless of whether the data are used to support scholarly publications. Scientific data do not include laboratory notebooks, preliminary analyses, completed case report forms, drafts of scientific papers, plans for future research, peer reviews, communications with colleagues, or physical objects, such as laboratory specimens.

Standard - A standard specifies exactly how data and related materials should be stored, organized, and described. In the context of research data, the term typically refers to the use of specific and well-defined formats, schemas, vocabularies, and ontologies in the description and organization of data. However, for researchers in a community where more formal standards have not been well established, it can also be interpreted more broadly to refer to the adoption of the same (or similar) data management-related activities or strategies by different researchers and across different projects.