Responding to the NIH Data Management and Sharing Policy

How to Manage and Share Data

This page provides guidance on how to share data in compliance with the NIH Data Management and Sharing Policy. This policy expects that researchers will "maximize appropriate data sharing."

Lane Library maintains a Data Management and Sharing guide, which provides a general introduction to these topics. For additional information or to schedule a consultation, please contact John Borghi.

Sharing Data via a Repository

How (and where) you share your data will depend on its characteristics, contents, and any related ethical, legal, and social issues.

Datasets that are especially large or contain sensitive information cannot be shared in the same way as datasets that are comparatively small and associated with less risk. The NIH encourages the use of established repositories for preserving and sharing scientific data. However, although NIH supports many data repositories, it does not necessarily provide a repository for all data resulting from the research it funds.

The figure below is designed to help you navigate the ever-changing data repository ecosystem. The Registry of Research Data Repositories (re3data) is an extremely helpful resource for identifying repositories that are specialized for certain types of data. For more information, refer to NIH's supplemental guidance.

A good rule of thumb when sharing data is to put it where other researchers will find it.

Why we don’t recommend sharing data “upon reasonable request”

Making data and other materials available “upon request” means that the requester must contact a member of the research team (often a corresponding author), and that team member must have the data on hand, in a usable format, in order to respond to the request.

Over time, as contact information changes, team members move on, and data is archived, both requesting data and responding to requests become more difficult.

Why we don’t recommend sharing data as supplementary material

Supplementary materials are an important part of the scholarly record, but it is not uncommon for links between them and the articles they are associated with to break down. Whenever feasible, we recommend uploading data into a repository that is designed to preserve data and make it accessible to others, and then linking to and/or citing that dataset in your manuscript.

Why we don’t recommend sharing data through a website

Making data and other materials available through a website (e.g., a lab website) can be a relatively straightforward method of sharing. However, there is no guarantee that the data will be easily discoverable by others. Furthermore, unlike data repositories, which often have strategies for preserving data over the long term, a website offers no guarantee that the data will still be available if the site goes down or moves.

Dryad

The Dryad Digital Repository is a curated resource that makes research data discoverable, reusable, and citable. Dryad provides a home for a wide range of data types and is free to use for all Stanford-affiliated researchers.

Key features of Dryad:

Flexible about file format, meaning you can upload your datasets in whatever form they take.
Automatically assigns digital object identifiers, meaning researchers will be able to easily cite your datasets.
Curated by experts, meaning that somebody at Dryad will check that your files can be opened, that you haven't inadvertently shared sensitive data, and that you have included sufficient descriptive information for another researcher to find and use your datasets.
Contents are preserved for the long term, meaning your datasets will be accessible indefinitely.

See their FAQ page for additional information about Dryad's features.

There are a variety of models and potential platforms for sharing your datasets with other researchers. Lane Library recommends Dryad as a way to openly share datasets that do not fit into more specialized repositories. For more information about Dryad, contact your liaison librarian.

Dryad is free for Stanford Affiliated Users

Dryad uses ORCID iDs for login. The first time you log in, you will be asked if you are affiliated with a member institution. After selecting Stanford from the drop-down menu, you will be asked to sign in using your Stanford credentials. On every subsequent login, you will only need your ORCID iD.

Publish and Share your Data

Enter Metadata

Once you have logged into Dryad, you can begin the process of publishing and sharing your data. After clicking Start New Dataset, you will be prompted to begin entering metadata. Good metadata (also called data documentation) is vital for ensuring that your dataset can be discovered, understood, and used by other researchers.

Dryad only requires that you complete the title, authors, and abstract fields, but we strongly recommend that you complete every field and upload additional documentation (e.g., data dictionaries and readme files) alongside your dataset.
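As a rough illustration of the minimum described above, the following Python sketch checks a draft metadata record for the three required fields before you start a submission. The dictionary layout and the function names are hypothetical and are not part of any Dryad tool or API; only the three field names come from the text above.

```python
# The three fields this guide says Dryad requires.
REQUIRED_FIELDS = ("title", "authors", "abstract")


def is_blank(value):
    """Return True for None, empty or whitespace-only strings, and empty collections."""
    if value is None:
        return True
    if isinstance(value, str):
        return not value.strip()
    return len(value) == 0


def missing_required_fields(metadata):
    """Return the required fields that are absent or blank in a draft record.

    `metadata` is a plain dict (a hypothetical local draft, not a Dryad object).
    """
    return [field for field in REQUIRED_FIELDS if is_blank(metadata.get(field))]


if __name__ == "__main__":
    draft = {"title": "Example dataset", "authors": [], "abstract": "  "}
    print(missing_required_fields(draft))  # ['authors', 'abstract']
```

Passing this check only covers the minimum. The recommendation to complete every field and to include a readme and data dictionary still applies.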

Upload Methods

Dryad has two different methods for uploading data. Both methods allow you to upload multiple files.

Upload directly from your computer: for uploads smaller than 10 GB.
Upload from a server or the cloud: for uploads up to 300 GB.
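The size limits above can be expressed as a small Python helper that suggests an upload method from the total size of your files. This is a sketch under stated assumptions: it treats 1 GB as 10^9 bytes (Dryad may count differently), the function names and return labels are made up for illustration, and the advice for data over 300 GB reflects this guide's recommendation to schedule a consultation.

```python
from pathlib import Path

GB = 10**9  # assumption: decimal gigabytes

DIRECT_LIMIT = 10 * GB   # direct uploads: less than 10 GB
SERVER_LIMIT = 300 * GB  # server or cloud uploads: up to 300 GB


def total_size(paths):
    """Sum the sizes, in bytes, of the files at the given paths."""
    return sum(Path(p).stat().st_size for p in paths)


def choose_upload_method(total_bytes):
    """Suggest an upload route based on the limits described in this guide."""
    if total_bytes < 0:
        raise ValueError("total_bytes must be non-negative")
    if total_bytes < DIRECT_LIMIT:
        return "direct"
    if total_bytes <= SERVER_LIMIT:
        return "server_or_cloud"
    # Larger than either method allows: talk to a liaison librarian first.
    return "consult"


if __name__ == "__main__":
    print(choose_upload_method(2 * GB))    # direct
    print(choose_upload_method(150 * GB))  # server_or_cloud
    print(choose_upload_method(301 * GB))  # consult
```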

Curation

Once you've uploaded your files, you can decide to submit them to the curation process immediately or keep them temporarily private for peer review. During the curation process, expert curators perform basic checks to ensure that the title and abstract are meaningful, that the methods and usage notes are sufficient, that files can be opened, and that no sensitive information or material subject to copyright restrictions has been inadvertently included in the dataset. As an author, you can review the curation process for your dataset.

Describing Dryad

If you plan to use Dryad to publish and share your data, please feel free to use or adapt the following description when completing data management plans or other documents:

Stanford University is a Dryad member institution. Dryad is an open source tool for data publication and digital preservation. Datasets deposited into Dryad are permanently archived in a CoreTrustSeal-certified repository. Data files are regularly audited to ensure fixity and authenticity and are replicated with multiple copies in multiple geographic locations. Professional curators examine all Dryad deposits to ensure the validity of the data, apply robust metadata, and make certain that highly sensitive information has not been inadvertently included. Datasets deposited in Dryad are automatically assigned a Digital Object Identifier (DOI) and are indexed by Google Dataset Search and other tools to enhance discoverability.

For more information about Dryad's features, see this page. For additional assistance in describing Dryad or to discuss how it can be integrated into your research workflow, contact your liaison librarian.

Increasingly, there is an expectation that researchers will share their data. Data sharing can be a complex endeavor and, though we think very highly of Dryad, Lane Library recommends that you choose the method for sharing that is right for you and your data. Answering the questions below will help guide you through this process. For additional assistance, please see our upcoming classes and events page for workshops related to data management and sharing or contact your liaison librarian.

Do the groups that fund or publish your work specify where your data should be shared?

In some cases, your research funder or the journal publishing your work will specify that your data should be shared through a specific repository. For example, some projects funded by the National Institute of Mental Health are expected to share their data through the NIMH Data Archive. In cases like this, we recommend that you share your data through the required repository.

Please note that some requirements state that data should be shared, but do not specify where. In such cases, refer to the next question.

Do researchers who work with similar data typically share it in a certain place?

If your research community typically shares the type of data you are looking to share through a specific repository, we generally recommend that you use that repository. To find repositories specialized for particular types of data, we recommend searching the Registry of Research Data Repositories (Re3Data).

If there is not a repository that is specific to the type of data you are working with, or if you have other concerns about sharing your data, see the next question.

Are there particular characteristics of your data that you think might affect how it can be shared?

Certain characteristics of your data may determine how and where it can be shared. For example, if you are working with big data (over 300 GB) or data that contains personally identifying information, we recommend scheduling a consultation with your liaison librarian so we can refer you to the appropriate group on campus to help you determine your options for making your data available.

However, if you are simply looking for a general-purpose data repository, we strongly recommend Dryad. Stanford Libraries also maintains the Stanford Digital Repository (SDR), which is recommended for Stanford University affiliates.
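The three questions above can be read as a simple decision sequence. The Python sketch below is one interpretation of that sequence, not an official tool: the class, field, and function names are hypothetical, and "other concerns" is reduced to the two characteristics named in this guide (data over 300 GB or data containing personally identifying information).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Dataset:
    """Hypothetical summary of the answers to the three questions above."""
    required_repository: Optional[str] = None   # named by a funder or journal
    community_repository: Optional[str] = None  # commonly used for this data type
    size_gb: float = 0.0
    has_identifying_info: bool = False


def recommend(ds):
    """Return a recommendation string following the order of the questions."""
    # Question 1: a funder or publisher requirement takes priority.
    if ds.required_repository:
        return f"Use the required repository: {ds.required_repository}"

    # Characteristics that call for a consultation (Question 3).
    needs_consultation = ds.size_gb > 300 or ds.has_identifying_info

    # Question 2: follow community practice unless there are sharing concerns.
    if ds.community_repository and not needs_consultation:
        return f"Use the community repository: {ds.community_repository}"

    if needs_consultation:
        return "Schedule a consultation with your liaison librarian"

    # No requirement, no specialized repository, no special characteristics.
    return "Use a general-purpose repository such as Dryad"


if __name__ == "__main__":
    print(recommend(Dataset(required_repository="NIMH Data Archive")))
    print(recommend(Dataset(size_gb=500)))
    print(recommend(Dataset(size_gb=4)))
```

A real decision usually involves more nuance than four fields can capture, so treat the output as a starting point for a conversation rather than a final answer.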