Research Guides: Data Management and Sharing: Checklist

Using This Checklist

The checklist on this page is intended to help you get started with integrating data management into your research practice.

This checklist can help you identify gaps and communicate elements of data management to members of your research team. It is best applied to one research project at a time, because practices and procedures may vary considerably between projects. It is not intended to cover every aspect of data management across all types of research.

Some data management practices that are specific to your research, your type(s) of data, and your needs as a researcher are likely not covered. Likewise, some items on the checklist may not apply to the type(s) of data you are working with.

Please feel free to modify this checklist or adapt it to better fit your needs.

A version of this guide is attached to the following publication:

Borghi, J. A., & Van Gulick, A. E. (2022). Promoting Open Science Through Research Data Management. Harvard Data Science Review, 4(3). https://doi.org/10.1162/99608f92.9497f68e

Data Management Checklist
A downloadable version of the data management checklist that was developed at Lane Medical Library.

We have done the following at the beginning of our project:

	We have reviewed all applicable policies from Stanford University, including the data access and retention policy, Stanford’s risk classification system, and the departing personnel policy. If applicable, we have completed a data risk assessment.
	We have read through and understand other relevant agreements, licenses, or other requirements related to our data (e.g., data use agreements, IRB or funder policies).
	Research team members have ORCID iDs that can be applied to the products of this research process (e.g., papers, datasets).
	We have sought out community standards and best practices related to our data.
	We have discussed the intended products of this project (e.g., papers, datasets, software tools) and have decided to what extent we will be making our data and other materials available to others.

We have a plan:

	We maintain documentation that describes the type(s) of data we are collecting, analyzing, or otherwise working with over the course of the project, as well as details about materials that are needed to understand and use the data (e.g., documentation, code).
	We maintain documentation that describes the specific data management practices (e.g., file naming, backing up data) employed throughout the course of this project.
	We maintain documentation that outlines the roles and responsibilities of individual team members related to managing data (e.g., maintaining good documentation, following file naming conventions) as well as who is ultimately responsible for ensuring the data is properly managed throughout the course of the project and following its conclusion.
	Members of the research team have access to the above documentation and review it periodically.

We are keeping our data organized:

	We have a standardized set of practices related to saving datasets and other project materials while we are working with them (e.g., digital data is saved on a lab server).
	Our practices related to saving data are in line with Stanford’s risk classification system and, when possible and appropriate, include multiple backups.
	We have standardized conventions for naming project-related objects and files (including datasets) that enable us to quickly identify the materials we are looking for.
	We have standardized systems for organizing project-related objects or files that enable us to easily find the materials we are looking for (e.g., a standardized file structure).
	When applicable, we have standardized systems for naming and organizing information within our data files (e.g., standardized variable names, tidy spreadsheets).
	Our practices related to saving, organizing, and describing data files have been optimized to facilitate quality control.
	Our practices related to saving, organizing, and describing data files are in line with community standards and best practices.
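
As one illustration of a standardized file naming convention, the sketch below checks file names against a pattern. The pattern itself (date, project, description, version, extension) and the function names are assumptions made for this example only; they are not part of the checklist, and your team's convention will likely differ.

```python
import re

# Hypothetical convention (an assumption for this example):
#   YYYYMMDD_project_description_vNN.ext
# e.g. 20240115_sleepstudy_raw-actigraphy_v01.csv
NAME_PATTERN = re.compile(
    r"^(?P<date>\d{8})"
    r"_(?P<project>[a-z0-9]+)"
    r"_(?P<description>[a-z0-9-]+)"
    r"_v(?P<version>\d{2})"
    r"\.(?P<ext>[a-z0-9]+)$"
)


def check_name(filename):
    """Return the parsed parts of a conforming file name, or None."""
    match = NAME_PATTERN.match(filename)
    return match.groupdict() if match else None


def find_nonconforming(filenames):
    """Return the file names that do not follow the convention."""
    return [name for name in filenames if check_name(name) is None]


if __name__ == "__main__":
    names = [
        "20240115_sleepstudy_raw-actigraphy_v01.csv",
        "final data (2).xlsx",
    ]
    for name in find_nonconforming(names):
        print(f"Does not follow the naming convention: {name}")
```

A check like this could be run periodically over a project folder as part of quality control. The convention itself should still be written down in the project documentation so that new team members can follow it.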

We are keeping good records:

	We maintain documentation that describes how we keep datasets and other materials organized while we are working with them (e.g., naming conventions, file structures).
	We have standardized procedures for documenting the structure and contents of individual data files (e.g., maintaining codebooks, data dictionaries).
	We have standardized procedures for documenting project-related decisions, steps, procedures, and workflows (e.g., maintaining protocols, lab notebooks).
	We have standardized procedures for saving and versioning research-related code and other elements of the research process (e.g., workflows, software containers).
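
A data dictionary can be started from the data file itself. The following is a minimal sketch, assuming the data is stored as CSV with a header row; the function name, the output fields, and the sample columns are hypothetical and chosen only for illustration. It produces a skeleton that the research team would still need to complete with descriptions, units, and allowed values.

```python
import csv
import io


def build_data_dictionary(csv_text):
    """Summarize each column: name, an example value, and missing count."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    entries = []
    for name in reader.fieldnames or []:
        values = [row[name] for row in rows]
        # Treat empty strings and absent fields as missing.
        present = [v for v in values if v not in ("", None)]
        entries.append({
            "variable": name,
            "example": present[0] if present else "",
            "n_missing": len(values) - len(present),
            "description": "",  # to be filled in by the research team
        })
    return entries


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = (
        "participant_id,age_years,group\n"
        "p01,34,control\n"
        "p02,,treatment\n"
    )
    for entry in build_data_dictionary(sample):
        print(entry)
```

The generated skeleton only captures what can be read from the file. The meaning of each variable still has to be documented by someone who knows the data.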

We are checking up on ourselves:

	Members of the research team are actually following the practices and procedures we have decided upon.
	Study documentation is updated regularly to reflect any changes to data management-related practices and procedures.
	We have established procedures for onboarding new team members to our data management practices, educating members about changes to existing practices, and managing data as team members move on to new projects. When appropriate, these procedures comply with the departing personnel policy.
