START
Select a generalist repository for your NIH-funded research data
Background
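- The NIH Data Management and Sharing (DMS) Policy (effective January 2023) requires researchers whose NIH-funded work generates scientific data to submit a DMS Plan and share their data in line with it
- The NIH Generalist Repository Ecosystem Initiative (GREI) supports eight generalist repositories that provide broad, discipline-agnostic data sharing options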
- Discipline-specific repositories are preferred when available; generalist repositories serve as a universal fallback
Is there a designated or discipline-specific repository recommended for your data?
Check your funder requirements and field norms first
How to check
- Review your NIH Funding Opportunity Announcement (FOA; now called a Notice of Funding Opportunity, NOFO) for specific repository requirements
- Consult the NIH Scientific Data Sharing website for discipline-specific repositories
- Check community standards in your field (e.g., GenBank for sequences, PDB for structures, dbGaP for controlled-access human genomic data)
- Ask your institutional library or data services team for guidance
If yes: Use the designated discipline-specific repository
This is the recommended first choice under NIH policy
If no: Proceed to evaluate generalist repository options
Continue below
Does your institution have a data repository, preferred repository, or membership with a generalist repository?
What to look for
- Institutional data repository (e.g., university-hosted Dataverse installation)
- Preferred data repository listed in institutional data management policies
- Institutional membership with a generalist repository (e.g., DR Dryad, FI Figshare for Institutions)
- Contact your institutional library, research data services, or IT department
If yes: Use your institution's repository
Verify it meets NIH DMS Policy requirements
If no: Choose from GREI generalist repositories
Evaluate based on your data requirements below
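The first two questions can be read as a short chain of checks. The sketch below is only an illustration of that ordering; the function name `first_step` and its two boolean parameters are hypothetical and do not come from any NIH or GREI tool.

```python
def first_step(has_discipline_repo: bool, has_institutional_option: bool) -> str:
    """Return the recommended next step for the first two questions.

    Both parameters are hypothetical inputs that you answer yourself
    after checking funder requirements and institutional policies.
    """
    # A designated or discipline-specific repository takes priority.
    if has_discipline_repo:
        return "Use the designated discipline-specific repository"
    # Otherwise, fall back to an institutional repository or membership.
    if has_institutional_option:
        return "Use your institution's repository (verify NIH DMS Policy compliance)"
    # Otherwise, continue with the generalist repository questions.
    return "Choose from GREI generalist repositories"


print(first_step(has_discipline_repo=False, has_institutional_option=True))
```

The order of the checks matters: a discipline-specific repository wins even when an institutional option also exists.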
Does your dataset contain human subjects, clinical trial, PHI, or PII data?
This determines which repositories can host your data and what access controls are needed
Types of sensitive data
- Human subjects data: Data collected from human participants in research studies
- Clinical trial data: Individual participant-level data from clinical studies
- PHI (Protected Health Information): Data covered under HIPAA
- PII (Personally Identifiable Information): Data that can identify an individual
- Important: Data must be de-identified or anonymised before deposit in most generalist repositories
If yes: Repositories supporting individual-level human data
Anonymise or de-identify data before submission
Best practice: employ managed access controls
Managed access support by repository
- SY Synapse: User-enabled sharing settings and ACT controlled access
- VI Vivli: Fully managed access with Independent Review Panel (IRP)
- HD Harvard Dataverse: Request and grant access workflow
- FI Figshare: Private link and embargo features
- OS OSF: Request access and private sharing settings
- MD Mendeley Data: Managed access supported
- ZE Zenodo: Access request workflow for restricted content
- DR Dryad: No managed access (CC0 only, public data)
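For quick filtering, the list above can be held as a small lookup table. This is a minimal sketch that copies the entries as written here; the names `MANAGED_ACCESS` and `repositories_with_managed_access` are hypothetical, and the feature descriptions should be checked against each repository's current documentation.

```python
# Managed access features as listed above; None means no managed access.
MANAGED_ACCESS = {
    "SY": ("Synapse", "User-enabled sharing settings and ACT controlled access"),
    "VI": ("Vivli", "Fully managed access with Independent Review Panel (IRP)"),
    "HD": ("Harvard Dataverse", "Request and grant access workflow"),
    "FI": ("Figshare", "Private link and embargo features"),
    "OS": ("OSF", "Request access and private sharing settings"),
    "MD": ("Mendeley Data", "Managed access supported"),
    "ZE": ("Zenodo", "Access request workflow for restricted content"),
    "DR": ("Dryad", None),  # CC0 only, public data
}


def repositories_with_managed_access() -> list[str]:
    """Return names of repositories that list some managed access feature."""
    return [name for name, feature in MANAGED_ACCESS.values() if feature is not None]


print(repositories_with_managed_access())
```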
If no: All eight GREI repositories are potential options
Evaluate based on features below
Does your data require managed access?
Control who can access your data and under what conditions
Do you need help complying with FAIR principles?
Findable, Accessible, Interoperable, Reusable
FAIR support options
- General deposit (free): All repositories provide basic metadata forms and DOI assignment
- Moderation and support: Harvard Dataverse (HD), Dryad (DR), and Figshare (FI) provide moderation of deposits
- Research environment: Synapse (SY) provides a collaborative research environment with built-in provenance tracking
- Independent Review Panel: Vivli (VI) provides expert review of data access requests for clinical trial data
What are your storage size requirements?
Repositories vary significantly in storage limits and associated costs
What licence options are suitable for your data?
Determines reuse conditions for your shared data
Common licence options
- CC0 (no restrictions): Maximises reuse. Required by DR Dryad. Supported by all repositories
- CC BY 4.0 (attribution): Requires citation. Supported by most repositories
- CC BY-NC (non-commercial): Restricts commercial reuse. Supported by Mendeley Data (MD) and Figshare (FI)
- Software licences (MIT, Apache, GPL, BSD): For code and tools. Supported by Synapse (SY), Figshare (FI), Mendeley Data (MD), OSF (OS), and Zenodo (ZE)
- Custom/other: OSF (OS) and Zenodo (ZE) support custom licence files
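When a project needs more than one licence type (for example, data plus code), the list above can be intersected. The sketch below is illustrative only: `LICENCE_SUPPORT` and `repos_for` are hypothetical names, and because the list says CC BY 4.0 is supported by "most repositories" without naming them, that entry is left as unknown rather than guessed.

```python
ALL_REPOS = {"SY", "VI", "HD", "FI", "OS", "MD", "ZE", "DR"}

# Repository codes per licence type, copied from the list above.
LICENCE_SUPPORT = {
    "CC0": ALL_REPOS,
    "CC BY 4.0": None,  # "most repositories": not enumerated, so treated as unknown
    "CC BY-NC": {"MD", "FI"},
    "Software": {"SY", "FI", "MD", "OS", "ZE"},
    "Custom": {"OS", "ZE"},
}


def repos_for(licences):
    """Return sorted codes of repositories listed for every requested licence.

    Licences with unknown support (None) do not narrow the result,
    so confirm those with the repository directly.
    """
    result = set(ALL_REPOS)
    for licence in licences:
        supported = LICENCE_SUPPORT[licence]
        if supported is not None:
            result &= supported
    return sorted(result)


print(repos_for(["CC BY-NC", "Software"]))  # ['FI', 'MD']
print(repos_for(["Software", "Custom"]))    # ['OS', 'ZE']
```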
What curation and support services do you need?
Options range from free self-service to comprehensive paid curation
Service tiers
- General deposit (free): Self-service upload with basic metadata. All repositories
- Deposit moderation (free): Staff review of submissions. HD, DR, FI
- Research environment (paid): Collaborative workspace with compute, provenance, governance. SY
- Independent Review Panel (paid): Expert panel for managed access decisions. VI
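The tiers above can also be looked up per repository. This is a rough sketch under two assumptions: the cost marker on the last two tiers is read as "a fee applies", and the names `SERVICE_TIERS` and `tiers_for` are hypothetical. Actual pricing is not stated here and is not modelled.

```python
ALL_REPOS = ["SY", "VI", "HD", "FI", "OS", "MD", "ZE", "DR"]

# Service tiers as listed above: cost label and the repository codes offering them.
SERVICE_TIERS = {
    "General deposit": {"cost": "free", "repos": ALL_REPOS},
    "Deposit moderation": {"cost": "free", "repos": ["HD", "DR", "FI"]},
    "Research environment": {"cost": "paid", "repos": ["SY"]},
    "Independent Review Panel": {"cost": "paid", "repos": ["VI"]},
}


def tiers_for(repo_code, free_only=False):
    """Return the service tiers listed for a repository code."""
    return [
        tier
        for tier, info in SERVICE_TIERS.items()
        if repo_code in info["repos"] and (not free_only or info["cost"] == "free")
    ]


print(tiers_for("DR"))                  # ['General deposit', 'Deposit moderation']
print(tiers_for("SY", free_only=True))  # ['General deposit']
```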
Select your licence and deposit your data
Visit Creative Commons Licence Chooser for help selecting the right licence
Licence selection guidance
- CC0: Best for maximum reuse and discoverability. Required by Dryad. Recommended by NIH for most research data
- CC BY 4.0: Use when you want attribution. Good for datasets where citation tracking matters
- CC BY-NC: Restricts commercial use. Consider whether this aligns with NIH open-science goals
- Software licences: Use MIT, Apache 2.0, or BSD for code and analysis scripts. Use GPL if you want derivative works to remain open
Before depositing
- Ensure your data management and sharing plan (DMS Plan) is consistent with the repository and licence you select
- Verify that your chosen repository issues DOIs and supports the metadata standards required by your funder
- Check whether your institution has a preferred or mandated repository
- For sensitive data, confirm that access controls are in place before making any data available
END
Data deposited in a GREI-supported generalist repository