GREI Repository Selection Flowchart

Generalist Repository Ecosystem Initiative: A decision guide for selecting the right NIH-supported generalist repository for your research data

Version 3.2 | Last updated: 15 May 2026 | Source: NIH Office of Data Science Strategy

1

START

Select a generalist repository for your NIH-funded research data

Background

  • The NIH Data Management and Sharing (DMS) Policy (effective January 2023) requires researchers whose NIH-funded work generates scientific data to submit a DMS Plan and to share data in line with that plan
  • GREI supports eight generalist repositories to provide broad, discipline-agnostic data sharing options
  • Discipline-specific repositories are preferred when available; generalist repositories serve as a universal fallback
2

Is there a designated or discipline-specific repository recommended for your data?

Check your funder requirements and field norms first

How to check

  • Review your NIH Funding Opportunity Announcement (FOA) for specific repository requirements
  • Consult the NIH Scientific Data Sharing website for discipline-specific repositories
  • Check community standards in your field (e.g., GenBank for sequences, PDB for structures, dbGaP for controlled-access human genomic data)
  • Ask your institutional library or data services team for guidance
Yes

Use the designated discipline-specific repository

This is the recommended first choice under NIH policy

No

Proceed to evaluate generalist repository options

Continue below

3

Does your institution have a data repository, preferred repository, or membership with a generalist repository?

What to look for

  • Institutional data repository (e.g., university-hosted Dataverse installation)
  • Preferred data repository listed in institutional data management policies
  • Institutional membership with a generalist repository (e.g., Dryad (DR), Figshare for Institutions (FI))
  • Contact your institutional library, research data services, or IT department
Yes

Use your institution's repository

Verify it meets NIH DMS Policy requirements

No

Choose from GREI generalist repositories

Evaluate based on your data requirements below
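Steps 2 and 3 amount to a short priority rule, which can be sketched in code. This is only an illustration of the order of checks described above; the function name `first_choice` and the returned labels are hypothetical and are not part of any GREI or NIH tool.

```python
def first_choice(has_discipline_repository: bool,
                 has_institutional_repository: bool) -> str:
    """Return the repository category suggested by steps 2 and 3."""
    # Step 2: a designated or discipline-specific repository comes first.
    if has_discipline_repository:
        return "discipline-specific repository"
    # Step 3: otherwise use an institutional repository or membership,
    # after verifying that it meets NIH DMS Policy requirements.
    if has_institutional_repository:
        return "institutional repository"
    # Neither applies: continue to the GREI generalist repositories.
    return "GREI generalist repository"


if __name__ == "__main__":
    print(first_choice(has_discipline_repository=False,
                       has_institutional_repository=True))
```

The discipline-specific check runs first on purpose, so an institutional repository is only suggested when no designated repository exists for the data.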

Data Sensitivity Assessment
4

Does your dataset contain human subjects, clinical trial, PHI, or PII data?

This determines which repositories can host your data and what access controls are needed

Types of sensitive data

  • Human subjects data: Data collected from human participants in research studies
  • Clinical trial data: Individual participant-level data from clinical studies
  • PHI (Protected Health Information): Data covered under HIPAA
  • PII (Personally Identifiable Information): Data that can identify an individual
  • Important: Data must be de-identified or anonymised before deposit in most generalist repositories

Repositories supporting individual-level human data

  • Synapse (SY): Full support for de-identified human data in a FISMA Moderate environment, with tiered access controls
  • Vivli (VI): Specialised in anonymised individual participant-level clinical trial data with managed access
Yes

Anonymise or de-identify data before submission

Best practice: employ managed access controls

Managed access support by repository

  • Synapse (SY): User-enabled sharing settings and ACT controlled access
  • Vivli (VI): Fully managed access with Independent Review Panel (IRP)
  • Harvard Dataverse (HD): Request and grant access workflow
  • Figshare (FI): Private link and embargo features
  • OSF (OS): Request access and private sharing settings
  • Mendeley Data (MD): Managed access supported
  • Zenodo (ZE): Access request workflow for restricted content
  • Dryad (DR): No managed access (CC0 only, public data)
No

All eight GREI repositories are potential options

Evaluate based on features below

FAIR Compliance and Access
5

Does your data require managed access?

Control who can access your data and under what conditions

Types of managed access

  • Pre-publication embargo only: Supported by all eight repositories
  • User-managed access: Researcher controls who can access. Supported by: HD, FI, OS, MD, ZE
  • Repository-managed access: Repository governance manages decisions. Supported by: SY (ACT), VI (IRP)
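The three access types listed above can be held in a small lookup table. The sketch below uses only the two-letter repository codes from this guide; the keys (`"embargo"`, `"user-managed"`, `"repository-managed"`) and the function `repositories_for` are illustrative names chosen here, not an official vocabulary.

```python
# Two-letter codes as used in this guide.
ALL_REPOSITORIES = {"DR", "FI", "HD", "MD", "OS", "SY", "VI", "ZE"}

# Access type -> repositories listed as supporting it in step 5.
ACCESS_SUPPORT = {
    "embargo": ALL_REPOSITORIES,
    "user-managed": {"HD", "FI", "OS", "MD", "ZE"},
    "repository-managed": {"SY", "VI"},
}


def repositories_for(access_type: str) -> list:
    """Return the sorted repository codes listed for an access type."""
    if access_type not in ACCESS_SUPPORT:
        raise ValueError(f"unknown access type: {access_type!r}")
    return sorted(ACCESS_SUPPORT[access_type])


if __name__ == "__main__":
    print(repositories_for("repository-managed"))
```

Repository features change over time, so a table like this should be checked against each repository's current documentation before relying on it.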
6

Do you need help complying with FAIR principles?

Findable, Accessible, Interoperable, Reusable

FAIR support options

  • General deposit (free): All repositories provide basic metadata forms and DOI assignment
  • Moderation and support: HD, DR and FI provide moderation of deposits
  • Research environment: SY provides a collaborative research environment with built-in provenance tracking
  • Independent Review Panel: VI provides expert review of data access requests for clinical trial data
Technical Requirements
7

What are your storage size requirements?

Repositories vary significantly in storage limits and associated costs

Storage limits by repository

  • < 50 GB: All repositories (DR, FI, MD, OS and ZE support this tier for free)
  • < 300 GB: DR (up to 300 GB via browser upload, 1 TB with assistance), SY (free up to 100 GB)
  • < 1 TB: HD (1 TB per researcher), VI (included)
  • > 1 TB: SY (service plans), FI (Figshare+ up to 10 TB+)
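The storage tiers above can be read as a lookup from dataset size to the repositories listed for that tier. The following sketch does this literally, under two assumptions that the guide does not state: 1 TB is treated as 1000 GB, and a dataset of exactly 1 TB is placed in the top tier. The function name `listed_repositories` is hypothetical.

```python
ALL_REPOSITORIES = ["DR", "FI", "HD", "MD", "OS", "SY", "VI", "ZE"]

# (exclusive upper bound in GB, repositories listed for that tier)
STORAGE_TIERS = [
    (50, ALL_REPOSITORIES),
    (300, ["DR", "SY"]),
    (1000, ["HD", "VI"]),  # assumption: 1 TB = 1000 GB
]
LARGEST_TIER = ["SY", "FI"]  # listed for datasets above 1 TB


def listed_repositories(size_gb: float) -> list:
    """Return the repositories the guide lists for a dataset of this size."""
    if size_gb < 0:
        raise ValueError("size_gb must be non-negative")
    for upper_bound_gb, repositories in STORAGE_TIERS:
        if size_gb < upper_bound_gb:
            return list(repositories)
    # Assumption: exactly 1000 GB falls through to the top tier.
    return list(LARGEST_TIER)


if __name__ == "__main__":
    print(listed_repositories(120))
```

The result is only the set named for a given tier. Repositories listed under a larger tier may also accept smaller datasets, and free limits (for example SY up to 100 GB) still need to be checked individually.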
8

What licence options are suitable for your data?

Determines reuse conditions for your shared data

Common licence options

  • CC0 (no restrictions): Maximises reuse. Required by DR Dryad. Supported by all repositories
  • CC BY 4.0 (attribution): Requires citation. Supported by most repositories
  • CC BY-NC (non-commercial): Restricts commercial reuse. Supported by MD and FI
  • Software licences (MIT, Apache, GPL, BSD): For code and tools. Supported by SY, FI, MD, OS, ZE
  • Custom/other: OS and ZE support custom licence files
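When a project needs more than one licence type (for example data under CC BY-NC plus code under a software licence), the candidate repositories are the intersection of the lists above. A minimal sketch, assuming the licence keys and the function name `candidates` chosen here for illustration; CC BY 4.0 is left out because the guide says only "most repositories" without naming them.

```python
ALL_REPOSITORIES = frozenset({"DR", "FI", "HD", "MD", "OS", "SY", "VI", "ZE"})

# Licence type -> repositories listed as supporting it in step 8.
LICENCE_SUPPORT = {
    "CC0": ALL_REPOSITORIES,
    "CC BY-NC": frozenset({"MD", "FI"}),
    "software": frozenset({"SY", "FI", "MD", "OS", "ZE"}),
    "custom": frozenset({"OS", "ZE"}),
}


def candidates(*licences: str) -> list:
    """Return repositories listed as supporting every requested licence."""
    result = set(ALL_REPOSITORIES)
    for licence in licences:
        # A KeyError here means the licence is not covered by this table.
        result &= LICENCE_SUPPORT[licence]
    return sorted(result)


if __name__ == "__main__":
    print(candidates("CC BY-NC", "software"))
```

Dryad (DR) appears only under CC0 in this table, which matches the note that Dryad requires CC0.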
Services and Support
9

What curation and support services do you need?

Options range from free self-service to comprehensive paid curation

Service tiers

  • General deposit (free): Self-service upload with basic metadata. All repositories
  • Deposit moderation (free): Staff review of submissions. HD, DR, FI
  • Research environment (paid): Collaborative workspace with compute, provenance, governance. SY
  • Independent Review Panel (paid): Expert panel for managed access decisions. VI
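The service tiers above can also be queried per repository, which helps when comparing what a single candidate offers for free. This is a sketch only: the `ServiceTier` class and the `tiers_for` function are hypothetical names, and the free or paid flags simply mirror the markers in the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceTier:
    name: str
    paid: bool
    repositories: frozenset


ALL_REPOSITORIES = frozenset({"DR", "FI", "HD", "MD", "OS", "SY", "VI", "ZE"})

SERVICE_TIERS = (
    ServiceTier("General deposit", False, ALL_REPOSITORIES),
    ServiceTier("Deposit moderation", False, frozenset({"HD", "DR", "FI"})),
    ServiceTier("Research environment", True, frozenset({"SY"})),
    ServiceTier("Independent Review Panel", True, frozenset({"VI"})),
)


def tiers_for(repository: str, include_paid: bool = True) -> list:
    """Return the names of the service tiers listed for one repository."""
    return [
        tier.name
        for tier in SERVICE_TIERS
        if repository in tier.repositories and (include_paid or not tier.paid)
    ]


if __name__ == "__main__":
    print(tiers_for("SY", include_paid=False))
```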
10

Select your licence and deposit your data

Visit the Creative Commons Licence Chooser for help selecting the right licence

Licence selection guidance

  • CC0: Best for maximum reuse and discoverability. Required by Dryad. Recommended by NIH for most research data
  • CC BY 4.0: Use when you want attribution. Good for datasets where citation tracking matters
  • CC BY-NC: Restricts commercial use. Consider whether this aligns with NIH open-science goals
  • Software licences: Use MIT, Apache 2.0, or BSD for code and analysis scripts. Use GPL if you want derivative works to remain open

Before depositing

  • Ensure your data management and sharing plan (DMS Plan) is consistent with the repository and licence you select
  • Verify that your chosen repository issues DOIs and supports the metadata standards required by your funder
  • Check whether your institution has a preferred or mandated repository
  • For sensitive data, confirm that access controls are in place before making any data available
E

END

Data deposited in a GREI-supported generalist repository
