Features
This page highlights features for administrators and power-users of a Dataverse installation.
See What is Dataverse? to learn about its Core Capabilities for researchers if you’re new to Dataverse.
Artificial Intelligence
A number of AI tools integrate with Dataverse.
Model Context Protocol (MCP) is a standard for AI Agents to communicate with tools and services.
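As a rough illustration of what this communication looks like, MCP messages are JSON-RPC 2.0 objects. The sketch below builds one request asking a server which tools it offers and one request calling a tool. The tool name `search_datasets` and its arguments are hypothetical and are not taken from any particular Dataverse MCP server.

```python
import json


def mcp_request(request_id, method, params=None):
    """Build a JSON-RPC 2.0 request object of the kind MCP uses."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return message


# Ask the server which tools it exposes.
list_tools = mcp_request(1, "tools/list")

# Call a tool by name. "search_datasets" and its arguments are hypothetical.
call_tool = mcp_request(
    2,
    "tools/call",
    {"name": "search_datasets", "arguments": {"query": "climate"}},
)

print(json.dumps(list_tools))
print(json.dumps(call_tool))
```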
Access and Download
Facets are data driven and customizable per collection.
A preview is available for text, tabular, image, audio, video, and geospatial files.
Create a URL for reviewers to view an unpublished (and optionally anonymized) dataset.
Optionally collect data about who is downloading the files from your datasets.
Proprietary tabular formats are converted into TSV and RData for download.
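Because the converted TSV is plain tab-separated text, it can be read with standard tools. The following is a minimal sketch, assuming the TSV starts with a header row. The sample content is made up and stands in for a file downloaded from a dataset.

```python
import csv
import io

# Made-up stand-in for a TSV file downloaded from a dataset.
sample_tsv = "id\tspecies\tweight_g\n1\tfinch\t18.5\n2\tsparrow\t24.0\n"


def read_tsv(text):
    """Parse tab-separated text with a header row into a list of dicts."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return list(reader)


rows = read_tsv(sample_tsv)
print(rows[0]["species"])
```

All values come back as strings, so numeric columns such as the hypothetical `weight_g` need an explicit conversion (for example `float(row["weight_g"])`).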
Administration
Dashboard for common user-related tasks.
Set limits on the number of files, amount of storage, etc.
Download counters, support for Make Data Count.
In-app and email notifications for access requests, requests for review, etc. can be muted.
Authentication
Single Sign On (SSO) using your institution’s credentials.
Log in using popular OAuth2 providers.
Log in using your institution’s identity provider or a third party.
Customization
Your installation can be branded with a custom homepage, header, footer, CSS, etc.
The Dataverse software has been translated into multiple languages.
Each personal or organizational collection can be customized and branded.
Embed listings of data in external websites.
FAIR Data Publication
Findable, Accessible, Interoperable, Reusable.
The history of changes to datasets and files is preserved.
Datasets start as drafts and can be submitted for review before publication. During review, curators can mark datasets with curation status labels.
Integrate with the Local Contexts platform, enabling the use of Traditional Knowledge Labels, Biocultural Labels, and Notices.
File Management
Users can control dataset file hierarchy and directory structure.
Control who can download files and choose whether or not to enable a “Request Access” button.
Make files inaccessible until an embargo end date.
Make files inaccessible once the retention period set has passed.
Populate dataset metadata fields from tabular, NetCDF, HDF5, and FITS files.
Choose between filesystem or object storage, configurable per collection and per dataset.
After a permission check, files can pass freely and directly between a client computer and S3.
File fixity can be checked using MD5, SHA-1, SHA-256, SHA-512, or UNF.
Each data file can have any number of auxiliary files for documentation or other purposes (experimental).
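The checksum algorithms listed above (MD5, SHA-1, SHA-256, SHA-512) can also be computed locally to verify a downloaded file. The sketch below uses Python's standard `hashlib` and is not the Dataverse implementation. UNF is left out because it is not part of `hashlib`. The helper names `file_checksums` and `matches` are made up for this example.

```python
import hashlib


def file_checksums(path, algorithms=("md5", "sha1", "sha256", "sha512"),
                   chunk_size=65536):
    """Return {algorithm: hex digest} for the file at `path`.

    The file is read in chunks so large files do not need to fit in memory.
    """
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


def matches(path, algorithm, expected_hex):
    """Compare a locally computed digest with a published checksum value."""
    actual = file_checksums(path, algorithms=(algorithm,))[algorithm]
    return actual == expected_hex.strip().lower()
```

A typical use is to call `matches(path, "md5", published_value)` after a download and to fetch the file again if it returns `False`.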
Geospatial Data Support
There is a dedicated geospatial metadata block.
GeoJSON, GeoTIFF, and Shapefiles can be previewed as a map.
Pass geo_point and geo_radius to find datasets based on their bounding box.
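The following sketch shows how `geo_point` and `geo_radius` might be passed as query parameters. It assumes they are sent to the Search API at `/api/search`, that `geo_point` takes the form `latitude,longitude`, and that `geo_radius` is a distance in kilometers. The server address `https://dataverse.example.edu` is a placeholder, and the function name is made up for this example.

```python
from urllib.parse import urlencode


def geo_search_url(base_url, latitude, longitude, radius_km, query="*"):
    """Build a search URL restricted to a radius around a point.

    Assumes geo_point is "latitude,longitude" and geo_radius is in kilometers.
    """
    params = {
        "q": query,
        "geo_point": f"{latitude},{longitude}",
        "geo_radius": radius_km,
    }
    return f"{base_url.rstrip('/')}/api/search?{urlencode(params)}"


# Placeholder installation URL; replace with a real one before use.
url = geo_search_url("https://dataverse.example.edu", 42.3, -71.1, 1.5)
print(url)
```

The resulting URL can be fetched with any HTTP client. `urlencode` percent-encodes the comma and the asterisk, which a server decodes back to the original values.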
Integrations
DOIs are reserved, and when datasets are published, their metadata is published to DataCite.
Handles are a type of Persistent ID (PID) that can be used as an alternative to DOIs.
Upload from and download to Dataverse using Globus endpoints.
Exchange data and metadata with RSpace (e.g. IGSN ID). For example, a Data Management Plan (DMP) can be uploaded to RSpace and updated with the DOI of a Dataverse dataset.
A GitHub Action is available to upload files from GitHub to a dataset.
Pull data from an iRODS instance to a Dataverse dataset.
Upload files stored on Dropbox.
Datasets can be opened in Binder to run code in Jupyter notebooks, RStudio, and other computation environments. They can also be previewed in Dataverse itself.
Import files directly from Dataverse into Galaxy, and publish datasets containing artifacts (histories, datasets, etc.) from Galaxy to Dataverse.
Enable additional features not built in to the Dataverse software.
Dataverse integrates with a wide variety of third party systems, some of which are highlighted above.
Interoperability
Available APIs include the Search API, Data Deposit API, Data Access API, Metrics API, and Migration API, among others, with client libraries in various languages.
Serve and harvest metadata to and from other systems (e.g. DataCite or other Dataverse installations) using standardized metadata formats.
Used by Google Dataset Search and other services for discoverability.
Export metadata as linked data following the Croissant ontology.
Enable easier machine access to datasets by adding a linkset in a Dataverse header.
Let users pick from external vocabularies (provided via API/SKOSMOS) when filling in metadata.
For preservation, bags can be sent to the local filesystem, Duracloud, and Google Cloud.
Export dataset metadata as an ro-crate.json.
Reusability
Users can select from multiple standard licenses and from custom licenses provided by the installation.
Users can write custom terms of use in place of a predefined license.
Citations can be downloaded as EndNote XML, RIS, BibTeX, or in 1000+ CSL formats at the dataset or file level.
At the file level, upload standard W3C provenance files or enter free text instead.
Allow publication of a dataset to trigger external processes and integrations.