Publication Tools

Range, Jan

Dataverse Repository Sync

This is a GitHub workflow that synchronizes your repository with your Dataverse dataset, so you can manage your data with greater ease and efficiency. The workflow:

- Pushes all your repository files to your dataset
- Detects any changes and updates your dataset files accordingly
- Removes any files from your dataset that are not in your repository
- Lets you push content to any directory within your dataset

Besides being a publication, this dataset also serves as an example/test. It was created using the workflow outlined here, and any updates made to the repository will be reflected in this dataset.
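The synchronization behavior described above (push new files, update changed ones, remove files that are no longer in the repository) can be illustrated with a small sketch. This is not the workflow's actual implementation: the function names `local_checksums` and `plan_sync` are hypothetical, and comparing SHA-256 checksums is only one assumed way to detect changes.

```python
import hashlib
from pathlib import Path


def local_checksums(root: Path) -> dict[str, str]:
    """Map each file under root (relative POSIX path) to its SHA-256 digest.

    Hypothetical helper: the real workflow may detect changes differently.
    """
    checksums = {}
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        digest = hashlib.sha256()
        with path.open("rb") as handle:
            # Read in 64 KiB blocks so large files are not loaded at once.
            for block in iter(lambda: handle.read(65536), b""):
                digest.update(block)
        checksums[path.relative_to(root).as_posix()] = digest.hexdigest()
    return checksums


def plan_sync(local: dict[str, str], remote: dict[str, str]) -> dict[str, list[str]]:
    """Compare {path: checksum} maps and decide what a sync would do."""
    return {
        # In the repository but not yet in the dataset.
        "upload": sorted(p for p in local if p not in remote),
        # Present on both sides but with different content.
        "update": sorted(p for p in local if p in remote and local[p] != remote[p]),
        # In the dataset but no longer in the repository.
        "delete": sorted(p for p in remote if p not in local),
    }
```

Calling `plan_sync(local_checksums(Path(".")), remote)` with a `remote` map obtained from the dataset would yield the three lists of paths to upload, update, and delete. How the remote checksums are fetched is left out here on purpose.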

python

data management

dataverse

github

synchronization

upload

Range, Jan

EasyDataverse

EasyDataverse is a Python library for interfacing with Dataverse installations and generating Python code compatible with the metadatablock configuration of a given Dataverse installation. In addition, EasyDataverse lets you export and import datasets to and from various data formats.

Features:

- Classes compliant with the metadata configuration for flexible dataset creation.
- Upload and download of files and directories to and from Dataverse installations.
- Export and import of datasets to and from various formats (JSON, YAML, XML and HDF5).
- Fetch datasets from any Dataverse installation into an object-oriented structure ready to be integrated.

python

api

data management

dataverse

Range, Jan

EasyReview

EasyReview is a web tool designed to help review datasets stored in Dataverse. Reviewing datasets is an important part of research data management (RDM) for guaranteeing high-quality data. EasyReview breaks down the review process by looking at each field of the dataset individually. Reviewers can easily be invited to join the review through a provided link, and can share their feedback with others involved in the process.

javascript

python

dataverse

django

next.js

postgresql

quality control

react.js

Roy, Sarbani; Wang, Fangfang; Gläser, Dennis

Harvester-Curator, a tool to elevate metadata provision in data and/or software repositories

Harvester-Curator is a tool designed to elevate metadata provision in data repositories. In the first phase, Harvester-Curator acts as a scanner, navigating through user code and/or data repositories to identify suitable parsers for different file types. It collects metadata from each file by applying the corresponding parser and then compiles this information into a structured JSON file, providing researchers with a seamless and automated solution for metadata collection. In the second phase, Harvester-Curator acts as a curator, using the harvested metadata to populate metadata fields in a target repository. Automating this process relieves researchers of the manual burden and helps ensure the accuracy and comprehensiveness of the metadata. Beyond streamlining the intricate task of metadata collection, the tool contributes to the broader objective of improving data accessibility and interoperability within repositories.
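The first (harvesting) phase described above can be sketched roughly as follows: walk a directory, pick a parser based on file type, and compile the results into one JSON document. This is an illustrative sketch only, not Harvester-Curator's own code. The `PARSERS` registry, the two toy parsers, and the functions `harvest` and `write_report` are all hypothetical, and selecting a parser by file extension is an assumption.

```python
import csv
import json
from pathlib import Path
from typing import Callable


def parse_json(path: Path) -> dict:
    """Toy parser: record the top-level keys of a JSON object."""
    data = json.loads(path.read_text(encoding="utf-8"))
    return {"top_level_keys": sorted(data) if isinstance(data, dict) else []}


def parse_csv(path: Path) -> dict:
    """Toy parser: record the column names from the header row."""
    with path.open(newline="", encoding="utf-8") as handle:
        header = next(csv.reader(handle), [])
    return {"columns": header}


# Hypothetical registry mapping a file extension to its parser.
PARSERS: dict[str, Callable[[Path], dict]] = {
    ".json": parse_json,
    ".csv": parse_csv,
}


def harvest(root: Path) -> dict[str, dict]:
    """Scan root recursively and collect metadata from every supported file."""
    harvested = {}
    for path in sorted(root.rglob("*")):
        parser = PARSERS.get(path.suffix.lower())
        if path.is_file() and parser is not None:
            harvested[path.relative_to(root).as_posix()] = parser(path)
    return harvested


def write_report(root: Path, target: Path) -> None:
    """Compile the harvested metadata into a single structured JSON file."""
    target.write_text(json.dumps(harvest(root), indent=2), encoding="utf-8")
```

Files without a registered parser are skipped silently in this sketch. The second (curation) phase, which maps harvested values onto the metadata fields of a target repository, is not shown.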

python

automation

data submission, annotation, and curation

metadata harvesting

Range, Jan

Python DVUploader

Python equivalent to the DVUploader written in Java. Complements other libraries written in Python and facilitates the upload of files to a Dataverse instance via Direct Upload.

- Parallel direct upload to a Dataverse backend storage
- Files are streamed directly instead of being buffered in memory
- Supports multipart uploads and chunks data accordingly
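The streaming and chunking ideas listed above can be illustrated with a generic sketch that uses only the standard library. It does not reproduce the Python DVUploader API: `iter_chunks` and `plan_parts` are hypothetical names, and the actual network upload step is deliberately omitted.

```python
from typing import BinaryIO, Iterator


def iter_chunks(handle: BinaryIO, chunk_size: int) -> Iterator[bytes]:
    """Yield successive chunks from an open binary file.

    Only one chunk is held in memory at a time, which is the point of
    streaming a file instead of buffering it completely.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    while True:
        chunk = handle.read(chunk_size)
        if not chunk:
            break
        yield chunk


def plan_parts(total_size: int, part_size: int) -> list[tuple[int, int]]:
    """Return (offset, length) pairs describing the parts of a multipart upload."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    return [
        (offset, min(part_size, total_size - offset))
        for offset in range(0, total_size, part_size)
    ]
```

A caller could open a file with `open(path, "rb")`, iterate over `iter_chunks(handle, part_size)`, and hand each chunk to whatever upload call the storage backend expects. Because the parts from `plan_parts` are independent, they could also be sent in parallel, for example with a thread pool.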

python

data management

dataverse

file uploader