DataFed: Towards Reproducible Research via Federated Data Management

04/07/2020
by Dale Stansberry, et al.
The increasingly collaborative, globalized nature of scientific research, combined with the need to share data and the explosion in data volumes, creates an urgent need for a scientific data management system (SDMS). An SDMS presents a logical and holistic view of data that greatly simplifies and empowers data organization, curation, searching, sharing, dissemination, etc. We present DataFed, a lightweight, distributed SDMS that spans a federation of storage systems within a loosely coupled network of scientific facilities. Unlike existing SDMS offerings, DataFed uses high-performance and scalable user management and data transfer technologies that simplify its deployment, maintenance, and expansion. DataFed provides web-based and command-line interfaces to manage data and integrate with complex scientific workflows. DataFed represents a step towards reproducible scientific research by enabling reliable staging of the correct data in the desired environment.
