AiiDA 1.0, a scalable computational infrastructure for automated reproducible workflows and data provenance

03/24/2020
by Sebastiaan P. Huber, et al.

The ever-growing availability of computing power and the sustained development of advanced computational methods have contributed much to recent scientific progress. These developments present new challenges driven by the sheer number of calculations and the volume of data to manage. Next-generation exascale supercomputers will intensify these challenges, making automated and scalable solutions crucial. In recent years, we have been developing AiiDA (http://www.aiida.net), a robust open-source high-throughput infrastructure that addresses the challenges of automated workflow management and data provenance recording. Here, we introduce the developments and capabilities required to reach sustained performance, with AiiDA supporting throughputs of tens of thousands of processes per hour while automatically preserving and storing the full data provenance in a relational database, which makes it queryable and traversable and thus enables high-performance data analytics. AiiDA's workflow language provides advanced automation, error-handling features, and a flexible plugin model that allows interfacing with any simulation software. The associated plugin registry enables seamless sharing of extensions, empowering a vibrant user community dedicated to making simulations more robust, user-friendly, and reproducible.

research
07/17/2020

Workflows in AiiDA: Engineering a high-throughput, event-based engine for robust and modular computational workflows

Over the last two decades, the field of computational science has seen a...
research
02/26/2019

Workflow-Driven Distributed Machine Learning in CHASE-CI: A Cognitive Hardware and Software Ecosystem Community Infrastructure

The advances in data, computing and networking over the last two decades...
research
04/20/2022

Enabling Dynamic and Intelligent Workflows for HPC, Data Analytics, and AI Convergence

The evolution of High-Performance Computing (HPC) platforms enables the ...
research
07/30/2018

umd-verification: Automation of Software Validation for the EGI federated e-Infrastructure

Supporting e-Science in the EGI e-Infrastructure requires extensive and ...
research
04/27/2023

Developing Distributed High-performance Computing Capabilities of an Open Science Platform for Robust Epidemic Analysis

COVID-19 had an unprecedented impact on scientific collaboration. The pa...
research
12/13/2022

Automated Cache for Container Executables

Linux container technologies such as Docker and Singularity offer encaps...
research
08/13/2023

ensemblQueryR: fast, flexible and high-throughput querying of Ensembl LD API endpoints in R

We present ensemblQueryR, a package providing an R interface to the Ense...
