WfBench: Automated Generation of Scientific Workflow Benchmarks

10/06/2022
by   Tainã Coleman, et al.

The prevalence of scientific workflows with high computational demands calls for their execution on various distributed computing platforms, including large-scale leadership-class high-performance computing (HPC) clusters. To handle the deployment, monitoring, and optimization of workflow executions, many workflow systems have been developed over the past decade. There is a need for workflow benchmarks that can be used to evaluate the performance of workflow systems on current and future software stacks and hardware platforms. We present a generator of realistic workflow benchmark specifications that can be translated into benchmark code to be executed with current workflow systems. Our approach generates workflow tasks with arbitrary performance characteristics (CPU, memory, and I/O usage) and with realistic task dependency structures based on those seen in production workflows. We present experimental results that show that our approach generates benchmarks that are representative of production workflows, and conduct a case study to demonstrate the use and usefulness of our generated benchmarks to evaluate the performance of workflow systems under different configuration scenarios.
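As a rough illustration of what a workflow benchmark specification might contain, the sketch below describes tasks by their CPU, memory, and I/O demands and links them in a fork-join dependency structure. Every name here (`TaskSpec`, `fork_join`, `execution_order`, the field names, and the numeric values) is hypothetical and chosen for illustration only. This is not the WfBench schema or API, and the fork-join shape is a stand-in for the production-derived structures the abstract refers to.

```python
import json
from dataclasses import dataclass, field, asdict
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass
class TaskSpec:
    """Hypothetical description of one benchmark task (not the WfBench schema)."""
    name: str
    cpu_work: int       # abstract units of CPU work to perform
    memory_bytes: int   # amount of memory the task should touch
    output_bytes: int   # size of the file the task writes
    parents: list = field(default_factory=list)  # names of tasks that must finish first


def fork_join(width: int) -> list:
    """Build a fork-join structure: one split task, `width` workers, one merge."""
    split = TaskSpec("split", cpu_work=10, memory_bytes=1 << 20, output_bytes=1 << 20)
    workers = [
        TaskSpec(f"work_{i}", cpu_work=100, memory_bytes=1 << 24,
                 output_bytes=1 << 18, parents=["split"])
        for i in range(width)
    ]
    merge = TaskSpec("merge", cpu_work=20, memory_bytes=1 << 22,
                     output_bytes=1 << 16, parents=[w.name for w in workers])
    return [split, *workers, merge]


def execution_order(tasks: list) -> list:
    """Return task names in an order that respects every dependency."""
    graph = {t.name: set(t.parents) for t in tasks}  # node -> predecessors
    return list(TopologicalSorter(graph).static_order())


def to_json(tasks: list) -> str:
    """Serialize the specification so another tool could translate it into code."""
    return json.dumps([asdict(t) for t in tasks], indent=2)


workflow = fork_join(width=3)
order = execution_order(workflow)
spec_json = to_json(workflow)
print(order)
```

Keeping the specification as plain data, separate from any execution logic, is what would let the same description be translated for different workflow systems. The translation step itself is outside the scope of this sketch.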

