Co-scheduling Ensembles of In Situ Workflows

08/19/2022
by Tu Mai Anh Do, et al.

Molecular dynamics (MD) simulations are widely used to study large-scale molecular systems. HPC systems are ideal platforms to run these studies; however, reaching the simulation timescale necessary to detect rare processes is challenging, even with modern supercomputers. To overcome the timescale limitation, the simulation of a long MD trajectory is replaced by multiple short-range simulations that are executed simultaneously in an ensemble of simulations. Analyses are usually co-scheduled with these simulations, using in situ techniques, to efficiently process the large volumes of data that the simulations generate at runtime. Executing a workflow ensemble of simulations and their in situ analyses requires efficient co-scheduling strategies and sophisticated management of computational resources so that the simulations and analyses do not slow each other down. In this paper, we propose an efficient method to co-schedule simulations and in situ analyses such that the makespan of the workflow ensemble is minimized. We present a novel approach to allocate resources for a workflow ensemble under resource constraints by using a theoretical framework that models the workflow ensemble's execution. We evaluate the proposed approach on various workflow ensemble configurations using an accurate simulator based on the WRENCH simulation framework. Results demonstrate the significance of co-scheduling the simulations and in situ analyses that are coupled through their data, so that they benefit from data locality: inefficient scheduling decisions can lead to a slowdown in makespan of up to a factor of 30.
