SIM-SITU: A Framework for the Faithful Simulation of in-situ Workflows

12/30/2021
by Valentin Honoré, et al.

The amount of data generated by numerical simulations in scientific domains such as molecular dynamics, climate modeling, biology, and astrophysics has led to a fundamental redesign of application workflows. The throughput and capacity of storage subsystems have not evolved as fast as the computing power of extreme-scale supercomputers. As a result, the classical post-hoc analysis of simulation outputs has become highly inefficient. In-situ workflows have emerged as a solution in which simulation and data analytics are intertwined through shared computing resources, thus lowering latencies. Determining the best allocation, i.e., how many resources to allocate to each component of an in-situ workflow, and the best mapping, i.e., where and at which frequency to run the data analytics component, is a complex task whose performance assessment is crucial to the efficient execution of in-situ workflows. However, evaluating the performance of different allocation and mapping strategies usually relies either on running them directly on the targeted execution environments, which can rapidly become extremely time- and resource-consuming, or on simulating simplified models of the components of an in-situ workflow, which can lack realism. In both cases, the validity of the performance evaluation is limited. To address this issue, we introduce SIM-SITU, a framework for the faithful simulation of in-situ workflows. This framework builds on the SimGrid toolkit and benefits from several important features of this versatile simulation tool. We designed SIM-SITU to reflect the typical structure of in-situ workflows, and thanks to its modular design, SIM-SITU has the flexibility needed to easily and faithfully evaluate the behavior and performance of various allocation and mapping strategies for in-situ workflows. We illustrate the simulation capabilities of SIM-SITU on a Molecular Dynamics use case. We study the impact of different allocation and mapping strategies on performance and show how users can leverage SIM-SITU to determine interesting tradeoffs when designing their in-situ workflow.

research
08/19/2022

Co-scheduling Ensembles of In Situ Workflows

Molecular dynamics (MD) simulations are widely used to study large-scale...
research
10/27/2020

In-situ data analytics for highly scalable cloud modelling on Cray machines

MONC is a highly scalable modelling tool for the investigation of atmosp...
research
01/23/2018

Task-parallel Analysis of Molecular Dynamics Trajectories

Different frameworks for implementing parallel data analytics applicatio...
research
05/21/2020

Mapping Matters: Application Process Mapping on 3-D Processor Topologies

Applications' performance is influenced by the mapping of processes to c...
research
01/20/2023

Adaptive Resource Allocation for Workflow Containerization on Kubernetes

In a cloud-native era, the Kubernetes-based workflow engine enables work...
research
12/11/2020

DataVault: A Data Storage Infrastructure for the Einstein Toolkit

Data sharing is essential in the numerical simulations research. We intr...
research
01/03/2018

Rapid, concurrent and adaptive extreme scale binding free energy calculation

The recently demonstrated ability to perform accurate, precise and rapid...
