Towards Advanced Monitoring for Scientific Workflows

11/23/2022
by Jonathan Bader, et al.
Berlin Institute of Technology (Technische Universität Berlin)
Technische Universität Darmstadt
Humboldt-Universität zu Berlin
Zuse Institute Berlin

Scientific workflows consist of thousands of highly parallelized tasks executed in a distributed environment involving many components. Because the resulting volume of data cannot be analyzed manually, the performance metrics, traces, and behavior of components and tasks must be traced and investigated automatically, and presented to the end user at a suitable level of abstraction. The execution and monitoring of scientific workflows involves many components: the cluster infrastructure, its resource manager, the workflow, and the workflow tasks. Each component in such an execution environment has access to different monitoring metrics and provides them at a different abstraction level. The combination and analysis of metrics observed by different components, and of their interdependencies, have received little attention so far. We specify four monitoring layers that can serve as an architectural blueprint for the monitoring responsibilities and interactions of components in the scientific workflow execution context. We describe the monitoring metrics that belong to each of the four layers and how the layers interact. Finally, we examine five state-of-the-art scientific workflow management systems (SWMS) to assess which steps are needed to enable our four-layer approach.
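
The idea of combining metrics observed at different layers can be illustrated with a minimal Python sketch. It is not the authors' implementation. The layer names are an assumption derived from the four components listed in the abstract, and every identifier, metric name, and value below is hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Layer(Enum):
    # Assumed layer names, derived from the four components named in the abstract.
    INFRASTRUCTURE = "infrastructure"
    RESOURCE_MANAGER = "resource_manager"
    WORKFLOW = "workflow"
    TASK = "task"


@dataclass(frozen=True)
class Metric:
    layer: Layer
    source: str   # hypothetical identifier, e.g. a node name or a task id
    name: str     # hypothetical metric name
    value: float


def task_with_node_context(metrics, task_id, node):
    """Combine a task's own metrics with the metrics of the node that ran it."""
    view = {}
    for m in metrics:
        if m.layer is Layer.TASK and m.source == task_id:
            view[f"task.{m.name}"] = m.value
        elif m.layer is Layer.INFRASTRUCTURE and m.source == node:
            view[f"node.{m.name}"] = m.value
    return view


# Hypothetical samples, one or more per layer.
samples = [
    Metric(Layer.TASK, "align_001", "cpu_seconds", 42.0),
    Metric(Layer.INFRASTRUCTURE, "node-a", "cpu_utilization", 0.93),
    Metric(Layer.INFRASTRUCTURE, "node-b", "cpu_utilization", 0.10),
    Metric(Layer.RESOURCE_MANAGER, "scheduler", "pending_tasks", 7.0),
]

# Assumes the task-to-node placement is known, e.g. from the resource manager.
combined = task_with_node_context(samples, "align_001", "node-a")
```

Here `combined` holds the task-level metric next to the utilization of the node that executed the task, which is the kind of cross-layer view that no single component can provide on its own.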
