Keeping Track of User Steering Actions in Dynamic Workflows

05/17/2019
by Renan Souza, et al.

In long-lasting scientific workflow executions on HPC machines, computational scientists (the users in this work) often need to fine-tune several workflow parameters. These tunings are done through user steering actions that may significantly improve performance (e.g., reduce execution time) or improve the overall results. However, in executions that last for weeks, users can lose track of what has been adapted if the tunings are not properly registered. In this work, we build on provenance data management to address the problem of tracking online parameter fine-tuning in dynamic workflows steered by users. We propose a lightweight solution to capture and manage the provenance of steering actions online with negligible overhead. The resulting provenance database relates tuning data to domain, dataflow provenance, execution, and performance data, and is available for analysis at runtime. We show how users may get a detailed view of the execution, which provides insights for deciding when and how to tune. We discuss the applicability of our solution in different domains and validate its ability to support online capture and analysis of parameter fine-tunings in a real workflow in the Oil and Gas industry. In this experiment, the user could determine which tuned parameters influenced simulation accuracy and performance. The observed overhead for keeping track of user steering actions at runtime is less than 1
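As an illustration only, the idea of relating steering actions to execution and performance data can be sketched with a small relational store. Everything below is an assumption made for the sketch: the table names, columns, and helper functions are hypothetical and are not taken from the authors' system, and SQLite merely stands in for whatever provenance database is actually used.

```python
import sqlite3


def open_provenance_db(path=":memory:"):
    """Create a tiny, hypothetical provenance store (not the authors' schema)."""
    conn = sqlite3.connect(path)
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS task_execution (
            task_id         INTEGER PRIMARY KEY,
            finished_at     REAL NOT NULL,   -- seconds since workflow start
            elapsed_seconds REAL NOT NULL    -- performance data for the task
        );
        CREATE TABLE IF NOT EXISTS steering_action (
            action_id  INTEGER PRIMARY KEY AUTOINCREMENT,
            issued_at  REAL NOT NULL,        -- when the user tuned the parameter
            user_name  TEXT NOT NULL,
            parameter  TEXT NOT NULL,
            old_value  TEXT,
            new_value  TEXT NOT NULL
        );
        """
    )
    return conn


def record_task(conn, task_id, finished_at, elapsed_seconds):
    """Register execution and performance data for one finished task."""
    conn.execute(
        "INSERT INTO task_execution (task_id, finished_at, elapsed_seconds) "
        "VALUES (?, ?, ?)",
        (task_id, finished_at, elapsed_seconds),
    )
    conn.commit()


def record_steering_action(conn, issued_at, user_name, parameter,
                           old_value, new_value):
    """Register one parameter tuning so it is not lost during a long run."""
    cur = conn.execute(
        "INSERT INTO steering_action "
        "(issued_at, user_name, parameter, old_value, new_value) "
        "VALUES (?, ?, ?, ?, ?)",
        (issued_at, user_name, parameter, old_value, new_value),
    )
    conn.commit()
    return cur.lastrowid


def mean_elapsed_around(conn, action_id):
    """Return (mean before, mean after) task time relative to one action."""
    (issued_at,) = conn.execute(
        "SELECT issued_at FROM steering_action WHERE action_id = ?",
        (action_id,),
    ).fetchone()
    (before,) = conn.execute(
        "SELECT AVG(elapsed_seconds) FROM task_execution WHERE finished_at < ?",
        (issued_at,),
    ).fetchone()
    (after,) = conn.execute(
        "SELECT AVG(elapsed_seconds) FROM task_execution WHERE finished_at >= ?",
        (issued_at,),
    ).fetchone()
    return before, after


# Made-up data: three tasks, one tuning at t=3.5, then two more tasks.
conn = open_provenance_db()
for task_id, t in enumerate([1.0, 2.0, 3.0], start=1):
    record_task(conn, task_id, finished_at=t, elapsed_seconds=10.0)
action_id = record_steering_action(
    conn, 3.5, "alice", "solver_tolerance", "1e-6", "1e-4"
)
for task_id, t in enumerate([4.0, 5.0], start=4):
    record_task(conn, task_id, finished_at=t, elapsed_seconds=6.0)

before, after = mean_elapsed_around(conn, action_id)
```

Because each steering action is stored with a timestamp alongside the task records, a query issued while the workflow is still running can compare performance before and after a given tuning. The parameter name and values in the usage lines are invented for the example.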

