Keeping Track of User Steering Actions in Dynamic Workflows

by   Renan Souza, et al.

In long-lasting scientific workflow executions in HPC machines, computational scientists (the users in this work) often need to fine-tune several workflow parameters. These tunings are done through user steering actions that may significantly improve performance (e.g., reduce execution time) or improve the overall results. However, in executions that last for weeks, users can lose track of what has been adapted if the tunings are not properly registered. In this work, we build on provenance data management to address the problem of tracking online parameter fine-tuning in dynamic workflows steered by users. We propose a lightweight solution to capture and manage provenance of the steering actions online with negligible overhead. The resulting provenance database relates tuning data with data for domain, dataflow provenance, execution, and performance, and is available for analysis at runtime. We show how users may get a detailed view of the execution, providing insights to determine when and how to tune. We discuss the applicability of our solution in different domains and validate its ability to allow for online capture and analyses of parameter fine-tunings in a real workflow in the Oil and Gas industry. In this experiment, the user could determine which tuned parameters influenced simulation accuracy and performance. The observed overhead for keeping track of user steering actions at runtime is less than 1



There are no comments yet.



Distributed In-memory Data Management for Workflow Executions

Complex scientific experiments from various domains are typically modele...

Workflow Provenance in the Lifecycle of Scientific Machine Learning

Machine Learning (ML) has already fundamentally changed several business...

Task Runtime Prediction in Scientific Workflows Using an Online Incremental Learning Approach

Many algorithms in workflow scheduling and resource provisioning rely on...

Fine-Grained Lineage for Safer Notebook Interactions

Computational notebooks have emerged as the platform of choice for data ...

Adaptive Scheduling for Efficient Execution of Dynamic Stream Workflows

Stream workflow application such as online anomaly detection or online t...

SODA: A Semantics-Aware Optimization Framework for Data-Intensive Applications Using Hybrid Program Analysis

In the era of data explosion, a growing number of data-intensive computi...

Guided Fine-Tuning for Large-Scale Material Transfer

We present a method to transfer the appearance of one or a few exemplar ...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.