Applying Process Mining on Scientific Workflows: a Case Study

07/06/2023
by Zahra Sadeghibogar, et al.

Computer-based scientific experiments are becoming increasingly data-intensive. High-Performance Computing (HPC) clusters are ideal for executing large scientific experiment workflows. Executing large scientific workflows in an HPC cluster leads to complex flows of data and control within the system, which are difficult to analyze. This paper presents a case study where process mining is applied to logs extracted from SLURM-based HPC clusters, in order to document the running workflows and find the performance bottlenecks. The challenge lies in correlating the jobs recorded in the system to enable the application of mainstream process mining techniques. Users may submit jobs with explicit or implicit interdependencies, leading to the consideration of different event correlation techniques. We present a log extraction technique from SLURM clusters, completed with an experimental.
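
To illustrate the correlation challenge named in the abstract, here is a minimal sketch of one possible approach for the explicit-dependency case: jobs linked by declared dependencies are grouped into the same case, so each case can be read as one workflow run. This is an assumption made for illustration, not the paper's extraction technique. The record keys (`job_id`, `name`, `start`, `depends_on`) and the function name `correlate_jobs` are hypothetical and do not correspond to actual SLURM log fields.

```python
from collections import defaultdict


def correlate_jobs(jobs):
    """Group jobs into cases: jobs connected by explicit dependencies share a case.

    `jobs` is a list of dicts with hypothetical keys:
    'job_id', 'name', 'start', 'depends_on' (list of job ids).
    Returns a dict mapping a case id (smallest job id in the group)
    to the group's jobs, ordered by start time.
    """
    # Union-find structure: every job starts as its own case.
    parent = {job["job_id"]: job["job_id"] for job in jobs}

    def find(x):
        # Follow parent links to the root, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # The smaller id becomes the root, so case ids are deterministic.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    for job in jobs:
        for dep in job.get("depends_on", []):
            if dep in parent:  # skip dependencies on jobs missing from the log
                union(job["job_id"], dep)

    cases = defaultdict(list)
    for job in jobs:
        cases[find(job["job_id"])].append(job)
    for events in cases.values():
        events.sort(key=lambda j: j["start"])
    return dict(cases)


# Illustrative records only (not real cluster data).
jobs = [
    {"job_id": 1, "name": "preprocess", "start": 0, "depends_on": []},
    {"job_id": 2, "name": "simulate", "start": 6, "depends_on": [1]},
    {"job_id": 3, "name": "analyze", "start": 21, "depends_on": [2]},
    {"job_id": 4, "name": "unrelated", "start": 3, "depends_on": []},
]
cases = correlate_jobs(jobs)
```

In this sketch, jobs 1 to 3 end up in one case and job 4 in its own. Implicit interdependencies (for example, jobs related only through shared files or submission timing) would need a different correlation rule, which this example does not attempt.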


