Selecting Optimal Trace Clustering Pipelines with AutoML

09/01/2021
by Sylvio Barbon Jr., et al.

Trace clustering has been extensively used to preprocess event logs. By grouping similar behavior, these techniques guide the identification of sub-logs, producing more understandable models and conformance analytics. Nevertheless, little attention has been paid to the relationship between event log properties and clustering quality. In this work, we propose an Automatic Machine Learning (AutoML) framework that recommends the most suitable trace clustering pipeline for a given event log, where a pipeline comprises the encoding method, the clustering algorithm, and its hyperparameters. Our experiments were conducted using a thousand event logs, four encoding techniques, and three clustering methods. Results indicate that our framework sheds light on the trace clustering problem and can assist users in choosing the best pipeline for their scenario.
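To make the recommendation idea concrete, here is a minimal sketch, not the authors' implementation. It assumes an event log is a list of traces (each a list of activity names), describes the log with three illustrative meta-features, and recommends the pipeline attached to the most similar previously evaluated log (1-nearest-neighbor). The function names, the meta-features, and the pipeline labels (`encoding_A`, `clusterer_X`, and so on) are all hypothetical placeholders; the abstract does not specify any of them.

```python
from math import sqrt


def meta_features(log):
    """Summarize an event log (list of traces) as a tuple of numbers.

    These three descriptors are illustrative assumptions only.
    """
    n_traces = len(log)
    mean_length = sum(len(trace) for trace in log) / n_traces
    variant_ratio = len({tuple(trace) for trace in log}) / n_traces
    n_activities = float(len({act for trace in log for act in trace}))
    return (mean_length, variant_ratio, n_activities)


def recommend(log, meta_db):
    """Return the pipeline of the nearest known log in meta-feature space.

    meta_db is a list of (meta_feature_tuple, pipeline) pairs, where a
    pipeline is (encoding, clustering_algorithm, hyperparameters).
    """
    target = meta_features(log)

    def distance(features):
        # Plain Euclidean distance; features are not rescaled here.
        return sqrt(sum((x - y) ** 2 for x, y in zip(features, target)))

    _, pipeline = min(meta_db, key=lambda entry: distance(entry[0]))
    return pipeline


# Hypothetical meta-database built from previously evaluated logs.
meta_db = [
    ((2.5, 0.7, 3.0), ("encoding_A", "clusterer_X", {"k": 2})),
    ((20.0, 0.1, 15.0), ("encoding_B", "clusterer_Y", {"k": 5})),
]
log = [["a", "b", "c"], ["a", "b", "c"], ["a", "c"]]
print(recommend(log, meta_db))
```

A practical recommender would likely rescale the meta-features and use a stronger learner than 1-nearest-neighbor; the sketch only shows the mapping from log properties to a pipeline choice.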

