An Entropic Relevance Measure for Stochastic Conformance Checking in Process Mining

by Artem Polyvyanyy, et al.

Given an event log as a collection of recorded real-world process traces, process mining aims to automatically construct a process model that is both simple and provides a useful explanation of the traces. Conformance checking techniques are then employed to characterize and quantify commonalities and discrepancies between the log's traces and the candidate models. Recent approaches to conformance checking acknowledge that the elements being compared are inherently stochastic - for example, some traces occur frequently and others infrequently - and seek to incorporate this knowledge in their analyses. Here we present an entropic relevance measure for stochastic conformance checking, computed as the average number of bits required to compress each of the log's traces, based on the structure and information about relative likelihoods provided by the model. The measure penalizes traces from the event log not captured by the model and traces described by the model but absent in the event log, thus addressing both precision and recall quality criteria at the same time. We further show that entropic relevance is computable in time linear in the size of the log, and provide evaluation outcomes that demonstrate the feasibility of using the new approach in industrial settings.
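
To make the "average number of bits per trace" idea concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation. The function names and the representation of the model as a plain dictionary from traces to probabilities are hypothetical. The sketch assumes that a trace the model assigns probability p > 0 costs -log2(p) bits, that a trace the model cannot produce falls back to a uniform code over the log's activity labels plus an end-of-trace symbol, and that a binary-entropy term pays for signalling which of the two codes applies.

```python
import math
from collections import Counter


def binary_entropy_bits(x):
    """Entropy in bits of a two-outcome choice with probability x."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def entropic_relevance_sketch(log, model_probs):
    """Average bits per trace needed to encode `log` using `model_probs`.

    log         -- list of traces, each a tuple of activity labels
    model_probs -- dict mapping a trace tuple to its model probability
                   (hypothetical stand-in for a stochastic process model)
    """
    counts = Counter(log)
    total = sum(counts.values())

    # Assumption: the fallback code spends log2(|alphabet| + 1) bits per
    # symbol, where the extra symbol marks the end of the trace.
    alphabet = {activity for trace in counts for activity in trace}
    symbol_bits = math.log2(len(alphabet) + 1)

    covered = 0   # number of log traces the model can produce
    bits = 0.0    # total bits over all traces in the log
    for trace, freq in counts.items():
        p = model_probs.get(trace, 0.0)
        if p > 0.0:
            covered += freq
            bits += freq * -math.log2(p)
        else:
            bits += freq * (len(trace) + 1) * symbol_bits

    # Cost of flagging, per trace, whether the model code or the
    # fallback code is used, plus the average coding cost.
    return binary_entropy_bits(covered / total) + bits / total


if __name__ == "__main__":
    log = [("a", "b")] * 3 + [("a", "c")]
    print(entropic_relevance_sketch(log, {("a", "b"): 0.5, ("a", "c"): 0.5}))
    print(entropic_relevance_sketch(log, {("a", "b"): 1.0}))
```

In this sketch a single pass over the distinct traces suffices, which matches the linear-time claim in spirit. Both quality criteria show up in the cost: a log trace the model cannot produce pays the expensive fallback code, while a model that spreads probability over traces absent from the log assigns less probability to the traces that do occur, so each of them costs more bits.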


