Adversarial System Variant Approximation to Quantify Process Model Generalization
In process mining, process models are extracted from event logs using process discovery algorithms and are commonly assessed along multiple quality dimensions. While the metrics that measure how well an extracted process model relates to its event log are well studied, the question of how well a process model describes the unobserved behavior of its underlying system has received little attention in the literature. In this paper, a novel deep learning-based methodology called Adversarial System Variant Approximation (AVATAR) is proposed to address this gap. Sequence Generative Adversarial Networks are trained on the variants contained in an event log to approximate the underlying variant distribution of the system behavior. Unobserved realistic variants are sampled either directly from the Sequence Generative Adversarial Network or by leveraging the Metropolis-Hastings algorithm. The degree to which a process model relates to its underlying unknown system behavior is then quantified based on the realistic observed and estimated unobserved variants using established process model quality metrics. Significant performance improvements in revealing realistic unobserved variants are demonstrated in a controlled experiment on 15 ground truth systems. Additionally, the proposed methodology is experimentally tested and evaluated to quantify the generalization of 60 discovered process models with respect to their systems.
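As a rough illustration of the Metropolis-Hastings sampling step, the following minimal sketch filters generator proposals using discriminator scores. The abstract does not state the acceptance rule, so this assumes the discriminator-based rule commonly used for Metropolis-Hastings sampling from GANs. The names `acceptance_probability`, `mh_select`, and `toy_scores`, and the toy variants, are hypothetical and not taken from the paper.

```python
import random

EPS = 1e-6  # keeps scores away from 0 and 1 to avoid division by zero


def acceptance_probability(d_current: float, d_proposal: float) -> float:
    """Probability of moving from the current variant to the proposal.

    Assumed rule: min(1, (1/D(current) - 1) / (1/D(proposal) - 1)),
    where D(.) is a discriminator score in (0, 1).
    """
    d_current = min(max(d_current, EPS), 1.0 - EPS)
    d_proposal = min(max(d_proposal, EPS), 1.0 - EPS)
    ratio = (1.0 / d_current - 1.0) / (1.0 / d_proposal - 1.0)
    return min(1.0, ratio)


def mh_select(proposals, score, start, rng):
    """Run one independence-sampler chain over generator proposals.

    proposals: iterable of variants (tuples of activity labels)
    score:     hypothetical callable returning a discriminator score
    start:     initial variant, e.g. one observed in the event log
    rng:       random.Random instance for reproducibility
    """
    current = start
    for candidate in proposals:
        alpha = acceptance_probability(score(current), score(candidate))
        if rng.random() < alpha:
            current = candidate  # accept the proposal
    return current


# Toy stand-ins for a trained generator and discriminator (assumptions).
toy_scores = {
    ("a", "b", "c"): 0.2,
    ("a", "c", "b"): 0.5,
    ("a", "b", "b", "c"): 0.9,
}
selected = mh_select(
    proposals=[("a", "c", "b"), ("a", "b", "b", "c")],
    score=toy_scores.get,
    start=("a", "b", "c"),
    rng=random.Random(0),
)
```

Under this assumed rule, a proposal that the discriminator scores higher than the current variant is always accepted, while lower-scored proposals are accepted only occasionally, which biases the retained samples toward variants the discriminator considers realistic.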