Keeping it Short and Simple: Summarising Complex Event Sequences with Multivariate Patterns

by Roel Bertens, et al.
Max Planck Society
Utrecht University

We study how to obtain concise descriptions of discrete multivariate sequential data. In particular, we study how to do so in terms of rich multivariate sequential patterns that can capture potentially highly interesting (cor)relations between sequences. To this end, we allow our pattern language to span the domains (alphabets) of all sequences, allow patterns to overlap temporally, and allow for gaps in their occurrences. We formalise our goal by the Minimum Description Length principle, by which our objective is to discover the set of patterns that provides the most succinct description of the data. To discover high-quality pattern sets directly from data, we introduce DITTO, a highly efficient algorithm that approximates the ideal result very well. Experiments show that DITTO correctly discovers the patterns planted in synthetic data. Moreover, it scales favourably with the length of the data, the number of attributes, and the alphabet sizes. On real data, ranging from sensor networks to annotated text, DITTO discovers easily interpretable summaries that provide clear insight into both the univariate and multivariate structure.
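
To make the Minimum Description Length objective concrete, the sketch below scores a pattern set by a two-part code length: the cost of writing down the patterns plus the cost of the data once it is covered by them. It is a deliberately simplified illustration and not the DITTO encoding: it assumes a single (univariate) sequence, patterns without gaps or overlap, a greedy longest-first cover, and Shannon-optimal code lengths. The helper names `code_lengths`, `cover`, and `total_length` are hypothetical.

```python
from collections import Counter
from math import log2


def code_lengths(usages):
    """Shannon-optimal code length (in bits) for each token, from its usage count."""
    total = sum(usages.values())
    return {tok: -log2(cnt / total) for tok, cnt in usages.items() if cnt > 0}


def cover(seq, patterns):
    """Greedy left-to-right cover: longest matching pattern first, else one symbol.

    Assumption: no gaps and no overlap, unlike the richer patterns in the paper.
    """
    ordered = sorted(patterns, key=len, reverse=True)
    tokens, i = [], 0
    while i < len(seq):
        for p in ordered:
            if seq.startswith(p, i):
                tokens.append(p)
                i += len(p)
                break
        else:  # no pattern matched at position i: fall back to the single symbol
            tokens.append(seq[i])
            i += 1
    return tokens


def total_length(seq, patterns):
    """Two-part length: cost of the model plus cost of the data given the model."""
    base = code_lengths(Counter(seq))          # per-symbol baseline codes
    tokens = cover(seq, patterns)
    codes = code_lengths(Counter(tokens))      # codes derived from cover usage
    data_cost = sum(codes[t] for t in tokens)
    # Each pattern is spelled out once in the model using the baseline codes.
    model_cost = sum(base[ch] for p in patterns for ch in p)
    return model_cost + data_cost


seq = "abc" * 20 + "abd"
without_patterns = total_length(seq, [])
with_pattern = total_length(seq, ["abc"])
print(f"no patterns: {without_patterns:.1f} bits, with 'abc': {with_pattern:.1f} bits")
```

On this toy sequence the pattern set `["abc"]` gives a shorter total description than the empty set, which is the sense in which a pattern set "provides the most succinct description of the data". A search procedure would compare many candidate sets under such a score.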
