Keeping it Short and Simple: Summarising Complex Event Sequences with Multivariate Patterns

12/22/2015
by Roel Bertens, et al.

We study how to obtain concise descriptions of discrete multivariate sequential data. In particular, we study how to do so in terms of rich multivariate sequential patterns that can capture potentially highly interesting (cor)relations between sequences. To this end we allow our pattern language to span the domains (alphabets) of all sequences, allow patterns to overlap temporally, and allow for gaps in their occurrences. We formalise our goal by the Minimum Description Length principle, by which our objective is to discover the set of patterns that provides the most succinct description of the data. To discover high-quality pattern sets directly from data, we introduce DITTO, a highly efficient algorithm that approximates the ideal result very well. Experiments show that DITTO correctly discovers the patterns planted in synthetic data. Moreover, it scales favourably with the length of the data, the number of attributes, and the alphabet sizes. On real data, ranging from sensor networks to annotated text, DITTO discovers easily interpretable summaries that provide clear insight into both the univariate and multivariate structure.
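
To make the Minimum Description Length objective concrete, here is a minimal sketch of a two-part description length on a toy univariate sequence. It is not DITTO's encoding: it ignores multiple attributes, gaps, and overlapping patterns, and the function names `cover` and `total_length`, the greedy longest-first cover, and the model cost are all assumptions made for illustration only.

```python
from collections import Counter
from math import log2


def cover(sequence, patterns):
    """Greedily cover the sequence left to right, preferring longer patterns."""
    ordered = sorted(patterns, key=len, reverse=True)
    used, i = [], 0
    while i < len(sequence):
        for p in ordered:
            if sequence.startswith(p, i):
                used.append(p)
                i += len(p)
                break
        else:
            used.append(sequence[i])  # no pattern fits: fall back to one symbol
            i += 1
    return used


def total_length(sequence, patterns):
    """Two-part description length in bits: L(model) + L(data | model)."""
    n = len(sequence)
    # Cost of each symbol under a baseline code built from symbol frequencies.
    base_bits = {s: -log2(c / n) for s, c in Counter(sequence).items()}
    usage = Counter(cover(sequence, patterns))
    total = sum(usage.values())
    # Each use of a code costs -log2 of its relative usage (Shannon code length).
    data_bits = sum(u * -log2(u / total) for u in usage.values())
    # Assumed model cost: spell out every used multi-symbol pattern once.
    model_bits = sum(base_bits[s] for p in usage if len(p) > 1 for s in p)
    return model_bits + data_bits


seq = "abc" * 8
without_pattern = total_length(seq, [])
with_pattern = total_length(seq, ["abc"])
```

On this toy input the pattern set containing `"abc"` gives a shorter total description than using single symbols only, which is the sense in which a pattern set is preferred under this kind of objective.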
