A Subsequence Interleaving Model for Sequential Pattern Mining

02/16/2016
by Jaroslav Fowkes, et al.

Recent sequential pattern mining methods have used the minimum description length (MDL) principle to define an encoding scheme which describes an algorithm for mining the most compressing patterns in a database. We present a novel subsequence interleaving model based on a probabilistic model of the sequence database, which allows us to search for the most compressing set of patterns without designing a specific encoding scheme. Our proposed algorithm efficiently mines the most relevant sequential patterns and ranks them using an associated measure of interestingness. Efficient inference in our model is a direct result of our use of a structural expectation-maximization framework, in which the expectation step takes the form of a submodular optimization problem subject to a coverage constraint. We show on both synthetic and real-world datasets that our model mines a set of sequential patterns with low spuriousness and redundancy, and with high interpretability and usefulness in real-world applications. Furthermore, we demonstrate that the quality of the patterns from our approach is comparable to, if not better than, that of existing state-of-the-art sequential pattern mining algorithms.
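
To make the phrase "submodular optimization problem subject to a coverage constraint" more concrete, the sketch below shows a generic greedy heuristic for a weighted covering problem in Python. It is an illustration only and not the authors' algorithm: the function name `greedy_cover`, the representation of each candidate pattern as a set of covered positions, and the numeric costs are all assumptions made for this example.

```python
from typing import Dict, List, Set, Tuple


def greedy_cover(
    universe: Set[int],
    candidates: Dict[str, Tuple[Set[int], float]],
) -> List[str]:
    """Greedily pick candidates until every element of `universe` is covered.

    Each candidate maps a (hypothetical) pattern name to the set of positions
    it covers and a positive cost. At each step the candidate with the lowest
    cost per newly covered position is chosen.
    """
    uncovered = set(universe)
    chosen: List[str] = []
    while uncovered:
        best_name = None
        best_ratio = float("inf")
        for name, (covers, cost) in candidates.items():
            gain = len(covers & uncovered)  # positions this candidate would newly cover
            if gain == 0:
                continue
            ratio = cost / gain
            if ratio < best_ratio:
                best_name, best_ratio = name, ratio
        if best_name is None:
            raise ValueError("candidates cannot cover the universe")
        chosen.append(best_name)
        uncovered -= candidates[best_name][0]
    return chosen


# Hypothetical toy instance: positions 0..3 of one sequence, four candidate patterns.
toy_candidates = {
    "ab": ({0, 1}, 1.0),
    "cd": ({2, 3}, 1.0),
    "a": ({0}, 0.9),
    "abcd": ({0, 1, 2, 3}, 5.0),
}
selection = greedy_cover({0, 1, 2, 3}, toy_candidates)
```

On this toy instance the heuristic selects the two cheap two-position patterns rather than the single expensive pattern that covers everything. Greedy selection of this kind is a standard approximation for covering problems with submodular structure; whether and how the paper's expectation step uses it should be checked against the full text.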


research
12/12/2017

Mining Non-Redundant Sets of Generalizing Patterns from Sequence Databases

Sequential pattern mining techniques extract patterns corresponding to f...
research
06/26/2015

Skopus: Exact discovery of the most interesting sequential patterns under Leverage

This paper presents a framework for exact discovery of the most interest...
research
10/14/2015

A Bayesian Network Model for Interesting Itemsets

Mining itemsets that are the most interesting under a statistical model ...
research
11/21/2019

Vouw: Geometric Pattern Mining using the MDL Principle

We introduce geometric pattern mining, the problem of finding recurring ...
research
01/23/2022

Dichotomic Pattern Mining with Applications to Intent Prediction from Semi-Structured Clickstream Datasets

We introduce a pattern mining framework that operates on semi-structured...
research
09/20/2023

A survey on the semantics of sequential patterns with negation

A sequential pattern with negation, or negative sequential pattern, take...
research
10/18/2021

Label-Descriptive Patterns and their Application to Characterizing Classification Errors

State-of-the-art deep learning methods achieve human-like performance on...
