Free-rider Episode Screening via Dual Partition Model

05/19/2018
by Xiang Ao, et al.

One drawback of frequent episode mining is that an overwhelming number of the discovered patterns are redundant. A free-rider episode, a typical example, consists of a real pattern doped with additional noise events. Because the noise events themselves may have high support, such free-rider episodes can have abnormally high support and so cannot be filtered out by a frequency-based framework. An effective technique for filtering free-rider episodes is to use a partition model that divides an episode into two consecutive subepisodes and compares the observed support of the episode with its expected support under the assumption that the two subepisodes occur independently. In this paper, we take more complex subepisodes into consideration and develop a novel partition model named EDP for filtering free-rider episodes from a given set of episodes. It combines (1) a dual partition strategy that divides an episode into an underlying real pattern and potential noise, and (2) a novel definition of the expected support of a free-rider episode based on the proposed partition strategy. We deem an episode interesting if its observed support is substantially higher than the expected support estimated by our model. Experiments on synthetic and real-world datasets demonstrate that EDP filters free-rider episodes effectively compared with existing state-of-the-art methods.
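
The baseline partition idea mentioned in the abstract (split an episode into two consecutive subepisodes and compare observed support with the support expected under independence) can be sketched roughly as follows. This is not EDP: the dual partition strategy and its expected-support definition are not given in the abstract. The sketch assumes serial episodes, a sliding-window notion of support, and a plain product of subepisode supports as the expected support; the function names `contains_serial`, `support`, and `min_lift` are hypothetical.

```python
from typing import Sequence, Tuple


def contains_serial(window: Sequence[str], episode: Tuple[str, ...]) -> bool:
    """True if the episode's events occur in the window in order (as a subsequence)."""
    it = iter(window)
    # Each `any` consumes the iterator up to the next match, which enforces order.
    return all(any(e == x for x in it) for e in episode)


def support(seq: Sequence[str], episode: Tuple[str, ...], width: int) -> float:
    """Assumed support: fraction of sliding windows of `width` containing the episode."""
    n = len(seq) - width + 1
    if n <= 0:
        return 0.0
    hits = sum(contains_serial(seq[i:i + width], episode) for i in range(n))
    return hits / n


def min_lift(seq: Sequence[str], episode: Tuple[str, ...], width: int) -> float:
    """Smallest observed/expected ratio over all splits into two consecutive subepisodes.

    Expected support is assumed to be the product of the two subepisode supports
    (independence). A ratio near 1 for some split hints at a free-rider episode.
    """
    observed = support(seq, episode, width)
    if observed == 0.0:
        return 0.0
    lifts = []
    for k in range(1, len(episode)):
        expected = support(seq, episode[:k], width) * support(seq, episode[k:], width)
        if expected > 0:
            lifts.append(observed / expected)
    # A single-event episode has no split, so nothing argues against it.
    return min(lifts) if lifts else float("inf")
```

Under these assumptions, an episode would be kept only if `min_lift` is substantially greater than 1, meaning that no two-way split explains its observed support by independent subepisodes.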

research
02/04/2019

Ranking Episodes using a Partition Model

One of the biggest setbacks in traditional frequent pattern mining is th...
research
06/26/2015

Skopus: Exact discovery of the most interesting sequential patterns under Leverage

This paper presents a framework for exact discovery of the most interest...
research
04/16/2019

Mining Closed Episodes with Simultaneous Events

Sequential pattern discovery is a well-studied field in data mining. Epi...
research
12/19/2019

FIBS: A Generic Framework for Classifying Interval-based Temporal Sequences

We study the problem of classification of interval-based temporal sequen...
research
07/13/2022

Multiple Kernel Clustering with Dual Noise Minimization

Clustering is a representative unsupervised method widely applied in mul...
research
04/15/2019

Discovering Episodes with Compact Minimal Windows

Discovering the most interesting patterns is the key problem in the fiel...
research
05/23/2016

Stochastic Patching Process

Stochastic partition models tailor a product space into a number of rect...