THUE: Discovering Top-K High Utility Episodes

06/28/2021
by Shicheng Wan, et al.

Episode discovery from an event sequence is a popular framework for data mining tasks and has many real-world applications. An episode is a partially ordered set of objects (e.g., items, nodes), each associated with an event type; an episode can also be viewed as a complex event sub-sequence. High-utility episode mining is a utility-driven mining task of practical interest. Traditional episode mining algorithms, which rely on a user-set threshold, usually return a huge number of episodes, which is neither intuitive nor time-saving. In general, finding a suitable threshold for a pattern-mining algorithm is a non-trivial and time-consuming task. In this paper, we propose a novel algorithm, called Top-K High Utility Episode (THUE) mining, for complex event sequences, which redefines the previous mining task as obtaining the K highest-utility episodes. We introduce several threshold-raising strategies and optimize the episode-weighted utilization upper bounds to speed up the mining process and effectively reduce the memory cost. Finally, the experimental results on both real-life and synthetic datasets reveal that the THUE algorithm can offer six to eight orders of magnitude running time performance improvement over the state-of-the-art algorithm and has low memory consumption.
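The general idea behind top-K mining with a rising threshold can be illustrated with a minimal sketch. It is not the THUE algorithm itself: the function name `top_k_by_utility`, the candidate episodes, and their utility values are all hypothetical, and the sketch assumes utilities are already computed. It only shows how keeping the K best candidates in a min-heap lets an internal minimum-utility threshold rise automatically instead of being set by the user.

```python
import heapq


def top_k_by_utility(candidates, k):
    """Return the k highest-utility (utility, name) pairs, best first.

    `candidates` is an iterable of (name, utility) pairs in any order.
    This is an illustrative helper, not part of the THUE paper.
    """
    heap = []      # min-heap of (utility, name); heap[0] is the weakest kept
    min_util = 0   # internal threshold, raised as better candidates arrive
    for name, utility in candidates:
        if len(heap) == k and utility <= min_util:
            continue  # cannot enter the current top k, so skip it
        heapq.heappush(heap, (utility, name))
        if len(heap) > k:
            heapq.heappop(heap)  # drop the weakest of the k + 1 entries
        if len(heap) == k:
            min_util = heap[0][0]  # raise the threshold to the k-th best
    return sorted(heap, reverse=True)


# Hypothetical episodes with made-up utilities.
episodes = [("<A>", 5), ("<A,B>", 12), ("<B>", 3), ("<B,C>", 9), ("<C>", 7)]
print(top_k_by_utility(episodes, 3))
```

In a full miner, the rising threshold would presumably also be compared against an upper bound on each candidate's utility (such as the episode-weighted utilization mentioned above) so that whole branches of the search space can be skipped; that pruning step is omitted here.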


research
12/25/2019

Discovering High Utility Episodes in Sequences

Sequence data, e.g., complex event sequence, is more commonly seen than ...
research
08/27/2022

A Generic Algorithm for Top-K On-Shelf Utility Mining

On-shelf utility mining (OSUM) is an emerging research direction in data...
research
02/06/2022

Memory Efficient Tries for Sequential Pattern Mining

The rapid and continuous growth of data has increased the need for scala...
research
05/17/2019

Reference-Based Sequence Classification

Sequence classification is an important data mining task in many real wo...
research
10/30/2021

Utility-driven Mining of Contiguous Sequences

Recently, contiguous sequential pattern mining (CSPM) gained interest as...
research
11/26/2020

TKUS: Mining Top-K High-Utility Sequential Patterns

High-utility sequential pattern mining (HUSPM) has recently emerged as a...
research
06/28/2021

TOPIC: Top-k High-Utility Itemset Discovering

Utility-driven itemset mining is widely applied in many real-world scena...
