Towards Top-K Non-Overlapping Sequential Patterns

04/24/2023
by Zefeng Chen, et al.

Sequential pattern mining (SPM) has broad application prospects and is widely used in many fields. Non-overlapping SPM is a data mining technique for discovering patterns that must satisfy gap constraints, as required in specific mining tasks such as bio-data mining. For non-overlapping sequential patterns with gap constraints, the Nettree structure has been proposed to compute pattern support efficiently. In pattern mining, users usually have to choose a minimum support threshold (minsup), which is especially difficult for large databases. Although some existing algorithms can mine top-k patterns, they are approximate algorithms limited to fixed pattern lengths. In this paper, we propose a precise algorithm for mining Top-k Non-Overlapping Sequential Patterns (TNOSP). The top-k formulation of SPM is an effective way to discover the most frequent non-overlapping sequential patterns without setting minsup. As a novel pattern mining algorithm, TNOSP precisely finds the top-k non-overlapping sequential patterns under different gap constraints. We further propose a pruning strategy named Queue Meta Set Pruning (QMSP) to improve TNOSP's performance. TNOSP reduces redundancy in non-overlapping sequential mining and performs better when mining precise non-overlapping sequential patterns. Experimental results and comparisons on several datasets show that TNOSP outperforms existing algorithms in terms of precision, efficiency, and scalability.
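To make the top-k idea concrete, the following is a minimal Python sketch, not the TNOSP algorithm itself. It assumes a single input sequence, a gap constraint given as the minimum and maximum number of characters allowed between consecutive pattern items, and the usual non-overlapping condition that two occurrences may not use the same sequence position for the same pattern index. The names `greedy_support` and `top_k_patterns` are hypothetical. The support routine is a simple greedy search with backtracking that yields a lower bound on the support; it stands in for the Nettree-based computation and does not reproduce it. Candidates are enumerated by brute force up to an assumed `max_len`, with no QMSP-style pruning.

```python
import heapq
from itertools import product


def greedy_support(seq, pattern, min_gap, max_gap):
    """Greedily count non-overlapping occurrences of pattern in seq.

    Assumption: this is a lower bound on the true support, not the
    Nettree-based computation described in the abstract.
    """
    m = len(pattern)
    if m == 0:
        return 0
    # used[j] holds sequence positions already matched to pattern index j.
    used = [set() for _ in range(m)]

    def extend(j, prev):
        # Place pattern[j] after position prev; return positions or None.
        if j == m:
            return []
        if j == 0:
            candidates = range(len(seq))
        else:
            # Number of skipped characters must lie in [min_gap, max_gap].
            lo = prev + 1 + min_gap
            hi = min(len(seq), prev + max_gap + 2)
            candidates = range(lo, hi)
        for pos in candidates:
            if seq[pos] == pattern[j] and pos not in used[j]:
                rest = extend(j + 1, pos)
                if rest is not None:
                    return [pos] + rest
        return None

    count = 0
    while True:
        occurrence = extend(0, -1)
        if occurrence is None:
            return count
        for j, pos in enumerate(occurrence):
            used[j].add(pos)
        count += 1


def top_k_patterns(seq, k, max_len, min_gap, max_gap):
    """Return the k patterns with the highest (greedy) support.

    A min-heap of size k replaces a user-chosen minsup: its smallest
    entry acts as a threshold that rises as better patterns are found.
    """
    alphabet = sorted(set(seq))
    heap = []  # entries are (support, pattern); heap[0] is the k-th best
    for length in range(1, max_len + 1):
        for chars in product(alphabet, repeat=length):
            pattern = "".join(chars)
            sup = greedy_support(seq, pattern, min_gap, max_gap)
            if len(heap) < k:
                heapq.heappush(heap, (sup, pattern))
            elif sup > heap[0][0]:
                heapq.heapreplace(heap, (sup, pattern))
    # Highest support first; ties broken alphabetically.
    return sorted(heap, key=lambda entry: (-entry[0], entry[1]))


if __name__ == "__main__":
    for support, pattern in top_k_patterns("abab", k=3, max_len=2,
                                           min_gap=0, max_gap=1):
        print(pattern, support)
```

The heap is the part that corresponds to the abstract's point about avoiding minsup: the caller supplies only `k`, and the effective support threshold is whatever the k-th best pattern currently has. A practical miner would additionally prune candidates whose support cannot exceed that threshold instead of enumerating every pattern.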

