HUSP-SP: Faster Utility Mining on Sequence Data

12/29/2022
by Chunkai Zhang, et al.

High-utility sequential pattern mining (HUSPM) has emerged as an important topic owing to its wide range of applications. However, the search space explodes combinatorially when the utility threshold is low or the data are large-scale, so solving the HUSPM problem can be costly in both time and memory. Several algorithms have been proposed for this problem, but they remain expensive in terms of running time and memory usage. In this paper, to address the problem more efficiently, we design a compact structure called sequence projection (seqPro) and propose an efficient algorithm, namely discovering high-utility sequential patterns with the seqPro structure (HUSP-SP). HUSP-SP utilizes the compact seq-array to store the necessary information in a sequence database. The seqPro structure is designed to efficiently calculate candidate patterns' utilities and upper bound values. Furthermore, a new upper bound on utility, namely tighter reduced sequence utility (TRSU), and two search-space pruning strategies are utilized to improve the mining performance of HUSP-SP. Experimental results on both synthetic and real-life datasets show that HUSP-SP significantly outperforms the state-of-the-art algorithms in terms of running time, memory usage, search space pruning efficiency, and scalability.
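The abstract does not spell out what a pattern's utility is, so the following is a minimal sketch of the quantity that HUSPM algorithms compute, under the commonly used definition: a pattern's utility in one sequence is the maximum total utility over all of its occurrences, and its utility in the database is the sum over the sequences that contain it. This is a brute-force illustration only, not HUSP-SP itself; it uses neither seqPro, the seq-array, nor TRSU. All names (`utility_in_sequence`, `pattern_utility`, `is_high_utility`) and the toy data are hypothetical, and each item value is assumed to already be quantity times unit profit.

```python
from typing import Dict, FrozenSet, List, Optional

QItemset = Dict[str, int]        # item -> utility (assumed: quantity * unit profit)
QSequence = List[QItemset]       # ordered list of itemsets
Pattern = List[FrozenSet[str]]   # ordered list of item sets, without utilities


def utility_in_sequence(pattern: Pattern, seq: QSequence) -> Optional[int]:
    """Maximum utility over all occurrences of `pattern` in `seq`, or None if absent."""
    # best[k] = highest utility found so far for matching the first k pattern itemsets
    best: List[Optional[int]] = [0] + [None] * len(pattern)
    for itemset in seq:
        # Go downward in k so one sequence itemset matches at most one pattern itemset.
        for k in range(len(pattern), 0, -1):
            prev = best[k - 1]
            if prev is None or not all(i in itemset for i in pattern[k - 1]):
                continue
            gain = prev + sum(itemset[i] for i in pattern[k - 1])
            if best[k] is None or gain > best[k]:
                best[k] = gain
    return best[len(pattern)]


def pattern_utility(pattern: Pattern, db: List[QSequence]) -> int:
    """Sum of the per-sequence utilities over the sequences that contain the pattern."""
    utilities = (utility_in_sequence(pattern, seq) for seq in db)
    return sum(u for u in utilities if u is not None)


def database_utility(db: List[QSequence]) -> int:
    """Total utility of every item occurrence in the database."""
    return sum(sum(itemset.values()) for seq in db for itemset in seq)


def is_high_utility(pattern: Pattern, db: List[QSequence], ratio: float) -> bool:
    """True if the pattern's utility reaches `ratio` times the database utility."""
    return pattern_utility(pattern, db) >= ratio * database_utility(db)


# Toy database (hypothetical values).
db: List[QSequence] = [
    [{"a": 2, "b": 3}, {"a": 4}, {"c": 1}],
    [{"a": 1}, {"b": 5, "c": 2}],
    [{"b": 2}, {"c": 4}],
]
pattern_ac: Pattern = [frozenset({"a"}), frozenset({"c"})]
pattern_ab: Pattern = [frozenset({"a", "b"})]

print(pattern_utility(pattern_ac, db))        # 8  (5 from the first sequence, 3 from the second)
print(is_high_utility(pattern_ac, db, 0.3))   # True:  8 >= 0.3 * 24
print(is_high_utility(pattern_ab, db, 0.3))   # False: 5 <  0.3 * 24
```

Enumerating and scoring every candidate this way is exactly the cost that grows out of control at low thresholds, which is why upper bounds such as TRSU and pruning strategies matter.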

research
04/28/2019

Fast Utility Mining on Complex Sequences

High-utility sequential pattern mining is an emerging topic in the field...
research
04/16/2019

ProUM: Projection-based Utility Mining on Sequence Data

In recent decade, utility mining has attracted a great attention, but mo...
research
11/26/2020

On-shelf Utility Mining of Sequence Data

Utility mining has emerged as an important and interesting topic owing t...
research
11/26/2020

TKUS: Mining Top-K High-Utility Sequential Patterns

High-utility sequential pattern mining (HUSPM) has recently emerged as a...
research
08/27/2022

A Generic Algorithm for Top-K On-Shelf Utility Mining

On-shelf utility mining (OSUM) is an emerging research direction in data...
research
08/26/2022

Itemset Utility Maximization with Correlation Measure

As an important data mining technology, high utility itemset mining (HUI...
research
06/28/2021

TOPIC: Top-k High-Utility Itemset Discovering

Utility-driven itemset mining is widely applied in many real-world scena...
