A Generic Algorithm for Top-K On-Shelf Utility Mining

08/27/2022
by Jiahui Chen, et al.

On-shelf utility mining (OSUM) is an emerging research direction in data mining. It aims to discover itemsets that have high relative utility during the time periods in which they are on sale. Compared with traditional utility mining, OSUM can find more practical and meaningful patterns in real-life applications. However, traditional OSUM has a major drawback: it is hard for ordinary users to choose a minimum utility threshold, minutil, that yields the right number of on-shelf high-utility itemsets. If the threshold is set too high, too few patterns are found. If it is set too low, too many patterns are discovered, which wastes running time and memory. To address this issue, the user can instead specify a parameter k, so that only the k itemsets with the highest relative utility are returned. In this paper, we therefore propose a generic algorithm named TOIT for mining Top-k On-shelf hIgh-utility paTterns. TOIT applies a novel strategy to raise minutil based on the on-shelf datasets. In addition, two novel upper bounds, named subtree utility and local utility, are used to prune the search space. With these strategies, TOIT narrows the search space as early as possible, which improves mining efficiency and reduces memory consumption. A series of experiments was conducted on real datasets with different characteristics to compare TOIT with the state-of-the-art KOSHU algorithm. The experimental results show that TOIT outperforms KOSHU in both running time and memory consumption.
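The idea of replacing a user-chosen minutil with a parameter k can be illustrated with a generic top-k buffer. The sketch below is not the TOIT threshold-raising strategy, which the abstract does not detail; it only shows the usual bookkeeping, assuming candidate itemsets arrive together with an already computed relative utility. The class name `TopKBuffer`, its methods, and the sample scores are hypothetical and chosen for illustration.

```python
import heapq


class TopKBuffer:
    """Keep the k best (score, pattern) pairs and expose a rising threshold.

    Hypothetical helper for illustration; not taken from the TOIT paper.
    """

    def __init__(self, k):
        if k < 1:
            raise ValueError("k must be at least 1")
        self.k = k
        self._heap = []      # min-heap of (score, pattern); root is the k-th best
        self.minutil = 0.0   # internal threshold: starts at 0 and only rises

    def offer(self, pattern, score):
        """Try to add a candidate; return True if it was kept."""
        # Once k patterns are held, anything at or below the threshold is rejected.
        if len(self._heap) == self.k and score <= self.minutil:
            return False
        entry = (score, tuple(sorted(pattern)))
        if len(self._heap) < self.k:
            heapq.heappush(self._heap, entry)
        else:
            # Evict the current k-th best and insert the new candidate.
            heapq.heapreplace(self._heap, entry)
        if len(self._heap) == self.k:
            # Raise the threshold to the score of the k-th best pattern.
            self.minutil = self._heap[0][0]
        return True

    def results(self):
        """Return the kept patterns, best first."""
        return sorted(self._heap, reverse=True)


# Made-up candidates with made-up relative utilities.
buf = TopKBuffer(k=2)
candidates = [({"a"}, 0.30), ({"b"}, 0.10), ({"a", "b"}, 0.45), ({"c"}, 0.05)]
for pattern, score in candidates:
    buf.offer(pattern, score)

print(buf.minutil)
print(buf.results())
```

In a real miner, the rising `minutil` would be compared against upper bounds on each branch of the search space, so that branches that cannot reach the current top k are skipped early.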


research
12/29/2022

HUSP-SP: Faster Utility Mining on Sequence Data

High-utility sequential pattern mining (HUSPM) has emerged as an importa...
research
12/25/2019

Utility Mining Across Multi-Sequences with Individualized Thresholds

Utility-oriented pattern mining has become an emerging topic since it ca...
research
06/28/2021

THUE: Discovering Top-K High Utility Episodes

Episode discovery from an event is a popular framework for data mining t...
research
11/26/2020

TKUS: Mining Top-K High-Utility Sequential Patterns

High-utility sequential pattern mining (HUSPM) has recently emerged as a...
research
11/26/2020

On-shelf Utility Mining of Sequence Data

Utility mining has emerged as an important and interesting topic owing t...
research
06/09/2022

Towards Target High-Utility Itemsets

For applied intelligence, utility-driven pattern discovery algorithms ca...
research
09/04/2018

A comparative study of top-k high utility itemset mining methods

High Utility Itemset (HUI) mining problem is one of the important proble...
