A Guided FP-growth algorithm for multitude-targeted mining of big data

03/18/2018
by Lior Shabtay, et al.

In this paper we present the GFP-growth (Guided FP-growth) algorithm, a novel method for multitude-targeted mining: computing the frequency counts of a given, large list of itemsets in a large dataset. GFP-growth is designed to focus on the specific multitude of itemsets of interest and to optimize time and memory costs. We prove that GFP-growth yields the exact frequency counts for the required itemsets. We show that, for a number of different problems, a solution can be devised that exploits an efficient implementation of multitude-targeted mining to boost performance. In particular, we study in detail the problem of generating minority-class rules from imbalanced data, a scenario that appears in many real-life domains such as medical applications, failure prediction, network and cyber security, and maintenance. We develop the Minority-Report Algorithm, which uses GFP-growth to boost performance. We prove some theoretical properties of the Minority-Report Algorithm and demonstrate its performance gain using simulations and real data.
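To make the problem statement concrete, the following is a minimal brute-force sketch of multitude-targeted mining: given a list of target itemsets, count how many transactions contain each one. It is not GFP-growth and does not use an FP-tree; it only illustrates the exact counts that a targeted miner is expected to return. The function name `count_target_itemsets` and the toy data are assumptions made for illustration.

```python
def count_target_itemsets(transactions, targets):
    """Return {target itemset: number of transactions containing it}.

    Naive reference baseline (hypothetical helper, not the paper's method):
    every transaction is checked against every target itemset.
    """
    target_sets = [frozenset(t) for t in targets]
    counts = {t: 0 for t in target_sets}
    for transaction in transactions:
        items = set(transaction)
        for target in target_sets:
            # A transaction supports a target if it contains all its items.
            if target <= items:
                counts[target] += 1
    return counts


# Toy data, invented for illustration only.
transactions = [
    ["a", "b", "c"],
    ["a", "c"],
    ["b", "c", "d"],
    ["a", "b", "c", "d"],
]
targets = [("a", "c"), ("b", "d"), ("a", "d", "e")]

result = count_target_itemsets(transactions, targets)
for itemset, count in result.items():
    print(sorted(itemset), count)
```

This baseline costs time proportional to the number of transactions times the number of targets, which is the kind of cost a guided approach aims to avoid when both are large.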

research
03/18/2018

A Guided FP-growth algorithm for fast mining of frequent itemsets from big data

In this paper we present the GFP-growth (Guided FP-growth) algorithm, a ...
research
01/12/2019

Learning of High Dengue Incidence with Clustering and FP-Growth Algorithm using WHO Historical Data

This paper applies FP-Growth algorithm in mining fuzzy association rules...
research
10/30/2021

TargetUM: Targeted High-Utility Itemset Querying

Traditional high-utility itemset mining (HUIM) aims to determine all hig...
research
06/09/2022

Towards Target Sequential Rules

In many real-world applications, sequential rule mining (SRM) can provid...
research
02/06/2022

Memory Efficient Tries for Sequential Pattern Mining

The rapid and continuous growth of data has increased the need for scala...
research
04/18/2018

A Parallel/Distributed Algorithmic Framework for Mining All Quantitative Association Rules

We present QARMA, an efficient novel parallel algorithm for mining all Q...
