Characterizing Transactional Databases for Frequent Itemset Mining

11/09/2020
by Christian Lezcano, et al.

This paper presents a study of the characteristics of transactional databases used in frequent itemset mining. Such characterizations have typically been used to benchmark and understand the data mining algorithms that work on these databases. The aim of our study is to give a picture of how diverse and representative these benchmarking databases are, both in general and in the context of particular empirical studies found in the literature. Our proposed list of metrics includes many of the existing metrics found in the literature, as well as new ones. Our study shows that this list of metrics captures much of the datasets' inner complexity and thus provides a good basis for the characterization of transactional datasets. Finally, based on our characterization, we provide a set of representative datasets that may safely be used as a benchmark.
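To make the idea of a characterization metric concrete, here is a minimal sketch in Python that computes a few simple statistics of a transactional database. The abstract does not enumerate the paper's metrics, so the ones shown (number of transactions, number of distinct items, average transaction length, and density, taken here as average transaction length divided by the number of distinct items) are assumptions chosen for illustration. The function name `characterize` and the toy database are hypothetical.

```python
from typing import Dict, Hashable, Iterable, List, Set


def characterize(transactions: Iterable[Iterable[Hashable]]) -> Dict[str, float]:
    """Compute a few illustrative statistics of a transactional database.

    These metrics are assumptions for illustration only; they are not
    taken from the paper's own list of metrics.
    """
    # Treat each transaction as a set, so repeated items count once.
    txs: List[Set[Hashable]] = [set(t) for t in transactions]
    n_transactions = len(txs)

    # The item universe is the union of all transactions.
    items: Set[Hashable] = set()
    for t in txs:
        items |= t
    n_items = len(items)

    # Average number of items per transaction (0.0 for an empty database).
    avg_length = sum(len(t) for t in txs) / n_transactions if n_transactions else 0.0

    # Density: fraction of the item universe present in an average transaction.
    density = avg_length / n_items if n_items else 0.0

    return {
        "n_transactions": float(n_transactions),
        "n_items": float(n_items),
        "avg_length": avg_length,
        "density": density,
    }


# Hypothetical toy database of four transactions over items a-d.
toy_db = [["a", "b", "c"], ["a", "c"], ["b"], ["a", "b", "c", "d"]]
stats = characterize(toy_db)
print(stats)
```

Under this definition, values of density near 1 indicate a dense database and values near 0 a sparse one, which is one way that benchmark datasets can be compared at a glance.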

research
06/04/2009

Mining Compressed Repetitive Gapped Sequential Patterns Efficiently

Mining frequent sequential patterns from sequence databases has been a c...
research
11/07/2022

Using Set Covering to Generate Databases for Holistic Steganalysis

Within an operational framework, covers used by a steganographer are lik...
research
01/23/2019

Boosting Frequent Itemset Mining via Early Stopping Intersections

Mining frequent itemsets from a transaction database has emerged as a fu...
research
06/02/2022

Approximate Network Motif Mining Via Graph Learning

Frequent and structurally related subgraphs, also known as network motif...
research
12/27/2021

An efficient mining scheme for high utility itemsets

Knowledge discovery in databases aims at finding useful information, whi...
research
09/16/2021

Frequent Itemset Mining with Multiple Minimum Supports: a Constraint-based Approach

The problem of discovering frequent itemsets including rare ones has rec...
research
11/15/2018

Cybercasing 2.0: You Get What You Pay For

Under U.S. law, marketing databases exist under almost no legal restrict...
