Discovery data topology with the closure structure. Theoretical and practical aspects

10/06/2020
by Tatiana Makhalova, et al.

In this paper, we revisit pattern mining, and itemset mining in particular, which allows one to analyze binary datasets in search of interesting and meaningful association rules and their respective itemsets in an unsupervised way. Because a summary of a dataset based on a set of patterns does not provide a general and satisfying view of the dataset, we introduce a concise representation, the closure structure, based on closed itemsets and their minimum generators, for capturing the intrinsic content of a dataset. The closure structure allows one to understand the topology of the dataset as a whole and the inherent complexity of the data. We propose a formalization of the closure structure in terms of Formal Concept Analysis, which is well suited to the study of this data topology. We present and prove theoretical results, as well as practical results obtained with the GDPM algorithm. GDPM is distinctive in that it returns a characterization of the topology of a dataset in terms of complexity levels, highlighting the diversity and the distribution of the itemsets. Finally, a series of experiments shows how GDPM can be used in practice and what can be expected from its output.
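To make the notions of closed itemset and minimum generator concrete, the following Python sketch enumerates the closed itemsets of a tiny binary dataset by brute force and records, for each one, the size of its smallest generator. Reading that size as the "complexity level" of a closed itemset is an assumption drawn from the abstract, and the function names `closure` and `closure_levels` are hypothetical. This is an exponential-time illustration only, not the GDPM algorithm.

```python
from itertools import combinations


def closure(itemset, transactions):
    """Return the items shared by every transaction containing `itemset`.

    Returns None when no transaction contains the itemset (illustrative
    choice: unsupported itemsets are simply ignored in this sketch).
    """
    covering = [t for t in transactions if itemset <= t]
    if not covering:
        return None
    return frozenset.intersection(*covering)


def closure_levels(transactions):
    """Map each closed itemset to the size of its smallest generator."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted(frozenset().union(*transactions))
    levels = {}
    # Candidates are visited by increasing size, so the first candidate
    # whose closure equals a given closed itemset is a minimum generator.
    for size in range(len(items) + 1):
        for candidate in combinations(items, size):
            closed = closure(frozenset(candidate), transactions)
            if closed is not None:
                levels.setdefault(closed, size)
    return levels


# Toy dataset (invented for illustration): each set is one row of a binary table.
dataset = [{"a", "b"}, {"a", "b", "c"}, {"a", "c"}, {"b"}]
for closed, level in sorted(closure_levels(dataset).items(),
                            key=lambda kv: (kv[1], sorted(kv[0]))):
    print(level, sorted(closed))
```

In this toy dataset the itemset {c} already generates the closed itemset {a, c}, so {a, c} sits at level 1, whereas {a, b, c} needs the two-item generator {b, c} and sits at level 2.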

