A framework for redescription set construction

06/13/2016
by Matej Mihelčić, et al.

Redescription mining is a field of knowledge discovery that aims to find different descriptions of similar subsets of instances in the data. These descriptions are represented as rules inferred from one or more disjoint sets of attributes, called views. As such, they support the knowledge discovery process and help domain experts formulate new hypotheses or construct new knowledge bases and decision support systems. In contrast to previous approaches, which typically create one smaller set of redescriptions satisfying a pre-defined set of constraints, we introduce a framework that creates a large and heterogeneous redescription set from which the user or expert can extract compact sets with differing properties, according to their own preferences. Construction of the large and heterogeneous redescription set relies on the CLUS-RM algorithm and a novel conjunctive refinement procedure that facilitates the generation of larger and more accurate redescription sets. The work also introduces the variability of redescription accuracy when missing values are present in the data, which significantly extends the applicability of the method. A crucial part of the framework is redescription set extraction based on a heuristic multi-objective optimization procedure that allows the user to define importance levels for one or more redescription quality criteria. We provide both a theoretical and an empirical comparison of the novel framework against current state-of-the-art redescription mining algorithms and show that it represents a more efficient and versatile approach for mining redescriptions from data.
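To make the two central notions of the abstract concrete, the following is a minimal sketch, not the procedure used in the paper. It assumes that redescription accuracy is measured by the Jaccard index of the instance sets described by the two rules (the usual choice in redescription mining), and it stands in for the heuristic multi-objective extraction with a plain weighted sum over quality criteria. The function names, the `coverage` criterion, and all data values are hypothetical.

```python
def jaccard(support_a, support_b):
    """Jaccard index of two instance sets: |A & B| / |A | B|."""
    a, b = set(support_a), set(support_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def weighted_score(criteria, weights):
    """Weighted sum of quality criteria (an assumed, simplified scoring rule)."""
    return sum(weights[name] * criteria[name] for name in weights)


def select_top(candidates, weights, k):
    """Return the names of the k candidates with the highest weighted score."""
    ranked = sorted(
        candidates,
        key=lambda name: weighted_score(candidates[name], weights),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical toy data: instances described by a rule from each of two views.
view1_support = {1, 2, 3, 4}
view2_support = {2, 3, 4, 5}
accuracy = jaccard(view1_support, view2_support)  # 3 shared of 5 total

# Hypothetical candidate redescriptions with two quality criteria in [0, 1].
candidates = {
    "r1": {"accuracy": 0.9, "coverage": 0.2},
    "r2": {"accuracy": 0.6, "coverage": 0.8},
    "r3": {"accuracy": 0.7, "coverage": 0.5},
}

# Different user-defined importance levels yield different compact sets.
prefer_accuracy = select_top(candidates, {"accuracy": 0.8, "coverage": 0.2}, k=2)
prefer_coverage = select_top(candidates, {"accuracy": 0.2, "coverage": 0.8}, k=2)
```

In this toy setting, shifting the weights from accuracy to coverage changes which redescriptions are extracted from the same large set, which is the behaviour the abstract attributes to user-defined importance levels.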

