Separate and conquer heuristic allows robust mining of contrast sets from various types of data

04/01/2022
by Adam Gudyś, et al.

Identifying differences between groups is one of the most important knowledge discovery problems. The procedure, also known as contrast set mining, is applied in a wide range of areas such as medicine, industry, and economics. In this paper we present RuleKit-CS, an algorithm for contrast set mining based on sequential covering, a well-established heuristic for decision rule induction. Unlike standard sequential covering, RuleKit-CS makes multiple passes accompanied by an attribute penalization scheme, which allows it to generate contrast sets that describe the same examples with different attributes. The ability to identify contrast sets in regression and survival data sets, a feature not provided by existing algorithms, further extends the usability of RuleKit-CS. Experiments on a wide range of data sets confirmed RuleKit-CS to be a useful tool for discovering differences between defined groups. The algorithm is part of the RuleKit suite, available on GitHub under the GNU AGPL 3 licence (https://github.com/adaa-polsl/RuleKit).

Keywords: Contrast sets, Sequential covering, Rule induction, Regression, Survival, Knowledge discovery
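
The idea of multi-pass sequential covering with attribute penalization can be illustrated with a minimal Python sketch. This is not the RuleKit-CS implementation and does not use the RuleKit API: the function names (covers, precision, grow_rule, mine_contrast_sets), the precision-based quality measure, the multiplicative penalty, and the toy data are all assumptions chosen for illustration. The sketch only handles categorical attributes and a single target group.

```python
from collections import Counter


def covers(rule, row):
    """A rule is a dict {attribute: required value}; the empty rule covers everything."""
    return all(row[attr] == val for attr, val in rule.items())


def precision(rule, rows, labels, group):
    """Fraction of covered examples that belong to the target group (toy quality measure)."""
    hit = [lab for row, lab in zip(rows, labels) if covers(rule, row)]
    return sum(lab == group for lab in hit) / len(hit) if hit else 0.0


def grow_rule(rows, labels, group, uncovered, usage, penalty):
    """Greedily add conditions; candidates are ranked by precision discounted by attribute usage."""
    rule = {}
    rule_prec = precision(rule, rows, labels, group)  # base rate of the group
    while rule_prec < 1.0:
        candidates = {(a, rows[i][a]) for i in uncovered for a in rows[i] if a not in rule}
        best = None
        for attr, val in sorted(candidates):
            trial = {**rule, attr: val}
            # The rule must still cover at least one uncovered example of the group.
            if not any(covers(trial, rows[i]) for i in uncovered):
                continue
            prec = precision(trial, rows, labels, group)
            if prec <= rule_prec:
                continue  # the condition does not improve the rule
            # Assumed penalization: each earlier use of an attribute discounts its score.
            score = prec * (1.0 - penalty) ** usage[attr]
            if best is None or score > best[0]:
                best = (score, prec, attr, val)
        if best is None:
            break
        _, rule_prec, attr, val = best
        rule[attr] = val
    return rule


def mine_contrast_sets(rows, labels, group, passes=2, penalty=0.5):
    """Run sequential covering several times; attribute usage persists across passes."""
    usage = Counter()
    found = []
    positives = {i for i, lab in enumerate(labels) if lab == group}
    for _ in range(passes):
        uncovered = set(positives)  # every pass starts from the full group again
        while uncovered:
            rule = grow_rule(rows, labels, group, uncovered, usage, penalty)
            if not rule:
                break
            # "Separate": drop the examples the new rule covers, then continue.
            uncovered -= {i for i in uncovered if covers(rule, rows[i])}
            usage.update(rule.keys())
            if rule not in found:
                found.append(rule)
    return found


# Toy data (hypothetical): both "color" and "shape" separate group A from group B.
rows = [
    {"color": "red", "shape": "round", "size": "s"},
    {"color": "red", "shape": "round", "size": "l"},
    {"color": "blue", "shape": "square", "size": "s"},
    {"color": "blue", "shape": "square", "size": "l"},
]
labels = ["A", "A", "B", "B"]

print(mine_contrast_sets(rows, labels, "A", passes=2, penalty=0.5))
```

On this toy data the first pass describes group A by color, and the second pass, with color now penalized, describes the same examples by shape. With penalty=0.0 the second pass would simply rediscover the color rule, which corresponds to the behaviour of plain sequential covering that the abstract contrasts against.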


