Flexibly Mining Better Subgroups

10/28/2015
by   Hoang-Vu Nguyen, et al.

In subgroup discovery, also known as supervised pattern mining, discovering high-quality one-dimensional subgroups and their refinements is a crucial task. For nominal attributes, this is relatively straightforward, as we can treat individual attribute values as binary features. For numerical attributes, the task is more challenging because individual numeric values do not provide reliable statistics. Instead, we can consider combinations of adjacent values, i.e. bins. Existing binning strategies, however, are not tailored for subgroup discovery. That is, they do not directly optimize for the quality of subgroups, thereby potentially degrading the mining result. To address this issue, we propose FLEXI. In short, with FLEXI we propose to use optimal binning to find high-quality binary features for both numeric and ordinal attributes. We instantiate FLEXI with various quality measures and show how to achieve efficiency accordingly. Experiments on both synthetic and real-world data sets show that FLEXI outperforms the state of the art, with up to a 25-fold improvement in subgroup quality.
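To make the idea of bins as binary features concrete, here is a minimal sketch. It is not the FLEXI algorithm: it scores every interval of adjacent values by brute force rather than by optimal binning, and it assumes a binary target and weighted relative accuracy (WRAcc) as the quality measure, neither of which is specified in the abstract. The function names `wracc` and `best_interval` are hypothetical.

```python
from typing import List, Optional, Tuple


def wracc(member: List[bool], target: List[bool]) -> float:
    """Weighted relative accuracy of a binary feature against a binary target.

    WRAcc = coverage * (positive rate inside the subgroup - overall positive rate).
    Assumption: WRAcc is used here only as one common quality measure.
    """
    n = len(target)
    covered = sum(member)
    if n == 0 or covered == 0:
        return 0.0
    pos_all = sum(target) / n
    pos_cov = sum(t for m, t in zip(member, target) if m) / covered
    return (covered / n) * (pos_cov - pos_all)


def best_interval(
    values: List[float], target: List[bool]
) -> Tuple[float, Optional[float], Optional[float]]:
    """Score every closed interval [lo, hi] of adjacent distinct values.

    Each interval acts as a bin, i.e. a binary feature "lo <= value <= hi".
    Brute force over all pairs of distinct values; illustrative only.
    """
    cuts = sorted(set(values))
    best: Tuple[float, Optional[float], Optional[float]] = (float("-inf"), None, None)
    for i, lo in enumerate(cuts):
        for hi in cuts[i:]:
            member = [lo <= v <= hi for v in values]
            score = wracc(member, target)
            if score > best[0]:
                best = (score, lo, hi)
    return best


values = [1, 2, 3, 4, 5, 6, 7, 8]
target = [False, False, True, True, True, False, False, False]
score, lo, hi = best_interval(values, target)
print(lo, hi, score)
```

In this toy data no single value covers all positives, but the bin spanning the three adjacent values 3 to 5 does, which is why combining adjacent values can give a better subgroup than treating each numeric value separately.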

research
08/28/2015

Mining Combined Causes in Large Data Sets

In recent years, many methods have been developed for detecting causal r...
research
07/09/2021

Redescription Model Mining

This paper introduces Redescription Model Mining, a novel approach to id...
research
05/26/2023

Towards Open-World Product Attribute Mining: A Lightly-Supervised Approach

We present a new task setting for attribute mining on e-commerce product...
research
10/12/2017

Subjectively Interesting Subgroup Discovery on Real-valued Targets

Deriving insights from high-dimensional data is one of the core problems...
research
02/07/2019

Probably the Best Itemsets

One of the main current challenges in itemset mining is to discover a sm...
research
08/01/2017

Enhancing the Input Representation: From Complexity to Simplicity

We introduce an efficient algorithm for mining informative combinations ...
research
11/24/2011

Revisiting Numerical Pattern Mining with Formal Concept Analysis

In this paper, we investigate the problem of mining numerical data in th...
