Grafting for Combinatorial Boolean Model using Frequent Itemset Mining

11/07/2017
by Taito Lee, et al.

This paper introduces the combinatorial Boolean model (CBM), defined as the class of linear combinations of conjunctions of Boolean attributes, and addresses the problem of learning a CBM from labeled data. A CBM is highly interpretable, but learning it naively requires computation time that grows exponentially with the data dimension and sample size. To overcome this computational difficulty, we propose GRAB (GRAfting for Boolean datasets), an algorithm that efficiently learns a CBM within the L_1-regularized loss minimization framework. The key idea of GRAB is to reduce the loss minimization problem to weighted frequent itemset mining, in which frequent patterns can be computed efficiently. Using benchmark datasets, we empirically demonstrate that GRAB is effective in terms of computational efficiency, prediction accuracy, and knowledge discovery.
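To make the model class concrete, the following is a minimal Python sketch, not the authors' GRAB implementation. It evaluates a CBM as a weighted sum of conjunctions and computes the weighted support of an itemset, the quantity that weighted frequent itemset mining ranks patterns by. The function names, the toy data, and the per-sample weights are all hypothetical, and the brute-force enumeration is an assumption that stands in for a real itemset miner only to illustrate the idea on tiny inputs.

```python
from itertools import combinations

def conjunction(x, itemset):
    """Return 1 if every attribute indexed by itemset equals 1 in x, else 0."""
    return int(all(x[i] for i in itemset))

def cbm_score(x, weights, bias=0.0):
    """CBM output: bias plus the weighted sum of conjunction features.

    weights maps an itemset (tuple of attribute indices) to its coefficient.
    """
    return bias + sum(w * conjunction(x, s) for s, w in weights.items())

def weighted_support(X, sample_weights, itemset):
    """Sum of per-sample weights over the samples that contain the itemset."""
    return sum(a * conjunction(x, itemset) for x, a in zip(X, sample_weights))

def best_conjunction(X, sample_weights, max_size):
    """Brute-force stand-in for a miner: the itemset of at most max_size
    attributes with the largest absolute weighted support (exponential cost)."""
    d = len(X[0])
    candidates = (s for k in range(1, max_size + 1)
                  for s in combinations(range(d), k))
    return max(candidates,
               key=lambda s: abs(weighted_support(X, sample_weights, s)))

# Hypothetical toy data: four samples over three Boolean attributes.
X = [(1, 1, 0), (1, 0, 1), (1, 1, 1), (0, 1, 1)]
# Hypothetical per-sample weights (for example, signs of loss derivatives).
a = [1.0, -1.0, 1.0, -1.0]
# Hypothetical CBM with three conjunction terms.
model = {(0,): 0.5, (0, 1): -1.0, (1, 2): 2.0}

score = cbm_score((1, 1, 1), model, bias=0.25)
picked = best_conjunction(X, a, max_size=2)
```

In an L_1-regularized, grafting-style procedure, a conjunction found this way would be a candidate for receiving a nonzero coefficient. How GRAB actually weights samples and prunes the search is described in the full paper and is not reproduced here.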


research
10/31/2019

ALLSAT compressed with wildcards: Frequent Set Mining

Once the maximal frequent sets are known, the family of all frequent set...
research
07/09/2021

Redescription Model Mining

This paper introduces Redescription Model Mining, a novel approach to id...
research
06/15/2018

Mining Rank Data

The problem of frequent pattern mining has been studied quite extensivel...
research
04/21/2009

Ramp: Fast Frequent Itemset Mining with Efficient Bit-Vector Projection Technique

Mining frequent itemset using bit-vector representation approach is very...
research
02/15/2015

Fast and Memory-Efficient Significant Pattern Mining via Permutation Testing

We present a novel algorithm, Westfall-Young light, for detecting patter...
research
06/02/2022

Approximate Network Motif Mining Via Graph Learning

Frequent and structurally related subgraphs, also known as network motif...
research
10/12/2017

Subjectively Interesting Subgroup Discovery on Real-valued Targets

Deriving insights from high-dimensional data is one of the core problems...
