
Learning Sparse Optimal Rule Fit by Safe Screening

by Hiroki Kato et al.
Nagoya Institute of Technology

In this paper, we consider linear prediction models formed as a sparse linear combination of rules, where a rule is an indicator function defined over a hyperrectangle in the input space. Since the number of possible rules generated from a training dataset is extremely large, it has been difficult to consider all of them when fitting a sparse model. We propose Safe Optimal Rule Fit (SORF) as an approach to this problem, formulated as a convex optimization problem with sparse regularization. The proposed SORF method exploits the fact that the set of all possible rules can be represented as a tree. By extending a recently popularized convex optimization technique called safe screening, we develop a novel method for pruning the tree such that pruned nodes are guaranteed to be irrelevant to the prediction model. This approach allows us to efficiently learn a prediction model drawn from the exponentially large set of all possible rules. We demonstrate the usefulness of the proposed method through numerical experiments on several benchmark datasets.
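To make the setting concrete, here is a minimal sketch of the ingredients the abstract describes: rules as hyperrectangle indicators, a lasso fit over rule features, and a safe screening test that provably discards irrelevant rules before optimization. This is not the paper's SORF method or its tree-pruning bound; the screening step below uses the classic SAFE sphere test of El Ghaoui et al. for the lasso, and all function names and data are illustrative.

```python
import numpy as np

def rule_features(X, boxes):
    """Each rule is an indicator over a hyperrectangle: column j is 1 for
    the samples that fall inside box j (coordinate-wise bounds)."""
    return np.column_stack([
        np.all((X >= lo) & (X <= up), axis=1).astype(float)
        for lo, up in boxes
    ])

def safe_screen(Z, y, lam):
    """Classic SAFE sphere test (El Ghaoui et al.) for the lasso
    min_w 0.5*||y - Z w||^2 + lam*||w||_1: rule j is provably inactive
    (w_j = 0 at the optimum) whenever
    |z_j^T y| < lam - ||z_j|| * ||y|| * (lam_max - lam) / lam_max.
    Returns a boolean mask of the rules that survive screening."""
    corr = np.abs(Z.T @ y)
    lam_max = corr.max()
    thresh = (lam - np.linalg.norm(Z, axis=0) * np.linalg.norm(y)
              * (lam_max - lam) / lam_max)
    return corr >= thresh

def lasso_cd(Z, y, lam, n_sweeps=500):
    """Plain cyclic coordinate descent for the lasso objective above."""
    w = np.zeros(Z.shape[1])
    col_sq = (Z ** 2).sum(axis=0)
    r = y - Z @ w
    for _ in range(n_sweeps):
        for j in range(Z.shape[1]):
            if col_sq[j] == 0.0:
                continue
            r += Z[:, j] * w[j]              # remove old contribution
            rho = Z[:, j] @ r
            w[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            r -= Z[:, j] * w[j]              # add back new contribution
    return w

# Toy data: axis-aligned boxes over [0, 1]^2, signal carried by one rule.
rng = np.random.default_rng(0)
X = rng.uniform(size=(60, 2))
boxes = [(np.array([a, 0.0]), np.array([a + 0.3, 1.0]))
         for a in np.linspace(0.0, 0.7, 8)]
Z = rule_features(X, boxes)
y = 1.5 * Z[:, 2] + rng.normal(scale=0.1, size=60)

lam = 0.8 * np.abs(Z.T @ y).max()
keep = safe_screen(Z, y, lam)
w = lasso_cd(Z, y, lam)
print(f"{keep.sum()} of {len(boxes)} rules survive screening")
```

Here the candidate rules are enumerated explicitly; the point of SORF is precisely to avoid this enumeration by organizing all rules in a tree and applying a screening-style bound to prune whole subtrees at once.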


