Sparse (group) learning with Lipschitz loss functions: a unified analysis

10/20/2019
by   Antoine Dedieu, et al.

We study a family of sparse estimators defined as minimizers of an empirical Lipschitz loss function—a family which includes the hinge, logistic, and quantile regression losses—with a convex, sparse or group-sparse regularization. In particular, we consider the L1-norm on the coefficients, its sorted Slope version, and the Group L1-L2 extension. First, we propose a theoretical framework which simultaneously derives new L2 estimation upper bounds for all three regularization schemes. For the L1 and Slope regularizations, our bounds scale as (k^*/n) log(p/k^*)—where n × p is the size of the design matrix and k^* is the dimension of the theoretical loss minimizer β^*—matching the optimal minimax rate achieved in the least-squares case. For the Group L1-L2 regularization, our bounds scale as (s^*/n) log(G/s^*) + m^*/n—where G is the total number of groups and m^* is the number of coefficients in the s^* groups which contain β^*—and improve over the least-squares case. We additionally show that Group L1-L2 is superior to L1 and Slope when the signal is strongly group-sparse. Our bounds hold both in probability and in expectation, under assumptions common in the literature. Second, we propose an accelerated proximal algorithm which computes the convex estimators studied here when the number of variables is of the order of 100,000. We compare the statistical performance of our estimators against standard baselines in settings where the signal is either sparse or group-sparse. Our experimental findings reveal (i) the good empirical performance of L1 and Slope for sparse binary classification problems, (ii) the superiority of Group L1-L2 regularization for group-sparse classification problems, and (iii) the appealing properties of sparse quantile regression estimators for sparse regression problems with heteroscedastic noise.
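To make the objective concrete, below is a minimal sketch in Python of a plain (non-accelerated) proximal gradient solver for a smooth Lipschitz loss (logistic) combined with the L1 and Group L1-L2 penalties discussed above. This is not the authors' accelerated algorithm; the function names, step-size rule, regularization levels, and synthetic data are illustrative assumptions only.

# Minimal sketch (not the paper's accelerated solver): proximal gradient
# descent for average logistic loss with an L1 or Group L1-L2 penalty.
# All names (prox_l1, prox_group_l1_l2, lam, ...) are illustrative.
import numpy as np

def prox_l1(beta, t):
    """Soft-thresholding: prox of t * ||.||_1."""
    return np.sign(beta) * np.maximum(np.abs(beta) - t, 0.0)

def prox_group_l1_l2(beta, t, groups):
    """Block soft-thresholding: prox of t * sum_g ||beta_g||_2."""
    out = beta.copy()
    for g in groups:                          # each g is an index array
        norm_g = np.linalg.norm(beta[g])
        out[g] = 0.0 if norm_g <= t else (1.0 - t / norm_g) * beta[g]
    return out

def logistic_grad(beta, X, y):
    """Gradient of (1/n) sum_i log(1 + exp(-y_i x_i^T beta)), y in {-1, +1}."""
    z = y * (X @ beta)
    w = -y / (1.0 + np.exp(z))
    return X.T @ w / X.shape[0]

def proximal_gradient(X, y, lam, prox, n_iter=500):
    """Minimize average logistic loss + lam * penalty via ISTA-style steps."""
    n, p = X.shape
    step = 4.0 * n / (np.linalg.norm(X, 2) ** 2)   # 1/L, L = ||X||_2^2 / (4n)
    beta = np.zeros(p)
    for _ in range(n_iter):
        beta = prox(beta - step * logistic_grad(beta, X, y), step * lam)
    return beta

# Example usage on synthetic sparse binary-classification data.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 1000))
beta_star = np.zeros(1000)
beta_star[:10] = 1.0
y = np.sign(X @ beta_star + 0.1 * rng.standard_normal(200))

beta_l1 = proximal_gradient(X, y, lam=0.05, prox=prox_l1)
groups = [np.arange(10 * j, 10 * (j + 1)) for j in range(100)]
beta_grp = proximal_gradient(X, y, lam=0.1,
                             prox=lambda b, t: prox_group_l1_l2(b, t, groups))

The Slope penalty would be handled the same way, with the prox of the sorted L1 norm (computable via isotonic regression) in place of soft-thresholding; an accelerated (FISTA-style) variant adds a momentum step between iterations.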

