Implicit Regularization for Group Sparsity

01/29/2023
by Jiangyuan Li et al.

We study the implicit regularization of gradient descent towards structured sparsity via a novel neural reparameterization, which we call a diagonally grouped linear neural network. We show the following intriguing property of our reparameterization: gradient descent over the squared regression loss, without any explicit regularization, is biased towards solutions with a group-sparsity structure. In contrast to many existing works on implicit regularization, we prove that our training trajectory cannot be simulated by mirror descent. We analyze the gradient dynamics of the corresponding regression problem in the general noise setting and obtain minimax-optimal error rates. Compared to existing bounds for implicit sparse regularization using diagonal linear networks, our analysis with the new reparameterization shows improved sample complexity. In the degenerate case of size-one groups, our approach gives rise to a new algorithm for sparse linear regression. Finally, we demonstrate the efficacy of our approach with several numerical experiments.
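
To make the reparameterization concrete, the following is a minimal NumPy sketch of the kind of dynamics the abstract describes. The exact factorization, initialization scale alpha, step size, and iteration count below are illustrative assumptions, not the paper's precise construction: each group g of regression coefficients is over-parameterized as beta_g = u[g] * V[g], with a scalar u[g] shared across the group, and plain gradient descent is run on the squared loss with no explicit penalty.

```python
import numpy as np

rng = np.random.default_rng(0)

# Group-sparse ground truth: only the first of G groups is active.
n, G, p_g = 100, 10, 5              # samples, groups, coordinates per group
p = G * p_g
beta_star = np.zeros(p)
beta_star[:p_g] = 1.0
X = rng.standard_normal((n, p))
y = X @ beta_star + 0.01 * rng.standard_normal(n)

# Hypothetical grouped reparameterization: beta_g = u[g] * V[g], with a
# shared scalar per group and a small initialization scale alpha.
alpha = 1e-3
u = alpha * np.ones(G)
V = alpha * np.ones((G, p_g))

def effective_beta(u, V):
    # Effective regression coefficients induced by the factorization.
    return (u[:, None] * V).ravel()

lr = 1e-3                           # ad hoc step size for this sketch
for _ in range(20000):
    r = X @ effective_beta(u, V) - y                # residual
    g_beta = (X.T @ r / n).reshape(G, p_g)          # grad w.r.t. effective beta
    g_u = np.sum(g_beta * V, axis=1)                # chain rule through u[g]
    g_V = g_beta * u[:, None]                       # chain rule through V[g]
    u -= lr * g_u
    V -= lr * g_V

# Group norms of the recovered coefficients: the active group is large,
# the inactive groups stay near alpha**2, i.e., effectively zero.
beta_hat = effective_beta(u, V).reshape(G, p_g)
print(np.round(np.linalg.norm(beta_hat, axis=1), 3))
```

With a small initialization alpha and no explicit regularizer, the inactive groups' effective coefficients remain at roughly the alpha**2 scale while the active group grows to fit the signal, illustrating the group-sparse bias of gradient descent that the abstract claims.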

Related research

- Implicit Gradient Regularization (09/23/2020). Gradient descent can be surprisingly good at optimizing deep neural netw...
- Implicit Regularization Leads to Benign Overfitting for Sparse Linear Regression (02/01/2023). In deep learning, often the training process finds an interpolator (a so...
- Sharpness-Aware Minimization: An Implicit Regularization Perspective (02/23/2023). Sharpness-Aware Minimization (SAM) is a recent optimization framework ai...
- Sparse Unit-Sum Regression (07/10/2019). This paper considers sparsity in linear regression under the restriction...
- Implicit Regularization Properties of Variance Reduced Stochastic Mirror Descent (04/29/2022). In machine learning and statistical data analysis, we often run into obj...
- Implicit Regularization via Hadamard Product Over-Parametrization in High-Dimensional Linear Regression (03/22/2019). We consider Hadamard product parametrization as a change-of-variable (ov...
- Incremental Learning in Diagonal Linear Networks (08/31/2022). Diagonal linear networks (DLNs) are a toy simplification of artificial n...
