The Flip Side of the Reweighted Coin: Duality of Adaptive Dropout and Regularization

06/14/2021
by Daniel LeJeune et al.

Among the most successful methods for sparsifying deep (neural) networks are those that adaptively mask the network weights throughout training. By examining this masking, or dropout, in the linear case, we uncover a duality between such adaptive methods and regularization through the so-called "η-trick" that casts both as iteratively reweighted optimizations. We show that any dropout strategy that adapts to the weights in a monotonic way corresponds to an effective subquadratic regularization penalty, and therefore leads to sparse solutions. We obtain the effective penalties for several popular sparsification strategies, which are remarkably similar to classical penalties commonly used in sparse optimization. Considering variational dropout as a case study, we demonstrate similar empirical behavior between the adaptive dropout method and classical methods on the task of deep network sparsification, validating our theory.
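
To make the reweighting view concrete: the "η-trick" rewrites a subquadratic penalty as a minimum over quadratics, e.g. |w| = min_{η>0} (w²/(2η) + η/2), whose minimizer is η = |w|. Alternating a weighted ridge solve with this monotone η update gives an iteratively reweighted scheme whose fixed point solves the ℓ1 (lasso) problem. The sketch below is a minimal illustration of that generic mechanism in numpy, not the paper's algorithm; the function name, parameters, and demo data are all made up for illustration.

import numpy as np

def eta_trick_lasso(X, y, lam=1.0, n_iters=100, eps=1e-8):
    """Illustrative eta-trick sketch (not the paper's method).

    Uses |w_i| = min_{eta_i > 0} (w_i^2 / (2 eta_i) + eta_i / 2),
    so alternating minimization of
        0.5 * ||y - X w||^2 + lam * sum_i (w_i^2 / (2 eta_i) + eta_i / 2)
    over w (a per-coordinate weighted ridge solve) and over eta
    (eta_i = |w_i|) descends toward the lasso objective
        0.5 * ||y - X w||^2 + lam * ||w||_1.
    """
    n, d = X.shape
    eta = np.ones(d)
    XtX, Xty = X.T @ X, X.T @ y
    for _ in range(n_iters):
        # Weighted ridge solve: a small eta_i means strong shrinkage on w_i.
        w = np.linalg.solve(XtX + lam * np.diag(1.0 / (eta + eps)), Xty)
        # Monotone reweighting: coordinates driven toward zero stay there.
        eta = np.abs(w)
    return w

# Tiny demo on synthetic data with a 3-sparse ground truth.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 20))
w_true = np.zeros(20)
w_true[:3] = [2.0, -1.5, 1.0]
y = X @ w_true + 0.1 * rng.standard_normal(200)
print(np.round(eta_trick_lasso(X, y, lam=5.0), 3))  # near-zero off the support

The connection the abstract draws is that adaptive dropout plays the role of the η update: masking probabilities that adapt monotonically to the weights act like the per-coordinate reweighting above, so the effective penalty is subquadratic and the iterates are pushed toward sparse solutions.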


Related research:

12/15/2014 · On the Inductive Bias of Dropout
Dropout is a simple but effective technique for learning in neural netwo...

09/09/2022 · MaxMatch-Dropout: Subword Regularization for WordPiece
We present a subword regularization method for WordPiece, which uses a m...

04/13/2019 · Shakeout: A New Approach to Regularized Deep Neural Network Training
Recent years have witnessed the success of deep neural networks in deali...

01/19/2017 · Variational Dropout Sparsifies Deep Neural Networks
We explore a recently proposed Variational Dropout technique that provid...

04/06/2023 · Spectral Gap Regularization of Neural Networks
We introduce Fiedler regularization, a novel approach for regularizing n...

04/21/2023 · Effective Neural Network L_0 Regularization With BinMask
L_0 regularization of neural networks is a fundamental problem. In addit...

03/02/2020 · Fiedler Regularization: Learning Neural Networks with Graph Sparsity
We introduce a novel regularization approach for deep learning that inco...
