Regularization by Misclassification in ReLU Neural Networks

11/03/2021
by   Elisabetta Cornacchia, et al.

We study the implicit bias of ReLU neural networks trained by a variant of SGD where, at each step, the label is changed with probability p to a random label (label smoothing is a close variant of this procedure). Our experiments demonstrate that label noise propels the network to a sparse solution in the following sense: for a typical input, only a small fraction of neurons are active, and the firing pattern of the hidden layers is sparser. In fact, for some instances, an appropriate amount of label noise not only sparsifies the network but also reduces the test error. We then turn to the theoretical analysis of such sparsification mechanisms, focusing on the extremal case of p=1. We show that in this case the network withers as anticipated from experiments, but, surprisingly, in different ways that depend on the learning rate and the presence of bias, with either weights vanishing or neurons ceasing to fire.


