In deep learning, over-parametrization refers to the widely-adopted technique of using more parameters than necessary (Krizhevsky et al., 2012; Livni et al., 2014). Both computationally and statistically, over-parametrization is crucial for learning neural nets. Controlled experiments demonstrate that over-parametrization eases optimization by smoothing the non-convex loss surface (Livni et al., 2014; Sagun et al., 2017). Statistically, increasing model size without any regularization still improves generalization even after the model interpolates the data perfectly (Neyshabur et al., 2017b). This is surprising given the conventional wisdom on the trade-off between model capacity and generalization.
In the absence of an explicit regularizer, algorithmic regularization is likely the key contributor to good generalization. Recent works have shown that gradient descent finds the minimum norm solution fitting the data for problems including logistic regression, linearized neural networks, and matrix factorization (Soudry et al., 2018; Gunasekar et al., 2018b; Li et al., 2018; Gunasekar et al., 2018a; Ji & Telgarsky, 2018). Many of these proofs require a delicate analysis of the algorithm's dynamics, and some are not fully rigorous due to assumptions on the iterates. To the best of our knowledge, it is an open question to prove analogous results for even two-layer relu networks. (For example, the technique of Li et al. (2018) on two-layer neural nets with quadratic activations still falls within the realm of linear algebraic tools, which apparently do not suffice for other activations.)
We propose a different route towards understanding generalization: making the regularization explicit. The motivations are: 1) with an explicit regularizer, we can analyze generalization without fully understanding optimization; 2) it is unknown whether gradient descent provides additional implicit regularization beyond what explicit regularization already offers; 3) with a sufficiently weak regularizer, we can prove stronger results that apply to multi-layer neural nets with relu activations. Additionally, explicit regularization is perhaps more relevant to practice, because regularization is typically used when training real networks.
Concretely, we add a norm-based regularizer to the cross-entropy loss of a multi-layer feedforward neural network with relu activations. We show that if the regularizer is sufficiently weak, the global minimizer of the regularized objective achieves the maximum normalized margin among all models with the same architecture (Theorem 2.1). Informally, for models that perfectly classify the data, the margin is the smallest difference across all datapoints between the classifier score for the true label and the next best score, and the normalized margin is this margin after rescaling the parameters to unit norm. We are interested in the normalized margin because its inverse bounds the generalization error (see recent work (Bartlett et al., 2017; Neyshabur et al., 2017a, 2018) and our Theorem 3.1). Our work explains why optimizing the training loss can lead to parameters with a large margin and thus better generalization error.
At first glance, it might seem counterintuitive that decreasing the regularizer is the right approach. At a high level, we show that the regularizer only serves as a tiebreaker to steer the model towards choosing the largest normalized margin. Our proofs are simple, oblivious to the optimization procedure, and apply to any norm-based regularizer. We also show that an exact global minimum is unnecessary: if we approximate the minimum loss within a constant, we obtain the max-margin within a constant (Theorem 2.2).
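To make the tiebreaking intuition concrete, here is a minimal numpy sketch (our own illustration, not the paper's experiment) of the linear-predictor special case due to Rosset et al. (2004a, b) that we build on: minimizing weakly ℓ2-regularized logistic loss approximately recovers the maximum ℓ2-normalized-margin separator.

```python
import numpy as np

# Four separable points; the max ell_2-margin separator is w ∝ (1, 0),
# with normalized margin min_i y_i <x_i, w/||w||> = 2.
X = np.array([[2.0, 1.0], [2.0, -1.0], [-2.0, 1.0], [-2.0, -1.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

def grad(w, lam):
    # gradient of  mean_i log(1 + exp(-y_i <w, x_i>)) + lam * ||w||_2^2
    s = 1.0 / (1.0 + np.exp(y * (X @ w)))  # = sigmoid(-margin_i)
    return -(s * y) @ X / len(y) + 2.0 * lam * w

lam = 1e-4                   # weak regularizer
w = np.array([0.1, 0.5])     # deliberately misaligned initialization
for _ in range(40000):
    w -= 0.5 * grad(w, lam)

w_hat = w / np.linalg.norm(w)
normalized_margin = np.min(y * (X @ w_hat))
# For small lam the regularizer acts only as a tiebreaker, and the
# normalized margin approaches the maximum value 2.
```

Shrinking `lam` further moves `normalized_margin` even closer to the maximum, matching the λ → 0 limit of Theorem 2.1.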
We further study the margin of two-layer networks: let γ_m be the max normalized margin of a neural net with m hidden units (formally defined in Section 3.1). Let γ_∞ be the largest possible margin of an infinite two-layer network. We will show three properties of these margins:
In Theorem 3.2, we show that the optimal normalized margin γ_m of two-layer networks is non-decreasing as the width m of the architecture grows, so the generalization error bound only improves with a wider network. Thus, even if the dataset is already separable, it could still be useful to increase the width to achieve a larger margin and better generalization. More formally, let n be the number of training examples. We additionally attain the maximum possible margin after over-parameterizing beyond n neurons: γ_m = γ_∞ once there are more hidden units than examples.
We compare the neural net margin to the standard margin for the kernel SVM on the same features. We design a simple data distribution (Figure 1) where the neural net margin is large but the kernel margin is small. This translates to a gap between the generalization error bounds for the two approaches and demonstrates the power of neural nets compared to kernel methods. We experimentally confirm that a gap does indeed exist.
In the context of bullet 2, our work is closely related to that of Rosset et al. (2007) and Neyshabur et al. (2014), who show that optimizing the loss over the parameters of a two-layer relu network is equivalent to optimizing the loss of a "convex neural net" parametrized by a distribution over hidden units. We go one step further and connect the weakly regularized training loss to the SVM.
We will also adopt this view of infinite-size neural networks to study how over-parametrization helps optimization. Prior works (Mei et al., 2018; Chizat & Bach, 2018; Sirignano & Spiliopoulos, 2018) show that gradient descent on two-layer networks becomes Wasserstein gradient flow over parameter distributions in the limit of infinite neurons. For this setting, we prove that perturbed Wasserstein gradient flow finds a global optimizer in polynomial time.
Finally, we empirically validate several of the claims made in this paper. First, we train a two-layer network on a one-dimensional classification task that is simple to visualize. In one dimension, it is possible to brute-force approximate the maximum neural network margin, and we show that training with a progressively smaller regularizer results in convergence to this margin. Second, we compare the generalization performance of neural networks and kernel methods and confirm that neural networks do achieve better generalization, as our theory predicts.
1.1 Additional Related Work
Zhang et al. (2016) and Neyshabur et al. (2017b) show that neural network generalization defies conventional explanations and requires new ones. One proposed explanation is the inductive bias of the training algorithm. Recent papers (Hardt et al., 2015; Brutzkus et al., 2017; Chaudhari et al., 2016) study inductive bias through training time and sharpness of local minima. Neyshabur et al. (2015a) propose a new steepest descent algorithm in a geometry invariant to weight rescaling and show that this improves generalization. Morcos et al. (2018) relate generalization in deep nets to the number of "directions" in the neurons. Other papers (Gunasekar et al., 2017; Soudry et al., 2018; Nacson et al., 2018; Gunasekar et al., 2018b; Li et al., 2018; Gunasekar et al., 2018a) study implicit regularization towards a specific solution. Ma et al. (2017) show that implicit regularization can help gradient descent avoid overshooting optima. Rosset et al. (2004a, b) study logistic regression with weak regularization and show convergence to the max-margin solution. We adopt their techniques and extend their results.
Recent works have also derived tighter Rademacher complexity bounds for deep neural networks (Neyshabur et al., 2015b; Bartlett et al., 2017; Neyshabur et al., 2017a; Golowich et al., 2017) and new compression based generalization properties (Arora et al., 2018b). Dziugaite & Roy (2017) manage to compute non-vacuous generalization bounds from PAC-Bayes bounds. Neyshabur et al. (2018) investigate the Rademacher complexity of two-layer networks. Liang & Rakhlin (2018) and Belkin et al. (2018) study the generalization of kernel methods.
On the optimization side, Soudry & Carmon (2016) explain why over-parametrization can remove bad local minima. Safran & Shamir (2016) show that over-parametrization can improve the quality of the random initialization. Haeffele & Vidal (2015), Nguyen & Hein (2017), and Venturi et al. (2018) show that for sufficiently overparametrized networks, all local minima are global, but do not show how to find these minima via gradient descent. Du & Lee (2018) show that for two-layer networks with quadratic activations, all second-order stationary points are global minimizers. Arora et al. (2018a) interpret over-parametrization as a means of implicit acceleration during optimization. Mei et al. (2018), Chizat & Bach (2018), and Sirignano & Spiliopoulos (2018) take a distributional view of over-parametrized networks. Chizat & Bach (2018) show that Wasserstein gradient flow converges to global optimizers under structural assumptions. We extend this to a polynomial-time result.
Let ℝ denote the set of real numbers. We will use ‖·‖ to indicate a general norm, with ‖·‖₁, ‖·‖₂, ‖·‖∞ denoting the ℓ₁, ℓ₂, ℓ∞ norms on finite-dimensional vectors, respectively, and ‖·‖_F denoting the Frobenius norm on a matrix. In general, we use a bar on top of a symbol to denote a unit vector: when applicable, x̄ = x/‖x‖, where the norm will be clear from context. Let S^{d−1} be the unit sphere in d dimensions. Let L^p(S^{d−1}) be the space of functions on S^{d−1} for which the p-th power of the absolute value is Lebesgue integrable. For α ∈ L^p(S^{d−1}), we overload notation and write ‖α‖_p for the corresponding L^p norm. Additionally, for α ∈ L¹(S^{d−1}) and β ∈ L^∞(S^{d−1}), or for α, β ∈ L²(S^{d−1}), we can define the inner product ⟨α, β⟩ as the integral of αβ over S^{d−1}.
Throughout this paper, we reserve the symbol X to denote the collection of datapoints (as a matrix) and Y to denote the labels. We use d to denote the dimension of our data. We often use Θ to denote the parameters of a prediction function f, and f(x; Θ) to denote the prediction of f on datapoint x.
We will use the notation ≲ and ≳ to mean less than or greater than up to a universal constant, respectively. Unless stated otherwise, we use c as a placeholder for some universal constant in upper and lower bounds. We will use poly(·) to denote some universal constant-degree polynomial in the arguments.
2 Weak Regularizer Guarantees Max Margin Solutions
In this section, we will show that when we add a weak regularizer to cross-entropy loss with a positive-homogeneous prediction function, the normalized margin of the optimum converges to some max-margin solution. As a concrete example, feedforward relu networks are positive-homogeneous.
Let l be the number of labels, so the i-th example has label y_i ∈ {1, …, l}. We work with a family of prediction functions f(·; Θ) that are a-positive-homogeneous in their parameters for some a > 0: f(x; cΘ) = c^a f(x; Θ) for all c > 0. We additionally require that f is continuous in Θ. For some general norm ‖·‖, we study the λ-regularized cross-entropy loss L_λ, defined as

L_λ(Θ) ≜ (1/n) Σ_{i=1}^n −log [ exp(f(x_i; Θ)_{y_i}) / Σ_{j=1}^l exp(f(x_i; Θ)_j) ] + λ‖Θ‖^r   (2.1)

for some fixed r > 0.
Define the max normalized margin as

γ* ≜ max_{‖Θ‖ ≤ 1} min_i [ f(x_i; Θ)_{y_i} − max_{j ≠ y_i} f(x_i; Θ)_j ],   (2.2)

and let Θ* be a parameter achieving this maximum. We show that with a sufficiently small regularization level λ, the normalized margin approaches the maximum margin γ*. Our theorem and proof are inspired by the result of Rosset et al. (2004a, b), who analyze the special case when f is a linear predictor. In contrast, our result can be applied to non-linear f as long as f is homogeneous.
Theorem 2.1. Assume the training data is separable by a network with an optimal normalized margin γ* > 0. Then the normalized margin of the global optimum of the weakly-regularized objective (equation 2.1) converges to γ* as the strength λ of the regularizer goes to zero. Mathematically, let γ_λ denote the normalized margin of the optimizer of L_λ, with γ* defined in equation 2.2. Then γ_λ → γ* as λ → 0.
An intuitive explanation for our result is as follows: because of the homogeneity, for a parameter Θ with direction Θ̄ = Θ/‖Θ‖, the loss roughly satisfies the following (for small λ, and ignoring problem parameters such as n):

L_λ(Θ) ≈ exp(−‖Θ‖^a · margin(Θ̄)) + λ‖Θ‖^r.

Thus, the loss focuses on choosing parameters with larger margin, and the regularization term biases the loss to select parameters with a smaller norm. The full proof of the theorem is deferred to Section A.1.
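The tie-breaking mechanism can be sketched as a two-step minimization (our own informal derivation, with γ(Θ̄) denoting the margin of the normalized parameters and the loss replaced by its dominant term):

```latex
\min_{\Theta} L_\lambda(\Theta)
  \;\approx\; \min_{c \ge 0}\, \min_{\|\bar\Theta\| = 1}\,
    \exp\!\big(-c^{a}\,\gamma(\bar\Theta)\big) + \lambda c^{r}
  \;=\; \min_{c \ge 0}\, \exp\!\big(-c^{a}\,\gamma^{*}\big) + \lambda c^{r},
```

since for every fixed scale c the inner minimum over directions is attained by a maximum-margin direction. The regularizer then only determines the optimal scale c, which grows as λ → 0.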
We can also provide an analogue of Theorem 2.1 for the binary classification setting. For this setting, our prediction is now a single real output and we train using logistic loss. We provide formal definitions and results in Section A.2. Our theory for two-layer neural networks (see Section 3) builds on this setting.
2.1 Optimization Accuracy
Since L_λ is typically hard to optimize exactly for neural nets, it would be ideal to relax the condition that Θ exactly minimizes L_λ. Thus, we ask: how accurately do we need to optimize L_λ to obtain a margin that approximates γ* up to a constant? The following theorem shows that it suffices to find Θ achieving a constant-factor multiplicative approximation of the minimum loss, provided λ is some sufficiently small polynomial in the problem parameters. Though our theorem is stated for the general multi-class setting, our result applies for binary classification as well. We provide the proof in Section A.3.
Theorem 2.2. In the setting of Theorem 2.1, suppose that we choose λ polynomially small, with a sufficiently large exponent (that only depends on the homogeneity a). Let Θ denote a 2-approximate minimizer of L_λ (the exact approximation constant is not important, so we choose 2 for simplicity), so L_λ(Θ) ≤ 2 min_{Θ'} L_λ(Θ'). Denote the normalized margin of Θ by γ. Then γ approximates γ* up to a constant factor.
3 Margins of Over-parameterized Two-layer Homogeneous Neural Nets
In Section 2 we showed that a weakly-regularized logistic loss leads to the maximum normalized margin. In this section, we analyze the properties of the max-margin of neural nets more closely. We will contrast neural networks with kernel methods, for which margins have already been extensively studied. Towards a first-cut understanding, we focus on two-layer networks for binary classification.
First, in Section 3.1 we provide a bound stating that the generalization error is roughly linear in the inverse of the margin, establishing that a larger margin implies better generalization. In Section 3.2, we show that the maximum normalized margin is non-decreasing with the hidden layer size and stays constant as soon as there are more hidden units than data points. This suggests that increasing the size of the network improves the generalization of the solution.
Second, in Section 3.3, we draw an analogy to classical kernel methods by proving that the maximum ℓ2-normalized margin of an over-parameterized neural net is equal to half the maximum possible ℓ1-normalized margin of linear functionals on a lifted feature space. In other words, we establish an equivalence between neural networks and the 1-norm SVM (Zhu et al., 2004) on the lifted features. These features are constructed by applying the activation function on all possible hidden layer weights.
Third, continuing this analogy, we will compare the generalization power of a two-layer neural network to that of a kernel method on the lifted space. This kernel method corresponds to fixing random weights for the hidden layer and solving a 2-norm max-margin problem on the top layer weights. We demonstrate instances where two layer neural networks give better generalization error guarantees than the kernel method.
3.1 Setup and Margin-based Generalization Error
In the rest of the paper, we work with two-layer neural networks with a single output for binary classification. We use m to denote the number of hidden units, w_1, …, w_m ∈ ℝ^d for the weight vectors in the first layer, and a_1, …, a_m ∈ ℝ for the weights in the second layer. We let Θ denote the collection of all the parameters. We assume in this section that the activation φ is 1-homogeneous and 1-Lipschitz. The network thus computes a single score

f(x; Θ) = Σ_{j=1}^m a_j φ(w_j⊤x).
We consider ℓ2 regularization from here on. The regularized logistic loss of the architecture with m hidden units is therefore

L_{λ,m}(Θ) ≜ (1/n) Σ_{i=1}^n log(1 + exp(−y_i f(x_i; Θ))) + λ‖Θ‖₂²,   (3.1)
where ‖Θ‖₂ denotes the Euclidean norm of all the parameters in Θ. We note that f and the regularizer are both 2-homogeneous in Θ, so the results of Section 2 apply to L_{λ,m}. (Although Theorem 2.1 is written in the language of multi-class prediction where the classifier outputs l scores, the results translate to single-output binary classification. See Section A.2.)
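The 2-homogeneity claim is easy to check numerically. Below is a sketch with our own variable names (first-layer rows `W`, second-layer weights `a`): since relu is 1-homogeneous, scaling every parameter by c scales the score by c².

```python
import numpy as np

def score(x, W, a):
    # two-layer network: f(x; Theta) = sum_j a_j * relu(<w_j, x>)
    return a @ np.maximum(W @ x, 0.0)

def reg_logistic_loss(X, y, W, a, lam):
    # average logistic loss plus lam times the squared ell_2 norm
    # of all parameters, as in the regularized objective above
    f = np.array([score(x, W, a) for x in X])
    return np.mean(np.log1p(np.exp(-y * f))) + lam * (np.sum(W**2) + np.sum(a**2))

rng = np.random.default_rng(0)
W, a, x = rng.normal(size=(5, 3)), rng.normal(size=5), rng.normal(size=3)

c = 3.0
# relu is 1-homogeneous, so f(x; c * Theta) = c^2 * f(x; Theta):
assert np.isclose(score(x, c * W, c * a), c**2 * score(x, W, a))
```

With this degree-2 homogeneity, the weak-regularization results of Section 2 apply with a = 2.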
Following our conventions from Section 2, we denote the optimizer of L_{λ,m} by Θ_{λ,m}, the normalized margin of Θ_{λ,m} by γ_{λ,m}, the max-margin solution by Θ*_m, and the max-margin by γ*_m. We emphasize the size m of the network in our notation. Since our classifier now predicts a single real value, we need to redefine the margin:

γ*_m ≜ max_{‖Θ‖₂ ≤ 1} min_i y_i f(x_i; Θ).

When the data is not separable by an m-unit neural net, γ*_m is zero by definition.
Recall that X denotes the matrix with all the datapoints as columns, and Y denotes the labels. We sample X and Y i.i.d. from the data-generating distribution P, which is supported on a set of bounded norm. We can define the population 0-1 loss and the training 0-1 loss of the network as

L(Θ) ≜ Pr_{(x,y)∼P}[sign(f(x; Θ)) ≠ y]  and  L̂(Θ) ≜ (1/n) Σ_{i=1}^n 1[sign(f(x_i; Θ)) ≠ y_i].
We will let B² denote the average squared norm of the datapoints and C an upper bound on the norm of a single datapoint. The following theorem shows that the generalization error depends on the parameters only through the inverse of the margin on the training data. We provide a proof in Section C.1.
Theorem 3.1. Suppose φ is 1-Lipschitz and 1-homogeneous. Then with probability at least 1 − δ over the draw of X and Y, for any Θ that separates the data with normalized margin γ > 0,

L(Θ) ≲ B/(γ√n) + ε,

where ε is a lower-order term. Note that ε is typically small, and thus the above bound mainly scales with B/(γ√n). As a corollary, with probability 1 − δ, the same bound holds for the margin obtained by weakly-regularized training as λ → 0. (The limiting margin as λ → 0 does not necessarily exist; the corollary should be interpreted as holding, up to an arbitrarily small additive term, for all sufficiently small λ.) Above we implicitly assume γ*_m > 0, since otherwise the right-hand side of the bound is vacuous.
One consequence of the above theorem and Theorem 2.2 is that if λ is polynomially small in the problem parameters, we only need to optimize L_{λ,m} up to a constant multiplicative factor to obtain parameters with generalization bounds roughly as good as those for the max-margin solution Θ*_m.
3.2 The max margin is non-decreasing in the hidden layer size
Now we show that the maximum normalized margin is nondecreasing with the hidden layer size and stays constant once we have more hidden units than examples.
Theorem 3.2. In the setting of Section 3.1, recall that γ*_m denotes the max normalized margin of a two-layer neural network with hidden layer size m. Then

γ*_m ≤ γ*_{m+1} for all m ≥ 1, and γ*_m stays constant for all m > n.
We note that γ*_m will be positive when φ is a sufficiently powerful activation such as relu or sigmoid and the datapoints are not repetitive, so the neural network can fit any labeling of the data. We prove Theorem 3.2 in Section B. Theorem 3.2 can explain why additional over-parametrization has been observed to improve generalization in two-layer networks (Neyshabur et al., 2017b). Our margin does not decrease with a larger network size, and therefore Theorem 3.1 gives a better generalization bound. We precisely characterize the value of the limiting margin in the following section.
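The non-decreasing part of Theorem 3.2 rests on a simple mechanism, which we can check numerically (a sketch of the embedding step only, not the paper's full proof): any m-unit network embeds into an (m+1)-unit network by appending a zero hidden unit, leaving both the score and the parameter norm unchanged, so the feasible set of the max-margin problem only grows with width.

```python
import numpy as np

def score(x, W, a):
    # f(x; Theta) = sum_j a_j * relu(<w_j, x>)
    return a @ np.maximum(W @ x, 0.0)

rng = np.random.default_rng(1)
m, d = 4, 3
W, a = rng.normal(size=(m, d)), rng.normal(size=m)
x = rng.normal(size=d)

# Pad with one zero hidden unit: same function, same ell_2 norm of Theta.
W_pad = np.vstack([W, np.zeros((1, d))])
a_pad = np.append(a, 0.0)

assert np.isclose(score(x, W_pad, a_pad), score(x, W, a))
assert np.isclose(np.sum(W_pad**2) + np.sum(a_pad**2),
                  np.sum(W**2) + np.sum(a**2))
# Hence every unit-norm m-unit network is also a unit-norm (m+1)-unit
# network, and the max normalized margin cannot decrease with width.
```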
3.3 The max margin of neural nets is equivalent to SVM in lifted space
We link infinite-size neural networks to the SVM over a lifted space, defined via a lifting function ϕ mapping a datapoint x to an infinite feature vector consisting of all possible hidden-unit activations:

ϕ(x) ≜ (φ(ū⊤x))_{ū ∈ S^{d−1}}.

We look at the margin of linear functionals α corresponding to this lifted space. The 1-norm SVM over the lifted features solves for the maximum margin:

max_{‖α‖₁ ≤ 1} min_i y_i ⟨α, ϕ(x_i)⟩,   (3.6)
where we rely on the inner product and 1-norm over functions on the sphere defined in Section 1.2. A priori, it is unclear how to optimize this, since the kernel trick does not work for the 1-norm. Here we will show that optimizing two-layer neural networks with weak regularization is equivalent to solving equation 3.6.
Rosset et al. (2007) and Neyshabur et al. (2014) show a similar equivalence, but between a lifted logistic regression problem and equation 3.1. In contrast, the above theorem, proved in Section B, shows the equivalence between equation 3.1 and the 1-norm SVM when the regularizer is small. (The factor of 2 arises from the relation that every unit-norm parameter corresponds to a functional in the lifted space with a correspondingly bounded 1-norm.)
3.4 Comparison to kernel methods
We compare the SVM margin, attainable by a finite neural network, to the margin attainable via kernel methods. Following the setup of Section 3.3, we define the kernel problem over the lifted features:

γ_kernel ≜ max_{‖α‖₂ ≤ 1} min_i y_i ⟨α, ϕ(x_i)⟩,   (3.8)

where the 2-norm constraint replaces the 1-norm constraint of equation 3.6. (We scale the problem appropriately to make the lemma statement below cleaner.) First, γ_kernel can be used to obtain a standard upper bound on the generalization error of the kernel SVM. Following the notation of Section 3.1, we will let L_kernel denote the 0-1 population classification error for the optimizer of equation 3.8.
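The 2-norm problem is tractable because the corresponding kernel has a closed form. With Gaussian rather than uniform random hidden weights (a standard variant differing from weights uniform on the sphere only by an overall scaling), the relu random-feature kernel is the degree-1 arc-cosine kernel of Cho & Saul; a quick Monte-Carlo check of that formula (our own illustration):

```python
import numpy as np

def arccos1_kernel(x, xp):
    # E_{u ~ N(0, I)}[relu(<u, x>) relu(<u, x'>)]
    #   = ||x|| ||x'|| (sin t + (pi - t) cos t) / (2 pi),  t = angle(x, x')
    nx, nxp = np.linalg.norm(x), np.linalg.norm(xp)
    t = np.arccos(np.clip(x @ xp / (nx * nxp), -1.0, 1.0))
    return nx * nxp * (np.sin(t) + (np.pi - t) * np.cos(t)) / (2 * np.pi)

rng = np.random.default_rng(0)
x, xp = np.array([1.0, 0.0]), np.array([0.0, 1.0])  # orthogonal pair

U = rng.normal(size=(200_000, 2))                   # random relu features
mc = np.mean(np.maximum(U @ x, 0.0) * np.maximum(U @ xp, 0.0))

# Both should be close to 1 / (2 pi) for orthogonal unit vectors.
assert abs(mc - arccos1_kernel(x, xp)) < 0.01
```

In practice the kernel SVM in our comparisons is implemented exactly this way: fix random hidden weights, then solve a 2-norm max-margin problem over the top-layer weights.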
The bound above follows from standard techniques (Bartlett & Mendelson, 2002), and we provide a full proof in Section C.1. We construct a data distribution for which this lemma does not give a good bound for kernel methods, but Theorem 3.1 does imply good generalization for two-layer networks.
Theorem 3.5. There exists a data distribution D for which the SVM with relu features has a large margin, and hence with probability 1 − δ over the choice of n i.i.d. samples from D obtains a small generalization error bound (up to a typically lower-order term). Meanwhile, with high probability the kernel SVM has a much smaller margin, and therefore the generalization upper bound from Lemma 3.4 is larger by a substantial factor.
Proof sketch for Theorem 3.5.
We base D on the distribution of examples described below. Here e_i is the i-th standard basis vector, and we use x[j] to represent the j-th coordinate of x (since the subscript is reserved to index training examples).
Figure 1 shows samples from D when there are 3 dimensions. From the visualization, it is clear that there is no linear separator for D. As Lemma D.1 shows, a relu network with four neurons can fit this relatively complicated decision boundary. On the other hand, for kernel methods, we prove that the symmetries in D induce cancellation in feature space. The following lemmas, proved in Section D.1, formalize this cancellation and show that it results in a small margin for kernel methods.
Lemma 3.6 (Margin upper bound tool).
In the setting of Theorem 3.5, we have
Combining these lemmas gives us the desired bound on γ_kernel.
Gap in regression setting:
We are able to prove an even larger gap between neural networks and kernel methods in the regression setting, where we wish to interpolate continuous labels. Analogously to the classification setting, optimizing a regularized squared-error loss on neural networks is equivalent to solving a minimum 1-norm regression problem (see Theorem D.3). Furthermore, kernel methods correspond to a minimum 2-norm problem. We construct distributions where the 1-norm solution has a small generalization error bound, whereas the 2-norm solution has a generalization error bound that does not decrease with the sample size and is thus vacuous. In Section D.2, we define the 1-norm and 2-norm regression problems. In Theorem D.6 we formalize our construction.
4 Perturbed Wasserstein gradient flow finds global optimizers in polynomial time
In the prior section, we studied the limiting behavior of the generalization of a two-layer network as its width goes to infinity. In this section, we will now study the limiting behavior of the optimization algorithm, gradient descent. Prior work (Mei et al., 2018; Chizat & Bach, 2018) has shown that as the hidden layer size grows to infinity, gradient descent for a finite neural network approaches the Wasserstein gradient flow over distributions of hidden units (defined in equation 4.1). Chizat & Bach (2018) also prove that Wasserstein gradient flow converges to a global optimizer in this setting but do not specify a convergence rate.
We show that a perturbed version of Wasserstein gradient flow converges in polynomial time. The informal take-away of this section is that a perturbed version of gradient descent converges in polynomial time on infinite-size neural networks (for the right notion of infinite size).
Formally, we optimize the following functional over distributions ρ on ℝ^{d+1}:

L[ρ] ≜ R(⟨Φ, ρ⟩) + ⟨V, ρ⟩,

where Φ maps the parameters of a single unit to a vector of per-example features, V is a regularizer on the unit's parameters, R is a convex loss, and ⟨·, ρ⟩ denotes the expectation under ρ. In this work, we consider 2-homogeneous Φ and V. We will additionally require that V is nonnegative and is positive on the unit sphere. Finally, we need standard regularity assumptions on R, Φ, and V:
Assumption 4.1 (Regularity conditions on R, Φ, V).
Φ and V are differentiable as well as upper bounded and Lipschitz on the unit sphere. R is Lipschitz and its Hessian has bounded operator norm.
We provide more details on the specific parameters (for boundedness, Lipschitzness, etc.) in Section E.1. We note that relu networks satisfy every condition but differentiability of Φ. (The relu activation is non-differentiable at 0 and hence the gradient flow is not well-defined; Chizat & Bach (2018) acknowledge this same difficulty with relu.) We can fit a neural network under our framework as follows:
Example 4.2 (Logistic loss for neural networks).
We interpret ρ as a distribution over the parameters of single hidden units. Let θ denote the parameters of one unit, Φ(θ) collect its activations on the n training examples, and V(θ) be proportional to ‖θ‖₂². In this case, ⟨Φ, ρ⟩ is a distributional neural network that computes an output for each of the n training examples (like a standard neural network, it also computes a weighted sum over hidden units). We can compute the distributional version of the regularized logistic loss in equation 3.1 by setting R to be the average logistic loss over these n outputs and V to be the regularizer.
We will define the functional derivative L′[ρ] and the induced velocity field v[ρ] ≜ −∇L′[ρ]. Informally, L′[ρ] is the gradient of L with respect to ρ, and v is the induced velocity field on the hidden units. For the standard Wasserstein gradient flow dynamics, ρ_t evolves according to

d/dt ρ_t = −∇·(v[ρ_t] ρ_t),

where ∇· denotes the divergence of a vector field. For neural networks, these dynamics formally define continuous-time gradient descent when the hidden layer has infinite size (see Theorem 2.6 of Chizat & Bach (2018), for instance).
We propose the following modification of the Wasserstein gradient flow dynamics:

d/dt ρ_t = −∇·(v[ρ_t] ρ_t) + σ(U − ρ_t),   (4.2)

where U is the uniform distribution on the unit sphere and σ > 0 is a small mixing rate. In our perturbed dynamics, we add uniform noise over the sphere. For infinite-size neural networks, one can informally interpret this as re-initializing a very small fraction of the neurons at every step of gradient descent. We prove convergence to a global optimizer in time polynomial in the dimension d, the accuracy 1/ε, and the regularity parameters.
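A finite-particle caricature of the perturbed dynamics (our own discretization with made-up hyperparameters, not the paper's algorithm): N particles follow the velocity field of the distributional logistic loss from Example 4.2, and at every step a small random fraction is re-initialized uniformly on the sphere, mimicking the mixing with U.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny 1-D dataset for the distributional two-layer net of Example 4.2.
x = np.array([1.0, -1.0])
y = np.array([1.0, -1.0])
lam, lr, sigma, N = 1e-3, 0.2, 0.005, 200

def resample(k):
    # fresh particles uniform on the unit circle in (w, a) space
    ang = rng.uniform(0.0, 2 * np.pi, size=k)
    return np.cos(ang), np.sin(ang)

w, a = resample(N)   # particle parameters theta_j = (w_j, a_j)

def data_loss(w, a):
    f = (a @ np.maximum(np.outer(w, x), 0.0)) / N  # f(x_i) under rho
    return np.mean(np.log1p(np.exp(-y * f)))

init = data_loss(w, a)
for _ in range(2000):
    H = np.maximum(np.outer(w, x), 0.0)            # H[j, i] = relu(w_j x_i)
    f = (a @ H) / N
    g = -y / (1.0 + np.exp(y * f)) / len(y)        # d(loss)/d f_i
    # per-particle velocity = -gradient of the functional derivative
    grad_a = H @ g + 2 * lam * a
    grad_w = ((a[:, None] * (H > 0) * x[None, :]) @ g) + 2 * lam * w
    a -= lr * grad_a
    w -= lr * grad_w
    mask = rng.random(N) < sigma                   # sigma-mixing with U
    w[mask], a[mask] = resample(int(mask.sum()))
# The data loss drops well below its initial value of about log(2).
```

Despite the constant injection of fresh particles, the descent term dominates and the loss decreases; in the analysis, the injected mass is what lets the flow escape bad regions.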
Theorem 4.3 (Theorem E.4 with regularity parameters omitted).
Suppose that Φ and V are 2-homogeneous and the regularity conditions of Assumption 4.1 are satisfied. Also assume that from starting distribution ρ_0, a solution to the dynamics in equation 4.2 exists. Let ε > 0 be a desired error threshold, and choose the noise level σ and running time accordingly, where the regularity parameters for R, Φ, and V are hidden in the poly(·) notation. Then, perturbed Wasserstein gradient flow converges to an ε-approximate global minimum in time polynomial in d and 1/ε. A variant of this result holds when we additionally assume that Φ and V have Lipschitz gradients; in that setting, a polynomial-time convergence result also holds, which we state in Section E.3.
We first verify the normalized margin convergence on a two-layer network with one-dimensional input. A single hidden unit computes a_j φ(w_j x + b_j), where φ is the relu activation. We add ℓ2-regularization to all parameters and compare the resulting normalized margin to that of an approximate solution of the SVM problem with features φ(wx + b) over all unit-norm (w, b). Writing this infinite feature vector is intractable, so we solve an approximate version by choosing 1000 evenly spaced values of (w, b) on the unit circle. Our theory predicts that with decreasing regularization, the margin of the neural network converges to the SVM objective. In Figure 2, we plot this margin convergence and visualize the final networks and ground-truth labels. The network margin approaches the ideal one as λ → 0, and the visualization shows that the network and SVM functions are extremely similar.
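The discretized 1-norm SVM described above is a small linear program. Here is a sketch of how one might set it up (scipy-based, with our own toy labels and a coarser grid than the experiment's 1000 points):

```python
import numpy as np
from scipy.optimize import linprog

relu = lambda z: np.maximum(z, 0.0)

# 1-D dataset that no single threshold separates.
x = np.array([-2.0, -1.0, 1.0, 2.0])
y = np.array([1.0, -1.0, -1.0, 1.0])

# Discretize the unit circle of hidden-unit parameters (w, b).
K = 64
theta = np.linspace(0.0, 2 * np.pi, K, endpoint=False)
Phi = relu(np.outer(x, np.cos(theta)) + np.sin(theta)[None, :])  # (n, K)

# 1-norm SVM:  max t  s.t.  y_i <alpha, Phi_i> >= t,  ||alpha||_1 <= 1.
# Split alpha = p - q with p, q >= 0; variables z = [p, q, t].
n = len(x)
c = np.zeros(2 * K + 1)
c[-1] = -1.0                                        # linprog minimizes, so max t
A_margin = np.hstack([-y[:, None] * Phi, y[:, None] * Phi, np.ones((n, 1))])
A_norm = np.concatenate([np.ones(2 * K), [0.0]])[None, :]
res = linprog(c, A_ub=np.vstack([A_margin, A_norm]),
              b_ub=np.concatenate([np.zeros(n), [1.0]]),
              bounds=[(0, None)] * (2 * K) + [(None, None)],
              method="highs")
margin = res.x[-1]   # positive: the relu features separate this data
```

Refining the grid gives a tighter approximation to the true lifted-SVM margin, which is the quantity the weakly-regularized network margin converges to.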
Next, we experiment on synthetic data in a higher-dimensional setting. For classification and regression, we compare the generalization error and predicted generalization upper bounds (we compute the leading term that is linear in the norm or inverse margin; from Theorem 3.1 and Lemmas 3.4, D.4, and D.5) of a trained neural network against a kernel SVM with relu features as we vary the number of samples n. For classification we plot 0-1 error, whereas for regression we plot squared error. Our ground truth comes from a random neural network with 6 hidden units. For classification, we used rejection sampling to obtain datapoints with unnormalized margin of at least 0.1 on the ground-truth network. We use a fixed input dimension. For all experiments, we train the network for 20000 steps with a small regularizer and average over 100 trials for each plot point.
The plots in Figure 3 show that two-layer networks clearly outperform kernel methods in test error as n grows. However, there seems to be looseness in the upper bounds for kernel methods: the kernel generalization bound appears to stay constant with n (as predicted by our theory for regression), but the kernel test error decreases. There is also some variance in the neural network generalization bound for classification. This occurred likely because we did not tune the learning rate and training time, so the optimization failed to find the best margin.
In Section F, we include additional experiments training modified WideResNet architectures on CIFAR10 and CIFAR100. Although ResNet is not homogeneous, we still report interesting increases in generalization performance from annealing the weight decay during training, versus staying at a fixed decay rate.
We have made the case that maximizing margin is one of the inductive biases of relu networks trained with cross-entropy loss. We show that we can obtain a maximum normalized margin by training with a weak regularizer. We also prove that a larger normalized margin implies better generalization for two-layer nets. Our work leaves open the question of how the normalized margin relates to generalization in much deeper neural networks; this is a fascinating theoretical and empirical question for future work. On the optimization side, we make progress towards understanding over-parametrized gradient descent by analyzing infinite-size neural networks. A natural direction for future work is to apply our theory to optimize the margin of finite-sized neural networks.
JDL acknowledges support of the ARO under MURI Award W911NF-11-1-0303. This is part of the collaboration between US DOD, UK MOD and UK Engineering and Physical Research Council (EPSRC) under the Multidisciplinary University Research Initiative. We also thank Nati Srebro and Suriya Gunasekar for helpful discussions in the early stage of this work.
- Arora et al. (2018a) Sanjeev Arora, Nadav Cohen, and Elad Hazan. On the optimization of deep networks: Implicit acceleration by overparameterization. arXiv preprint arXiv:1802.06509, 2018a.
- Arora et al. (2018b) Sanjeev Arora, Rong Ge, Behnam Neyshabur, and Yi Zhang. Stronger generalization bounds for deep nets via a compression approach. arXiv preprint arXiv:1802.05296, 2018b.
- Ball (1997) Keith Ball. An elementary introduction to modern convex geometry. Flavors of geometry, 31:1–58, 1997.
- Bartlett & Mendelson (2002) Peter L Bartlett and Shahar Mendelson. Rademacher and gaussian complexities: Risk bounds and structural results. Journal of Machine Learning Research, 3(Nov):463–482, 2002.
- Bartlett et al. (2017) Peter L Bartlett, Dylan J Foster, and Matus J Telgarsky. Spectrally-normalized margin bounds for neural networks. In Advances in Neural Information Processing Systems, pp. 6240–6249, 2017.
- Belkin et al. (2018) Mikhail Belkin, Siyuan Ma, and Soumik Mandal. To understand deep learning we need to understand kernel learning. arXiv preprint arXiv:1802.01396, 2018.
- Brutzkus et al. (2017) Alon Brutzkus, Amir Globerson, Eran Malach, and Shai Shalev-Shwartz. Sgd learns over-parameterized networks that provably generalize on linearly separable data. arXiv preprint arXiv:1710.10174, 2017.
- Chaudhari et al. (2016) Pratik Chaudhari, Anna Choromanska, Stefano Soatto, Yann LeCun, Carlo Baldassi, Christian Borgs, Jennifer Chayes, Levent Sagun, and Riccardo Zecchina. Entropy-sgd: Biasing gradient descent into wide valleys. arXiv preprint arXiv:1611.01838, 2016.
- Chizat & Bach (2018) Lenaic Chizat and Francis Bach. On the global convergence of gradient descent for over-parameterized models using optimal transport. arXiv preprint arXiv:1805.09545, 2018.
- Du & Lee (2018) Simon S Du and Jason D Lee. On the power of over-parametrization in neural networks with quadratic activation. arXiv preprint arXiv:1803.01206, 2018.
- Du et al. (2017) Simon S Du, Jason D Lee, Yuandong Tian, Barnabas Poczos, and Aarti Singh. Gradient descent learns one-hidden-layer cnn: Don’t be afraid of spurious local minima. arXiv preprint arXiv:1712.00779, 2017.
- Dziugaite & Roy (2017) Gintare Karolina Dziugaite and Daniel M Roy. Computing nonvacuous generalization bounds for deep (stochastic) neural networks with many more parameters than training data. arXiv preprint arXiv:1703.11008, 2017.
- Golowich et al. (2017) Noah Golowich, Alexander Rakhlin, and Ohad Shamir. Size-independent sample complexity of neural networks. arXiv preprint arXiv:1712.06541, 2017.
- Gunasekar et al. (2017) Suriya Gunasekar, Blake E Woodworth, Srinadh Bhojanapalli, Behnam Neyshabur, and Nati Srebro. Implicit regularization in matrix factorization. In Advances in Neural Information Processing Systems, pp. 6151–6159, 2017.
- Gunasekar et al. (2018a) Suriya Gunasekar, Jason Lee, Daniel Soudry, and Nathan Srebro. Characterizing implicit bias in terms of optimization geometry. arXiv preprint arXiv:1802.08246, 2018a.
- Gunasekar et al. (2018b) Suriya Gunasekar, Jason Lee, Daniel Soudry, and Nathan Srebro. Implicit bias of gradient descent on linear convolutional networks. arXiv preprint arXiv:1806.00468, 2018b.
- Haeffele & Vidal (2015) Benjamin D Haeffele and René Vidal. Global optimality in tensor factorization, deep learning, and beyond. arXiv preprint arXiv:1506.07540, 2015.
- Hardt et al. (2015) Moritz Hardt, Benjamin Recht, and Yoram Singer. Train faster, generalize better: Stability of stochastic gradient descent. arXiv preprint arXiv:1509.01240, 2015.
- Ji & Telgarsky (2018) Ziwei Ji and Matus Telgarsky. Risk and parameter convergence of logistic regression. arXiv preprint arXiv:1803.07300, 2018.
- Kakade et al. (2009) Sham M Kakade, Karthik Sridharan, and Ambuj Tewari. On the complexity of linear prediction: Risk bounds, margin bounds, and regularization. In Advances in neural information processing systems, pp. 793–800, 2009.
- Koltchinskii et al. (2002) Vladimir Koltchinskii, Dmitry Panchenko, et al. Empirical margin distributions and bounding the generalization error of combined classifiers. The Annals of Statistics, 30(1):1–50, 2002.
- Krizhevsky et al. (2012) Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems, pp. 1097–1105, 2012.
- Li et al. (2018) Yuanzhi Li, Tengyu Ma, and Hongyang Zhang. Algorithmic regularization in over-parameterized matrix sensing and neural networks with quadratic activations. In Conference On Learning Theory, pp. 2–47, 2018.
- Liang & Rakhlin (2018) Tengyuan Liang and Alexander Rakhlin. Just interpolate: Kernel "ridgeless" regression can generalize. arXiv preprint, August 2018.
- Livni et al. (2014) Roi Livni, Shai Shalev-Shwartz, and Ohad Shamir. On the computational efficiency of training neural networks. In Advances in Neural Information Processing Systems, pp. 855–863, 2014.
- Ma et al. (2017) Cong Ma, Kaizheng Wang, Yuejie Chi, and Yuxin Chen. Implicit regularization in nonconvex statistical estimation: Gradient descent converges linearly for phase retrieval, matrix completion and blind deconvolution. arXiv preprint arXiv:1711.10467, 2017.
- Mei et al. (2018) Song Mei, Andrea Montanari, and Phan-Minh Nguyen. A mean field view of the landscape of two-layers neural networks. arXiv preprint arXiv:1804.06561, 2018.
- Morcos et al. (2018) Ari S Morcos, David GT Barrett, Neil C Rabinowitz, and Matthew Botvinick. On the importance of single directions for generalization. arXiv preprint arXiv:1803.06959, 2018.
- Nacson et al. (2018) Mor Shpigel Nacson, Jason Lee, Suriya Gunasekar, Nathan Srebro, and Daniel Soudry. Convergence of gradient descent on separable data. arXiv preprint arXiv:1803.01905, 2018.
- Neyshabur et al. (2014) Behnam Neyshabur, Ryota Tomioka, and Nathan Srebro. In search of the real inductive bias: On the role of implicit regularization in deep learning. arXiv preprint arXiv:1412.6614, 2014.
- Neyshabur et al. (2015a) Behnam Neyshabur, Ruslan R Salakhutdinov, and Nati Srebro. Path-sgd: Path-normalized optimization in deep neural networks. In Advances in Neural Information Processing Systems, pp. 2422–2430, 2015a.
- Neyshabur et al. (2015b) Behnam Neyshabur, Ryota Tomioka, and Nathan Srebro. Norm-based capacity control in neural networks. In Conference on Learning Theory, pp. 1376–1401, 2015b.
- Neyshabur et al. (2017a) Behnam Neyshabur, Srinadh Bhojanapalli, David McAllester, and Nathan Srebro. A pac-bayesian approach to spectrally-normalized margin bounds for neural networks. arXiv preprint arXiv:1707.09564, 2017a.
- Neyshabur et al. (2017b) Behnam Neyshabur, Srinadh Bhojanapalli, David McAllester, and Nati Srebro. Exploring generalization in deep learning. In Advances in Neural Information Processing Systems, pp. 5947–5956, 2017b.
- Neyshabur et al. (2018) Behnam Neyshabur, Zhiyuan Li, Srinadh Bhojanapalli, Yann LeCun, and Nathan Srebro. Towards understanding the role of over-parametrization in generalization of neural networks. arXiv preprint arXiv:1805.12076, 2018.
- Nguyen & Hein (2017) Quynh Nguyen and Matthias Hein. The loss surface of deep and wide neural networks. arXiv preprint arXiv:1704.08045, 2017.
- Rosset et al. (2004a) Saharon Rosset, Ji Zhu, and Trevor Hastie. Boosting as a regularized path to a maximum margin classifier. Journal of Machine Learning Research, 5(Aug):941–973, 2004a.
- Rosset et al. (2004b) Saharon Rosset, Ji Zhu, and Trevor J Hastie. Margin maximizing loss functions. In Advances in neural information processing systems, pp. 1237–1244, 2004b.
- Rosset et al. (2007) Saharon Rosset, Grzegorz Swirszcz, Nathan Srebro, and Ji Zhu. ℓ1 regularization in infinite dimensional feature spaces. In International Conference on Computational Learning Theory, pp. 544–558. Springer, 2007.
- Safran & Shamir (2016) Itay Safran and Ohad Shamir. On the quality of the initial basin in overspecified neural networks. In International Conference on Machine Learning, pp. 774–782, 2016.
- Sagun et al. (2017) Levent Sagun, Utku Evci, V Ugur Guney, Yann Dauphin, and Leon Bottou. Empirical analysis of the hessian of over-parametrized neural networks. arXiv preprint arXiv:1706.04454, 2017.
- Sirignano & Spiliopoulos (2018) Justin Sirignano and Konstantinos Spiliopoulos. Mean field analysis of neural networks. arXiv preprint arXiv:1805.01053, 2018.
- Soudry & Carmon (2016) Daniel Soudry and Yair Carmon. No bad local minima: Data independent training error guarantees for multilayer neural networks. arXiv preprint arXiv:1605.08361, 2016.
- Soudry et al. (2018) Daniel Soudry, Elad Hoffer, and Nathan Srebro. The implicit bias of gradient descent on separable data. In International Conference on Learning Representations, 2018. URL https://openreview.net/forum?id=r1q7n9gAb.
- Tibshirani (2013) Ryan J Tibshirani. The lasso problem and uniqueness. Electronic Journal of Statistics, 7:1456–1490, 2013.
- van Laarhoven (2017) Twan van Laarhoven. L2 regularization versus batch and weight normalization. arXiv preprint arXiv:1706.05350, 2017.
- Venturi et al. (2018) Luca Venturi, Afonso Bandeira, and Joan Bruna. Neural networks with finite intrinsic dimension have no spurious valleys. arXiv preprint arXiv:1802.06384, 2018.
- Zagoruyko & Komodakis (2016) Sergey Zagoruyko and Nikos Komodakis. Wide residual networks. arXiv preprint arXiv:1605.07146, 2016.
- Zhang et al. (2016) Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, and Oriol Vinyals. Understanding deep learning requires rethinking generalization. arXiv preprint arXiv:1611.03530, 2016.
- Zhu et al. (2004) Ji Zhu, Saharon Rosset, Robert Tibshirani, and Trevor J Hastie. 1-norm support vector machines. In Advances in neural information processing systems, pp. 49–56, 2004.
Appendix A Missing Proofs in Section 2
We first show that the regularized loss L_λ does indeed have a global minimizer.

We will argue in the setting of Theorem 2.1, where ℓ is the multi-class cross-entropy loss, because the logistic loss case is analogous. We first note that L_λ is continuous in Θ, because f(Θ; x) is continuous in Θ and the term inside the logarithm is always positive. Next, define M := L_λ(0). Then we note that for ‖Θ‖ > (M/λ)^{1/r}, we must have L_λ(Θ) ≥ λ‖Θ‖^r > M. It follows that inf_Θ L_λ(Θ) = inf_{‖Θ‖ ≤ (M/λ)^{1/r}} L_λ(Θ). However, there must be a value Θ_λ which attains this infimum, because {Θ : ‖Θ‖ ≤ (M/λ)^{1/r}} is a compact set and L_λ is continuous. Thus, inf_Θ L_λ(Θ) is attained by some Θ_λ. ∎
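The radius bound in this compactness argument is easy to check numerically. Below is a minimal sketch; the linear predictor f(Θ; x) = Θᵀx, the two-point dataset, and the exponent r = 2 are illustrative assumptions of ours, not part of the setting above. It verifies that every point on the sphere of radius (L_λ(0)/λ)^{1/r} has loss strictly greater than L_λ(0), so the infimum is attained inside the ball.

```python
import numpy as np

# Toy separable data: two points in R^2 (an illustrative assumption,
# not the paper's setup).
X = np.array([[1.0, 1.0], [-1.0, 0.0]])
y = np.array([1.0, -1.0])
lam, r = 0.1, 2  # regularization strength and exponent (r = 2 assumed)

def loss(theta):
    """L_lam(theta) = lam * ||theta||^r + sum_i log(1 + exp(-y_i <theta, x_i>))."""
    margins = y * (X @ theta)
    return lam * np.linalg.norm(theta) ** r + np.sum(np.logaddexp(0.0, -margins))

M = loss(np.zeros(2))         # L_lam(0)
R = (M / lam) ** (1.0 / r)    # radius (M / lam)^{1/r} from the proof

# On the sphere of radius R, the regularizer alone contributes lam * R^r = M,
# and the (strictly positive) loss term pushes L_lam strictly above L_lam(0).
angles = np.linspace(0, 2 * np.pi, 360, endpoint=False)
boundary = np.all([loss(R * np.array([np.cos(t), np.sin(t)])) > M for t in angles])
print(boundary)  # every boundary point exceeds L_lam(0)
```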
A.1 Missing Proofs for the Multi-class Setting
Towards proving Theorem 2.1, we first show that as we decrease λ, the norm of the solution Θ_λ grows.

Lemma A.2. In the setting of Theorem 2.1, as λ → 0, we have ‖Θ_λ‖ → ∞.
To prove Theorem 2.1, we rely on the exponential scaling of the cross-entropy loss: L_λ(Θ_λ) can be lower bounded roughly by exp(−γ_λ‖Θ_λ‖^a), but also has an upper bound that scales with exp(−γ^*‖Θ_λ‖^a), where a is the degree of homogeneity of f. By Lemma A.2, we can take ‖Θ_λ‖ large so the gap between γ_λ and γ^* vanishes. This proof technique is inspired by that of Rosset et al. (2004a).
Proof of Theorem 2.1.
For any Θ with ‖Θ‖ ≤ 1 and any scaling c > 0, the cross-entropy loss of cΘ on example i is

ℓ(f(cΘ; x_i), y_i) = log(1 + Σ_{j ≠ y_i} exp(−c^a ((f(Θ; x_i))_{y_i} − (f(Θ; x_i))_j)))    (A.1)

(by the homogeneity of f). Since every logit gap (f(Θ; x_i))_{y_i} − (f(Θ; x_i))_j is at least the normalized margin γ(Θ), we can upper bound equation A.1 uniformly over the n examples and obtain

Σ_{i=1}^n ℓ(f(cΘ; x_i), y_i) ≤ n log(1 + (k − 1) exp(−c^a γ(Θ))).    (A.2)

We can also keep only the single example and label attaining the margin in order to lower bound equation A.1 and obtain

Σ_{i=1}^n ℓ(f(cΘ; x_i), y_i) ≥ log(1 + exp(−c^a γ(Θ))).    (A.3)

Applying equation A.2 with Θ = Θ^* (a unit-norm maximizer of the normalized margin, so γ(Θ^*) = γ^*) and c = ‖Θ_λ‖, and noting that L_λ(Θ_λ) ≤ L_λ(cΘ^*) by the optimality of Θ_λ, we have

L_λ(Θ_λ) ≤ λ‖Θ_λ‖^r + n log(1 + (k − 1) exp(−‖Θ_λ‖^a γ^*)).

Next we lower bound L_λ(Θ_λ) by applying equation A.3 with Θ = Θ_λ/‖Θ_λ‖ and c = ‖Θ_λ‖:

L_λ(Θ_λ) ≥ λ‖Θ_λ‖^r + log(1 + exp(−‖Θ_λ‖^a γ_λ)).

Combining the two displays and cancelling the common term λ‖Θ_λ‖^r gives

log(1 + exp(−‖Θ_λ‖^a γ_λ)) ≤ n log(1 + (k − 1) exp(−‖Θ_λ‖^a γ^*)).

Recall that by Lemma A.2, as λ → 0, we have ‖Θ_λ‖ → ∞. Therefore exp(−‖Θ_λ‖^a γ^*) → 0, and since the left-hand side above is bounded by the right-hand side, exp(−‖Θ_λ‖^a γ_λ) → 0 as well. Thus, we can apply the Taylor expansion log(1 + z) = z(1 + o(1)) to the inequality above and obtain

exp(−‖Θ_λ‖^a γ_λ) ≤ n(k − 1) exp(−‖Θ_λ‖^a γ^*)(1 + o(1)).

We claim this implies that liminf_{λ→0} γ_λ ≥ γ^*. If not, we have γ_λ ≤ γ^* − ε for some ε > 0 along a subsequence, which implies that the inequality above is violated once ‖Θ_λ‖ is sufficiently large (‖Θ_λ‖^a ≥ 2 log(2n(k − 1))/ε would suffice). By Lemma A.2, ‖Θ_λ‖ → ∞ as λ → 0, and therefore we get a contradiction.

Finally, we have γ_λ ≤ γ^* by the definition of γ^*. Hence, lim_{λ→0} γ_λ exists and equals γ^*. ∎
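The conclusions of Theorem 2.1 and Lemma A.2 can be observed numerically. The sketch below is our own illustration under simplifying assumptions not fixed by this appendix: a linear predictor f(θ; x) = θᵀx (which is 1-homogeneous), the regularizer λ‖θ‖², and a tiny separable dataset whose maximum normalized margin is γ^* = 1, attained by θ = (1, 0). It minimizes L_λ by gradient descent for decreasing λ; the norm of the minimizer grows and its normalized margin rises toward γ^*.

```python
import numpy as np

# Tiny separable dataset (assumed for illustration). For these points the
# maximum normalized margin is gamma* = 1, attained by theta = (1, 0).
X = np.array([[1.0, 1.0], [-1.0, 0.0]])
y = np.array([1.0, -1.0])

def minimize(lam, steps=100_000, lr=0.5):
    """Gradient descent on L_lam(theta) = lam*||theta||^2 + sum_i log(1 + exp(-y_i <theta, x_i>))."""
    theta = np.zeros(2)
    for _ in range(steps):
        m = y * (X @ theta)              # per-example (unnormalized) margins
        p = 1.0 / (1.0 + np.exp(m))      # sigmoid(-m): per-example loss gradient factor
        theta -= lr * (2 * lam * theta - (p * y) @ X)
    return theta

margins, norms = [], []
for lam in [1e-1, 1e-3, 1e-5]:
    theta = minimize(lam)
    norms.append(np.linalg.norm(theta))
    margins.append(np.min(y * (X @ theta)) / np.linalg.norm(theta))

print(margins, norms)  # normalized margin rises toward gamma* = 1; norms grow
```

Shrinking λ further continues this trend, consistent with Lemma A.2 (‖Θ_λ‖ → ∞) and the γ_λ → γ^* limit above.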
Now we fill in the proof of Lemma A.2.
Proof of Lemma A.2.
For the sake of contradiction, we assume that there exists a constant C > 0 such that for any λ_0 > 0, there exists λ < λ_0 with ‖Θ_λ‖ ≤ C. We will determine the choice of λ later and pick it such that ‖Θ_λ‖ ≤ C. Then the logits (the prediction before the softmax) are bounded in absolute value by some constant (that depends on C), and therefore the loss function for every example is bounded from below by some constant b > 0 (depending on C but not λ).

Let Θ^* be a unit-norm maximizer of the normalized margin, and let c > 0 be large enough that n log(1 + (k − 1) exp(−c^a γ^*)) < b/2; such a c exists because γ^* > 0. Then we have that

L_λ(Θ_λ) ≤ L_λ(cΘ^*) ≤ λc^r + n log(1 + (k − 1) exp(−c^a γ^*)) < λc^r + b/2    (by the optimality of Θ_λ).

On the other hand, L_λ(Θ_λ) ≥ b, because the loss on every example is at least b. Taking a sufficiently small λ (so that λc^r < b/2), we obtain b < b, a contradiction, and complete the proof. ∎
A.2 Full Binary Classification Setting
For completeness, we state and prove our max-margin results for the setting where we fit binary labels y_i ∈ {−1, 1} (as opposed to indices in [k]), redefining f to assign a single real-valued score (as opposed to a score for each label). This lets us work with the simpler λ-regularized logistic loss:

L_λ(Θ) := λ‖Θ‖^r + Σ_{i=1}^n log(1 + exp(−y_i f(Θ; x_i))).

As before, let Θ_λ denote a global minimizer of L_λ, and define the normalized margin by γ_λ := min_i y_i f(Θ_λ/‖Θ_λ‖; x_i). Define the maximum possible normalized margin

γ^* := max_{‖Θ‖ ≤ 1} min_i y_i f(Θ; x_i).
Theorem A.3. Assume γ^* > 0 in the binary classification setting with logistic loss. Then as λ → 0, γ_λ → γ^*.
The proof follows via a simple reduction to the multi-class case.
Proof of Theorem a.3.
We prove this theorem via reduction to the multi-class case with k = 2. Construct a two-output network g = (g_1, g_2) with g_1(Θ; x) := f(Θ; x)/2 and g_2(Θ; x) := −f(Θ; x)/2; note that g inherits the homogeneity of f. Define new labels z_i := 1 if y_i = 1 and z_i := 2 if y_i = −1. Now note that g_{z_i}(Θ; x_i) − g_{3−z_i}(Θ; x_i) = y_i f(Θ; x_i), so the multi-class margin for g under the labels z is the same as the binary margin for f under the labels y. Furthermore, defining the multi-class objective

L̂_λ(Θ) := λ‖Θ‖^r + Σ_{i=1}^n log(1 + exp(g_{3−z_i}(Θ; x_i) − g_{z_i}(Θ; x_i))),

we get that L̂_λ(Θ) = L_λ(Θ), and in particular, L̂_λ and L_λ have the same set of minimizers. Therefore we can apply Theorem 2.1 for the multi-class setting and conclude that γ_λ → γ^* in the binary classification setting. ∎
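The identity behind this reduction, that the two-class cross-entropy with logits (f/2, −f/2) equals the logistic loss log(1 + exp(−y f)), can be sanity-checked numerically; the sketch below is our own illustration on random scores.

```python
import numpy as np

rng = np.random.default_rng(0)
scores = rng.normal(size=100)             # arbitrary real-valued scores f(theta; x)
labels = rng.choice([-1.0, 1.0], size=100)

# Binary logistic loss: log(1 + exp(-y * f)), computed stably.
logistic = np.logaddexp(0.0, -labels * scores)

# Two-class cross-entropy with logits (f/2, -f/2); class index 0 <-> y = +1,
# class index 1 <-> y = -1.
logits = np.stack([scores / 2, -scores / 2], axis=1)
target = (labels < 0).astype(int)
log_norm = np.logaddexp(logits[:, 0], logits[:, 1])   # log(e^{z_1} + e^{z_2})
cross_entropy = log_norm - logits[np.arange(100), target]

print(np.max(np.abs(logistic - cross_entropy)))  # agrees up to floating-point error
```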