A Deep Conditioning Treatment of Neural Networks

02/04/2020
by   Naman Agarwal, et al.
0

We study the role of depth in training randomly initialized overparameterized neural networks. We give the first general result showing that depth improves trainability of neural networks by improving the conditioning of certain kernel matrices of the input data. This result holds for arbitrary non-linear activation functions, and we provide a characterization of the improvement in conditioning as a function of the degree of non-linearity and the depth of the network. We provide versions of the result that hold for training just the top layer of the neural network, as well as for training all layers, via the neural tangent kernel. As applications of these general results, we provide a generalization of the results of Das et al. (2019) showing that learnability of deep random neural networks with arbitrary non-linear activations (under mild assumptions) degrades exponentially with depth. Additionally, we show how benign overfitting can occur in deep neural networks via the results of Bartlett et al. (2019b).

READ FULL TEXT

page 1

page 2

page 3

page 4

research
06/04/2020

Overcoming Overfitting and Large Weight Update Problem in Linear Rectifiers: Thresholded Exponential Rectified Linear Units

In past few years, linear rectified unit activation functions have shown...
research
10/17/2017

Spontaneous Symmetry Breaking in Neural Networks

We propose a framework to understand the unprecedented performance and r...
research
02/25/2020

Exploring Learning Dynamics of DNNs via Layerwise Conditioning Analysis

Conditioning analysis uncovers the landscape of optimization objective b...
research
10/10/2018

Response to Comment on "All-optical machine learning using diffractive deep neural networks"

In their Comment, Wei et al. (arXiv:1809.08360v1 [cs.LG]) claim that our...
research
11/26/2022

Why Neural Networks Work

We argue that many properties of fully-connected feedforward neural netw...
research
06/17/2021

Layer Folding: Neural Network Depth Reduction using Activation Linearization

Despite the increasing prevalence of deep neural networks, their applica...
research
02/22/2019

Capacity allocation through neural network layers

Capacity analysis has been recently introduced as a way to analyze how l...

Please sign up or login with your details

Forgot password? Click here to reset