Memorization and Optimization in Deep Neural Networks with Minimum Over-parameterization

05/20/2022
by Simone Bombari, et al.

The Neural Tangent Kernel (NTK) has emerged as a powerful tool for providing memorization, optimization, and generalization guarantees for deep neural networks. A line of work has studied the NTK spectrum for two-layer and deep networks with at least one layer of Ω(N) neurons, N being the number of training samples. Furthermore, there is increasing evidence that deep networks with sub-linear layer widths are powerful memorizers and optimizers, as long as the number of parameters exceeds the number of samples. Thus, a natural open question is whether the NTK is well conditioned in this challenging sub-linear setup. In this paper, we answer the question in the affirmative. Our key technical contribution is a lower bound on the smallest NTK eigenvalue for deep networks with the minimum possible over-parameterization: the number of parameters is roughly Ω(N) and, hence, the number of neurons can be as few as Ω(√N). To showcase the applicability of our NTK bounds, we provide two results concerning memorization capacity and optimization guarantees for gradient descent training.
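The central object in these bounds can be illustrated numerically. Below is a minimal JAX sketch (not from the paper; the network sizes, the tanh activation, and all names are illustrative assumptions) that builds a small deep network whose width scales like √N, forms the empirical NTK Gram matrix from the Jacobian of the network outputs with respect to the parameters, and reports its smallest eigenvalue.

```python
import jax
import jax.numpy as jnp

# Illustrative sizes only: N training samples, width m ~ sqrt(N),
# so the parameter count p is on the order of N, matching the
# minimum over-parameterization regime discussed in the abstract.
N, d = 64, 10
m = int(N ** 0.5)  # sub-linear width, m = O(sqrt(N))

key = jax.random.PRNGKey(0)
k1, k2, k3, kx = jax.random.split(key, 4)
params = {
    "W1": jax.random.normal(k1, (m, d)) / jnp.sqrt(d),
    "W2": jax.random.normal(k2, (m, m)) / jnp.sqrt(m),
    "w3": jax.random.normal(k3, (m,)) / jnp.sqrt(m),
}
X = jax.random.normal(kx, (N, d))

def f(params, x):
    # Three-layer network; tanh is a stand-in for a smooth activation.
    h = jnp.tanh(params["W1"] @ x)
    h = jnp.tanh(params["W2"] @ h)
    return params["w3"] @ h

def flat_grad(x):
    # Gradient of the scalar output w.r.t. all parameters, flattened.
    g = jax.grad(f)(params, x)
    return jnp.concatenate([v.ravel() for v in jax.tree_util.tree_leaves(g)])

# Empirical NTK Gram matrix: K[i, j] = <grad f(x_i), grad f(x_j)>.
J = jax.vmap(flat_grad)(X)           # N x p Jacobian
K = J @ J.T                          # N x N kernel matrix
lam_min = jnp.linalg.eigvalsh(K)[0]  # eigenvalues in ascending order
print("parameters p =", J.shape[1], " samples N =", N)
print("smallest NTK eigenvalue:", float(lam_min))
```

In this toy setup p is only a small constant multiple of N, and the smallest eigenvalue staying bounded away from zero is exactly the well-conditioning property that the paper's lower bound establishes in this regime.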
