Finite sample expressive power of small-width ReLU networks

10/17/2018
by Chulhee Yun, et al.

We study the universal finite-sample expressivity of neural networks, defined as the ability to perfectly memorize arbitrary datasets. For scalar outputs, existing results require a hidden layer as wide as N to memorize N data points. In contrast, we prove that a 3-layer (2-hidden-layer) ReLU network with 4√N hidden nodes can perfectly fit any dataset of N points. For K-class classification, we prove that a 4-layer ReLU network with 4√N + 4K hidden neurons can memorize arbitrary datasets. For example, a 4-layer ReLU network with only 8,000 hidden nodes can memorize datasets with N = 1M and K = 1k (e.g., ImageNet). Our results show that even small networks have tremendous overfitting capability, admitting zero empirical risk for any dataset. We also extend our results to deeper and narrower networks, and prove converse results showing the necessity of Ω(N) parameters for shallow networks.
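The 8,000-node figure follows directly from the stated counts: with N = 1M and K = 1k, 4√N + 4K = 4,000 + 4,000. The short Python sketch below only reproduces that arithmetic. The function names are hypothetical, and rounding up when √N is not an integer is an assumption made here for illustration, not something the abstract states.

```python
import math


def hidden_nodes_scalar(n):
    """Hidden-node count 4*sqrt(N) quoted for the 3-layer, scalar-output case.

    Rounding up for non-square N is an assumption of this sketch.
    """
    return math.ceil(4 * math.sqrt(n))


def hidden_nodes_classification(n, k):
    """Hidden-node count 4*sqrt(N) + 4K quoted for the 4-layer, K-class case."""
    return hidden_nodes_scalar(n) + 4 * k


if __name__ == "__main__":
    n, k = 1_000_000, 1_000  # N = 1M data points, K = 1k classes
    print("scalar output, 3-layer:", hidden_nodes_scalar(n))
    print("K-class, 4-layer:", hidden_nodes_classification(n, k))
    print("width needed by the earlier single-hidden-layer results:", n)
```

For comparison, the earlier results cited in the abstract would need a hidden layer of width N = 1,000,000 for the same number of points.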


