An Exponential Improvement on the Memorization Capacity of Deep Threshold Networks

06/14/2021 ∙ by Shashank Rajput, et al.

It is well known that modern deep neural networks are powerful enough to memorize datasets even when the labels have been randomized. Recently, Vershynin (2020) settled a long-standing question posed by Baum (1988), proving that deep threshold networks can memorize n points in d dimensions using 𝒪(e^{1/δ^2} + √n) neurons and 𝒪(e^{1/δ^2}(d + √n) + n) weights, where δ is the minimum distance between the points. In this work, we improve the dependence on δ from exponential to almost linear, proving that 𝒪(1/δ + √n) neurons and 𝒪(d/δ + n) weights are sufficient. Our construction uses Gaussian random weights only in the first layer, while all subsequent layers use binary or integer weights. We also prove new lower bounds by connecting memorization in neural networks to the purely geometric problem of separating n points on a sphere using hyperplanes.
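To get a feel for the size of the improvement, the leading terms of the two sets of bounds can be compared numerically. The following is only a rough sketch: it assumes all hidden constants equal 1 and ignores any logarithmic factors, and the function names and the sample values of n, d, and δ are hypothetical choices made for illustration, not taken from the paper.

```python
import math

# Leading terms only; hidden constants are assumed to be 1 (an assumption).

def neurons_prior(delta, n):
    """Earlier bound on neurons: e^(1/delta^2) + sqrt(n)."""
    return math.exp(1.0 / delta**2) + math.sqrt(n)

def neurons_improved(delta, n):
    """Improved bound on neurons: 1/delta + sqrt(n)."""
    return 1.0 / delta + math.sqrt(n)

def weights_prior(delta, n, d):
    """Earlier bound on weights: e^(1/delta^2) * (d + sqrt(n)) + n."""
    return math.exp(1.0 / delta**2) * (d + math.sqrt(n)) + n

def weights_improved(delta, n, d):
    """Improved bound on weights: d/delta + n."""
    return d / delta + n

if __name__ == "__main__":
    n, d = 10_000, 100  # hypothetical dataset size and dimension
    for delta in (0.5, 0.25, 0.1):
        print(
            f"delta={delta}: "
            f"neurons {neurons_prior(delta, n):.3e} -> {neurons_improved(delta, n):.3e}, "
            f"weights {weights_prior(delta, n, d):.3e} -> {weights_improved(delta, n, d):.3e}"
        )
```

As δ shrinks, the e^{1/δ^2} term dominates the earlier bounds, while the improved bounds grow only like 1/δ and d/δ.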
