Memory capacity of two layer neural networks with smooth activations

08/03/2023
by Liam Madden et al.

Determining the memory capacity of two-layer neural networks with m hidden neurons and input dimension d (i.e., md + m total trainable parameters), that is, the largest size of general data the network can memorize, is a fundamental machine-learning question. For non-polynomial real analytic activation functions, such as sigmoids and smoothed rectified linear units (smoothed ReLUs), we establish a lower bound of md/2 on the memory capacity and show it is optimal up to a factor of approximately 2. Analogous prior results were limited to Heaviside and ReLU activations, while results for smooth activations suffered from logarithmic factors and required random data. To analyze the memory capacity, we examine the rank of the network's Jacobian by computing the rank of matrices involving both Hadamard powers and the Khatri-Rao product. Our computation extends classical linear-algebraic facts about the rank of Hadamard powers. Overall, our approach differs from previous works on memory capacity and holds promise for extending to deeper models and other architectures.
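As a concrete illustration of the Jacobian-rank mechanism, here is a minimal numerical sketch (not code from the paper; it assumes a tanh activation and random Gaussian data and weights). It assembles the Jacobian of the n network outputs with respect to all md + m parameters and checks that it numerically attains the maximal rank min(n, md + m); note the Khatri-Rao structure of the first-layer block, where row i stacks the vectors v_j tanh'(w_j . x_i) x_i over the m neurons.

import numpy as np

rng = np.random.default_rng(0)
n, d, m = 40, 5, 10                 # n data points, input dimension d, m hidden neurons
X = rng.standard_normal((n, d))     # generic (random) data
W = rng.standard_normal((m, d))     # first-layer weights
v = rng.standard_normal(m)          # second-layer weights

# Two-layer network f(x) = sum_j v_j * tanh(w_j . x), with md + m parameters (W, v).
Z = X @ W.T          # (n, m) pre-activations
A = np.tanh(Z)       # hidden-layer activations
S = 1.0 - A**2       # tanh'(Z), since d/dz tanh(z) = 1 - tanh(z)^2

# Jacobian of the n outputs w.r.t. all parameters:
#   d f(x_i) / d W[j, k] = v_j * tanh'(w_j . x_i) * x_i[k]   (Khatri-Rao structure)
#   d f(x_i) / d v[j]    = tanh(w_j . x_i)
J_W = (S * v)[:, :, None] * X[:, None, :]            # (n, m, d)
J = np.concatenate([J_W.reshape(n, m * d), A], axis=1)

print(np.linalg.matrix_rank(J), min(n, m * d + m))   # both 40 for this generic draw

A full-rank Jacobian at generic parameters is exactly the property that lets implicit-function-theorem or gradient-based arguments fit up to rank-many labels, which is how a rank bound translates into a memory-capacity bound.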


