The capacity of feedforward neural networks

01/02/2019
by Pierre Baldi et al.

A long-standing open problem in the theory of neural networks is the development of quantitative methods to estimate and compare the capabilities of different architectures. Here we define the capacity of an architecture as the binary logarithm of the number of functions it can compute as the synaptic weights are varied. The capacity is an upper bound on the number of bits that can be "communicated" from the training data to the architecture over the learning channel. We study the capacity of layered, fully-connected architectures of linear threshold neurons with L layers of sizes n_1, n_2, …, n_L and show that, in essence, the capacity is given by a cubic polynomial in the layer sizes: C(n_1, …, n_L) = ∑_{k=1}^{L-1} min(n_1, …, n_k) n_k n_{k+1}. In proving the main result, we also develop new techniques (multiplexing, enrichment, and stacking) as well as new bounds on the capacity of finite sets. We use the main result to identify architectures with maximal or minimal capacity under a number of natural constraints. This leads to the notion of structural regularization for deep architectures. While in general, everything else being equal, shallow networks compute more functions than deep networks, the functions computed by deep networks are more regular and "interesting".
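For concreteness, here is a minimal Python sketch of the cubic capacity formula above. The `capacity` helper and the example layer sizes are illustrative choices, not from the paper, and the formula itself holds only up to the lower-order terms the abstract glosses over with "in essence".

```python
def capacity(layer_sizes):
    """Evaluate C(n_1, ..., n_L) = sum_{k=1}^{L-1} min(n_1, ..., n_k) * n_k * n_{k+1}.

    layer_sizes is [n_1, ..., n_L] for a layered, fully connected
    network of linear threshold neurons. This is a direct transcription
    of the paper's leading-order capacity formula.
    """
    C = 0
    running_min = layer_sizes[0]
    for k in range(len(layer_sizes) - 1):
        # min(n_1, ..., n_k) maintained incrementally as we walk the layers
        running_min = min(running_min, layer_sizes[k])
        C += running_min * layer_sizes[k] * layer_sizes[k + 1]
    return C


if __name__ == "__main__":
    # Same input width and a single output, shallow vs. deep
    # (hypothetical sizes, chosen only to illustrate the formula):
    print(capacity([4, 12, 1]))      # 4*4*12 + 4*12*1       = 240
    print(capacity([4, 4, 4, 4, 1])) # 3*(4*4*4) + 4*4*1     = 208
```

Consistent with the abstract's closing remark, the shallow architecture in this toy comparison has the larger capacity, i.e., it computes more functions for a comparable neuron budget.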

