Feature Space Saturation during Training

06/15/2020
by Justin Shenk, et al.

We propose layer saturation, a simple, online-computable method for analyzing information processing in neural networks. First, we show that a layer's output can be restricted to the eigenspace of its variance matrix without performance loss. We then propose a computationally lightweight method for approximating the variance matrix during training. From the dimension of this lossless eigenspace we derive layer saturation: the ratio between the eigenspace dimension and the layer width. We show that saturation seems to indicate which layers contribute to network performance. We demonstrate how to alter layer saturation in a neural network by changing network depth, filter sizes, and input resolution. Furthermore, we show that a well-chosen input resolution increases network performance by distributing the inference process more evenly across the network.
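The core quantity can be sketched in a few lines: estimate the variance (covariance) matrix of a layer's outputs, find the smallest eigenspace retaining a chosen share of the variance, and divide its dimension by the layer width. The variance threshold `delta` below is an illustrative assumption, not the paper's exact criterion, and `layer_saturation` is a hypothetical helper name.

```python
import numpy as np

def layer_saturation(activations, delta=0.99):
    """Approximate layer saturation: the fraction of a layer's width
    needed to retain a `delta` share of the output variance.

    activations: (n_samples, layer_width) array of layer outputs.
    delta: variance share defining the "lossless" eigenspace
           (0.99 is an illustrative choice, not the paper's value).
    """
    # Variance (covariance) matrix of the layer's output features.
    cov = np.cov(activations, rowvar=False)
    # Eigenvalues in descending order (cov is symmetric, so eigvalsh applies).
    eigvals = np.sort(np.linalg.eigvalsh(cov))[::-1]
    # Smallest k such that the top-k eigenvalues explain `delta` of the variance.
    ratios = np.cumsum(eigvals) / np.sum(eigvals)
    k = int(np.searchsorted(ratios, delta) + 1)
    # Saturation = eigenspace dimension / layer width.
    return k / activations.shape[1]

# A layer whose outputs occupy only 2 of 10 dimensions is weakly saturated.
rng = np.random.default_rng(0)
low_rank = rng.normal(size=(1000, 2)) @ rng.normal(size=(2, 10))
print(layer_saturation(low_rank))  # -> 0.2 (2 of 10 dimensions suffice)
```

In an online training setting, `np.cov` over a stored activation batch would be replaced by the paper's running approximation of the variance matrix; the eigenvalue step is unchanged.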


Related research

05/24/2022
Randomly Initialized One-Layer Neural Networks Make Data Linearly Separable
Recently, neural networks have been shown to perform exceptionally well ...

03/24/2023
Online Learning for the Random Feature Model in the Student-Teacher Framework
Deep neural networks are widely used prediction algorithms whose perform...

06/09/2023
Hidden symmetries of ReLU networks
The parameter space for any fixed architecture of feedforward ReLU neura...

09/02/2018
On overcoming the Curse of Dimensionality in Neural Networks
Let H be a reproducing Kernel Hilbert space. For i=1,...,N, let x_i∈R^d ...

08/13/2023
Separable Gaussian Neural Networks: Structure, Analysis, and Function Approximations
The Gaussian-radial-basis function neural network (GRBFNN) has been a po...

07/13/2020
Probabilistic bounds on data sensitivity in deep rectifier networks
Neuron death is a complex phenomenon with implications for model trainab...

10/10/2022
Efficient NTK using Dimensionality Reduction
Recently, neural tangent kernel (NTK) has been used to explain the dynam...
