Neural networks have recently been applied to a number of diverse problems with impressive results (van den Oord et al., 2016; Silver et al., 2017; Berthelot et al., 2017). These breakthroughs largely appear to be driven by application rather than an understanding of the capabilities and training of neural networks. Recently, significant work has been done to increase understanding of neural networks (Choromanska et al., 2015; Haeffele & Vidal, 2015; Poole et al., 2016; Schoenholz et al., 2017; Zhang et al., 2016; Martin & Mahoney, 2017; Shwartz-Ziv & Tishby, 2017; Balduzzi et al., 2017; Raghu et al., 2017). However, there is still work to be done to bring theoretical understanding in line with the results seen in practice.
The connection between neural networks and kernel machines has long been studied (Neal, 1994). Much past work has investigated the equivalent kernel of certain neural networks, either experimentally (Burgess, 1997), through sampling (Sinha & Duchi, 2016; Livni et al., 2017; Lee et al., 2017), or analytically by assuming some random distribution over the weight parameters in the network (Williams, 1997; Cho & Saul, 2009; Pandey & Dukkipati, 2014a;b; Daniely et al., 2016; Bach, 2017a). One motivation for studying networks with random weights dates back to Neal (1992), which laid a strong mathematical foundation for a Bayesian approach to training networks. Another reason may be that some researchers hold the intuitive (but not necessarily principled) view that the Central Limit Theorem (CLT) should somehow apply.
In this work, we investigate the equivalent kernels for networks with Rectified Linear Unit (ReLU), Leaky ReLU (LReLU), or other activation functions, one hidden layer, and more general weight distributions. Our analysis carries over to deep networks. We investigate the consequences that weight initialization has on the equivalent kernel at the beginning of training. While initialization schemes that mitigate exploding/vanishing gradient problems (Hochreiter, 1991; Bengio et al., 1994; Hochreiter et al., 2001) for other activation function and weight distribution combinations have been explored in earlier works (Glorot & Bengio, 2010; He et al., 2015), we discuss an initialization scheme for Multi-Layer Perceptrons (MLPs) with LReLUs and weights coming from distributions with zero mean and finite absolute third moment. The derived kernels also allow us to analyze the loss of information as an input is propagated through the network, offering a complementary view to the shattered gradient problem (Balduzzi et al., 2017).
Consider a fully connected (FC) feedforward neural network with $d$ inputs and a hidden layer with $n$ neurons. Let $\sigma$ be the activation function of all the neurons in the hidden layer. Further assume that the biases are $0$, as is common when initializing neural network parameters. For any two inputs $\mathbf{x}, \mathbf{y} \in \mathbb{R}^d$ propagated through the network, the dot product in the hidden layer is

$\mathbf{h}(\mathbf{x}) \cdot \mathbf{h}(\mathbf{y}) = \sum_{i=1}^{n} \sigma(\mathbf{w}_i \cdot \mathbf{x})\, \sigma(\mathbf{w}_i \cdot \mathbf{y}), \qquad (1)$

where $\mathbf{h}(\mathbf{x})$ denotes the $n$-dimensional vector of hidden-layer activations and $\mathbf{w}_i$ is the weight vector into the $i^{th}$ neuron. Assuming an infinite number of hidden neurons, the sum in (1), suitably scaled by $1/n$, has an interpretation as an inner product in feature space, which corresponds to the kernel of a Hilbert space. We have

$k(\mathbf{x}, \mathbf{y}) = \int \sigma(\mathbf{w} \cdot \mathbf{x})\, \sigma(\mathbf{w} \cdot \mathbf{y})\, f(\mathbf{w})\, d\mathbf{w}, \qquad (2)$

where $f(\mathbf{w})$ is the probability density function (PDF) for the identically distributed weight vectors $\mathbf{w}_i$ in the network. The connection of (2) to the kernels in kernel machines is well-known (Neal, 1994; Williams, 1997; Cho & Saul, 2009).
Probabilistic bounds for the error between (1) and (2) have been derived in special cases (Rahimi & Recht, 2008) when the kernel is shift-invariant. Two specific random feature mappings are considered. (1) Random Fourier features are taken for $\sigma$ in (1). Calculating the approximation error in this way requires being able to sample from the PDF defined by the Fourier transform of the target kernel: the weight distribution $f(\mathbf{w})$ is the Fourier transform of the target kernel, and the features are some appropriate scale of $\cos(\mathbf{w} \cdot \mathbf{x} + b)$. (2) A random bit string is associated to each input according to a grid with random pitch imposed on the input space. This method requires having access to the second derivative of the target kernel to sample from the corresponding pitch distribution.
Other work (Bach, 2017b) has focused on the smallest error between a target function $g$ in the reproducing kernel Hilbert space (RKHS) defined by (2) and an approximate function $\hat{g}$ expressible by the RKHS with the kernel (1). More explicitly, let $\hat{g} = \sum_{i=1}^{n} c_i\, \sigma(\mathbf{w}_i \cdot \mathbf{x})$ be the representation of the approximation in the RKHS. The quantity $\|g - \hat{g}\|$ (with some suitable norm) is studied for the best set of coefficients $c_i$ and random weights $\mathbf{w}_i$ with an optimized distribution.
Yet another measure of kernel approximation error is investigated by Rudi & Rosasco (2017). Let $\hat{f}$ and $f^{*}$ be the optimal solutions to the ridge regression problem of minimizing a regularized cost function using the kernel (1) and the kernel (2), respectively. The number of random features required to probabilistically bound the excess risk of $\hat{f}$ relative to $f^{*}$ is found to be $O(\sqrt{N} \log N)$ for $N$ datapoints, under a suitable set of assumptions. This work notes the connection between kernel machines and one-layer Neural Networks with ReLU activations and Gaussian weights by citing Cho & Saul (2009). We extend this connection by considering other weight distributions and activation functions.
In this work our focus is on deriving expressions for the target kernel, not the approximation error. Additionally, we consider random mappings that have not been considered elsewhere. Our work is related to that of Poole et al. (2016) and Schoenholz et al. (2017). However, our results apply to the unbounded (L)ReLU activation function and more general weight distributions, whereas their work considers random biases as well as weights.
3 Equivalent Kernels for Infinite Width Hidden Layers
In particular, the equivalent kernel for a one-hidden layer network with spherical Gaussian weights of variance $\sigma_w^2$ and mean $0$ is the Arc-Cosine Kernel (Cho & Saul, 2009)

$k(\mathbf{x}, \mathbf{y}) = \frac{\sigma_w^2}{2\pi} \|\mathbf{x}\| \|\mathbf{y}\| \big( \sin\theta_0 + (\pi - \theta_0)\cos\theta_0 \big), \qquad (3)$
where $\theta_0$ is the angle between the inputs $\mathbf{x}$ and $\mathbf{y}$ and $\|\cdot\|$ denotes the Euclidean norm. Noticing that the Arc-Cosine Kernel depends on $\mathbf{x}$ and $\mathbf{y}$ only through their norms and the angle $\theta_0$, with an abuse of notation we will henceforth write $k(\theta_0)$. Define the normalized kernel

$\cos\theta_1 = \frac{k(\mathbf{x}, \mathbf{y})}{\sqrt{k(\mathbf{x}, \mathbf{x})}\sqrt{k(\mathbf{y}, \mathbf{y})}}$

to be the cosine similarity between the signals in the hidden layer. The normalized Arc-Cosine Kernel is given by

$\cos\theta_1 = \frac{1}{\pi} \big( \sin\theta_0 + (\pi - \theta_0)\cos\theta_0 \big),$

where $\theta_1$ is the angle between the signals in the first layer. Figure 1 shows a plot of the normalized Arc-Cosine Kernel.
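As a quick numerical sanity check (a sketch of ours, not from the paper; the function names are illustrative), the closed-form Arc-Cosine Kernel can be compared against the finite-width inner product (1) with ReLU activations and spherical Gaussian weights:

```python
import numpy as np

def arc_cosine_kernel(x, y, sigma_w=1.0):
    """Closed-form equivalent kernel (3) for ReLU with spherical Gaussian weights."""
    nx, ny = np.linalg.norm(x), np.linalg.norm(y)
    theta0 = np.arccos(np.clip(x @ y / (nx * ny), -1.0, 1.0))
    return sigma_w**2 / (2 * np.pi) * nx * ny * (
        np.sin(theta0) + (np.pi - theta0) * np.cos(theta0))

def empirical_kernel(x, y, n_hidden=200_000, sigma_w=1.0, seed=0):
    """Finite-width Monte Carlo estimate (1/n) h(x).h(y) of the sum in (1)."""
    rng = np.random.default_rng(seed)
    W = rng.normal(0.0, sigma_w, size=(n_hidden, x.size))
    relu = lambda t: np.maximum(t, 0.0)
    return relu(W @ x) @ relu(W @ y) / n_hidden

x = np.array([1.0, 0.5, -0.2])
y = np.array([0.3, -1.0, 0.8])
print(arc_cosine_kernel(x, y), empirical_kernel(x, y))  # the two agree closely
```

With 200,000 hidden units, the Monte Carlo estimate typically matches the closed form to two or three decimal places.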
One might ask how the equivalent kernel changes for a different choice of weight distribution. We investigate the equivalent kernel for networks with (L)ReLU activations and general weight distributions in Sections 3.1 and 3.2. The equivalent kernel can be composed and applied to deep networks. The kernel can also be used to choose good weights for initialization. These, as well as other implications for practical neural networks, are investigated in Section 5.
3.1 Kernels for Rotationally-Invariant Weights
In this section we show that (3) holds more generally than for the case where the weight distribution is Gaussian. Specifically, (3) holds whenever the weights follow any rotationally-invariant distribution. We do this by casting (2) as the solution to an ODE and then solving the ODE. We then extend this result, using the same technique, to the case where the activation is LReLU.
A rotationally-invariant PDF $f$ is one with the property $f(\mathbf{w}) = f(\mathbf{Q}\mathbf{w})$ for all $\mathbf{w} \in \mathbb{R}^d$ and orthogonal matrices $\mathbf{Q}$. Recall that the class of rotationally-invariant distributions (Bryc, 1995), as a subclass of elliptically contoured distributions (Johnson, 2013), includes the Gaussian distribution, the multivariate t-distribution, the symmetric multivariate Laplace distribution, and symmetric multivariate stable distributions.
Suppose we have a one-hidden layer feedforward network with ReLU activations and random weights with uncorrelated and identically distributed rows drawn from a rotationally-invariant PDF with finite marginal variance $\sigma_w^2$. The equivalent kernel of the network is (3).
First, we require the following.
With the conditions in Proposition 1 and inputs $\mathbf{x}, \mathbf{y}$, the equivalent kernel of the network is the solution to the Initial Value Problem (IVP)

$k''(\theta_0) + k(\theta_0) = F(\theta_0), \qquad k(\pi) = 0, \qquad k'(\pi) = 0, \qquad (4)$

where $\theta_0 \in [0, \pi]$ is the angle between the inputs $\mathbf{x}$ and $\mathbf{y}$. The derivatives are meant in the distributional sense; they are functionals applying to all test functions in $\mathcal{D}(0, \pi)$. The forcing term $F(\theta_0)$ is given by the $d$-dimensional integral

$F(\theta_0) = \|\mathbf{x}\| \|\mathbf{y}\| \int \Theta(w_1)\, \delta(w_1 \cos\theta_0 + w_2 \sin\theta_0)\, (w_2 \cos\theta_0 - w_1 \sin\theta_0)^2\, w_1\, f(\mathbf{w})\, d\mathbf{w}, \qquad (5)$

where $\Theta$ is the Heaviside step function and $\delta$ is the Dirac delta.
Now differentiating twice with respect to $\theta_0$ yields the second order ODE (4). The usefulness of the ODE in its current form is limited, since the forcing term $F$ as in (5) is difficult to interpret. However, regardless of the underlying distribution on the weights, as long as the PDF in (5) corresponds to any rotationally-invariant distribution, the integral enjoys a much simpler representation.
The proof is given in Appendix B.
Note that in the simpler representation of the forcing term, $F(\theta_0) = \frac{\sigma_w^2}{\pi}\|\mathbf{x}\|\|\mathbf{y}\|\sin\theta_0$, the underlying distribution appears only through the constant $\sigma_w^2$. For all rotationally-invariant distributions, the forcing term in (4) therefore results in an equivalent kernel of the same form. We can combine Propositions 2 and 3 to find the equivalent kernel for all rotationally-invariant weight distributions.
One can apply the same technique to the case of LReLU activations $\sigma(u) = \max(u, au)$, where $a$ specifies the gradient of the activation for $u < 0$.
This is just a slightly more involved calculation than the ReLU case; we defer our proof to the supplementary material.
3.2 Asymptotic Kernels
In this section we approximate the kernel for large $d$ and more general weight PDFs. We invoke the CLT as $d \to \infty$, which requires a condition that we discuss briefly before presenting it formally. The dot product $\mathbf{w} \cdot \mathbf{x}$ can be seen as a linear combination of the weights, with the coefficients corresponding to the coordinates of $\mathbf{x}$. Roughly, such a linear combination will obey the CLT if many coefficients are non-zero. To let $d \to \infty$, we construct a sequence of inputs of increasing dimension. This may appear unusual in the context of neural networks, since $d$ is fixed and finite in practice. The sequence is used only for asymptotic analysis.
As an example, if the dataset were CelebA (Liu et al., 2015), the inputs would be fixed-size images. To generate an artificial sequence, one could down-sample each image to successively smaller sizes. At each point in the sequence, one could normalize the point so that its norm stays fixed. One could similarly up-sample the image.
Intuitively, if the up-sampled image does not just insert zeros, then as $d$ increases we expect the ratio $\max_i |x_i| / \|\mathbf{x}\|$ to decrease, because the denominator stays fixed and the largest coordinate gets smaller. In our proof, the application of the CLT requires this ratio to decrease sufficiently fast in $d$. Hypothesis 5 states this condition precisely.
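This intuition is easy to probe numerically (an illustrative sketch with a synthetic smooth signal, not data from the paper): as a fixed smooth signal is sampled at higher resolution $d$, the ratio $\max_i |x_i| / \|\mathbf{x}\|$ decays roughly like $d^{-1/2}$:

```python
import numpy as np

ratios = {}
for d in [16, 64, 256, 1024]:
    t = np.linspace(0.0, 1.0, d)
    x = np.sin(2 * np.pi * t) + 0.5 * np.cos(6 * np.pi * t)  # smooth "image" at resolution d
    x = x / np.linalg.norm(x)            # fix the norm, as in the constructed sequence
    ratios[d] = np.abs(x).max()          # ratio max_i |x_i| / ||x||
    print(d, ratios[d])
```

Each fourfold increase in resolution roughly halves the ratio, consistent with the condition needed for the CLT to take hold.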
For $d \in \mathbb{N}$, define sequences of inputs $\mathbf{x}^{(d)}$ and $\mathbf{y}^{(d)}$ with fixed norms $\|\mathbf{x}^{(d)}\|$, $\|\mathbf{y}^{(d)}\|$ and fixed angle $\theta_0$ for all $d$. Letting $x_i^{(d)}$ be the $i^{th}$ coordinate of $\mathbf{x}^{(d)}$, assume that $\max_i |x_i^{(d)}| / \|\mathbf{x}^{(d)}\|$ and $\max_i |y_i^{(d)}| / \|\mathbf{y}^{(d)}\|$ both decay to $0$ sufficiently fast as $d \to \infty$.
Figure 2 plots this ratio against dimensionality for two datasets, suggesting that Hypothesis 5 makes reasonable assumptions on high dimensional data such as images and audio. (Left) CelebA (Liu et al., 2015) images, compressed using bicubic interpolation. (Right) CHiME3_embedded_et05_real live speech data from The 4th CHiME Speech Separation and Recognition Challenge (Vincent et al., 2017; Barker et al., 2017), with each clip trimmed to a fixed length and compression achieved through subsampling by integer factors.
Consider an infinitely wide FC layer with almost everywhere continuous activation functions $\sigma$. Suppose the random weights $\mathbf{w}$ come from an IID distribution with PDF $f$ such that $\mathbb{E}[w_i] = 0$ and $\mathbb{E}\big[|w_i|^3\big] < \infty$. Suppose that the conditions in Hypothesis 5 are satisfied. Then

$\big( \mathbf{w} \cdot \mathbf{x}^{(d)},\ \mathbf{w} \cdot \mathbf{y}^{(d)} \big) \xrightarrow{\,D\,} \mathbf{Z}, \qquad d \to \infty,$

where $\xrightarrow{\,D\,}$ denotes convergence in distribution and $\mathbf{Z}$ is a Gaussian random vector with zero mean and the same covariance matrix as $\big( \mathbf{w} \cdot \mathbf{x}^{(d)},\ \mathbf{w} \cdot \mathbf{y}^{(d)} \big)$.
Convergence in distribution is a weak form of convergence, so we cannot expect in general that all kernels should converge asymptotically. For some special cases however, this is indeed possible to show. We first present the ReLU case.
Let $\mathbf{w}$, $f$, $\mathbf{x}^{(d)}$, and $\mathbf{y}^{(d)}$ be as defined in Theorem 6, and define the corresponding kernel to be $k^{(d)}$. Consider a second infinitely wide FC layer with the same inputs, whose random weights come from a spherical Gaussian with mean $0$ and the same finite variance, with PDF $g$; define the corresponding kernel to be $k_g^{(d)}$. Suppose that the conditions in Hypothesis 5 are satisfied and the activation functions are ReLU. Then for all $\theta_0$,

$\lim_{d \to \infty} \Big( k^{(d)}\big(\mathbf{x}^{(d)}, \mathbf{y}^{(d)}\big) - k_g^{(d)}\big(\mathbf{x}^{(d)}, \mathbf{y}^{(d)}\big) \Big) = 0.$
4 Empirical Verification of Results
We empirically verify our results using two families of weight distributions. First, consider the $d$-dimensional t-distribution

$f(\mathbf{w}) = \frac{\Gamma\big( (\nu + d)/2 \big)}{\Gamma(\nu/2)\, (\nu\pi)^{d/2}} \Big( 1 + \frac{\|\mathbf{w}\|^2}{\nu} \Big)^{-(\nu + d)/2} \qquad (6)$

with $\nu$ degrees of freedom and identity shape matrix. The multivariate t-distribution approaches the multivariate Gaussian as $\nu \to \infty$. Random variables drawn from the multivariate t-distribution are uncorrelated but not independent. This distribution is rotationally-invariant and satisfies the conditions in Propositions 1 and 4.
Second, consider the multivariate distribution with IID coordinates

$f(\mathbf{w}) \propto \prod_{i=1}^{d} \exp\big( -|w_i|^{\beta} \big), \qquad (7)$

which is not rotationally-invariant (except when $\beta = 2$, which coincides with a Gaussian distribution) but whose random variables are IID and satisfy the conditions in Theorem 6. As $\beta \to \infty$, this distribution converges pointwise to the uniform distribution on $[-1, 1]^d$.
In Figure 3, we empirically verify Propositions 1 and 4. In the one hidden layer case, the samples follow the blue curve given by the normalized Arc-Cosine Kernel, regardless of the specific multivariate t weight distribution, which varies with $\nu$. We also observe that the universality of the equivalent kernel appears to hold for the distribution (7) regardless of the value of $\beta$, as predicted by theory. We discuss the relevance of the remaining curves in Section 5.
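The t-distribution experiment can be sketched in a few lines (our own illustration; the dimension and degrees of freedom are arbitrary). A multivariate t sample is a Gaussian vector divided by an independent $\sqrt{\chi^2_\nu / \nu}$ factor, and the empirical normalized kernel should follow the normalized Arc-Cosine Kernel:

```python
import numpy as np

def sample_multivariate_t(n, d, nu, rng):
    """n draws from the d-dim t-distribution (identity shape): Gaussian / sqrt(chi2/nu)."""
    g = rng.standard_normal((n, d))
    s = np.sqrt(rng.chisquare(nu, size=(n, 1)) / nu)
    return g / s

def normalized_arc_cosine(theta0):
    return (np.sin(theta0) + (np.pi - theta0) * np.cos(theta0)) / np.pi

rng = np.random.default_rng(1)
d, n_hidden, nu = 10, 200_000, 5.0
x, y = rng.standard_normal(d), rng.standard_normal(d)
theta0 = np.arccos(np.clip(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)), -1, 1))

W = sample_multivariate_t(n_hidden, d, nu, rng)   # heavy-tailed, rotationally-invariant
relu = lambda t: np.maximum(t, 0.0)
hx, hy = relu(W @ x), relu(W @ y)
cos_theta1 = hx @ hy / (np.linalg.norm(hx) * np.linalg.norm(hy))
print(cos_theta1, normalized_arc_cosine(theta0))  # empirical vs. predicted normalized kernel
```

Because the normalized kernel cancels the scale of the weight distribution, the agreement does not depend on $\nu$ (so long as second moments exist), which is exactly the invariance the propositions predict.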
5 Implications for Practical Networks
5.1 Composed Kernels in Deep Networks
A recent advance in understanding the difficulty of training deep neural networks is the identification of the shattered gradients problem (Balduzzi et al., 2017).
A simple observation that complements this view is obtained through repeated composition of the normalized kernel. As the depth $l \to \infty$, the angle $\theta_l$ between the signals of any two inputs in the $l^{th}$ layer of a LReLU network with random weights satisfying the conditions of Proposition 4 approaches $0$.
A result similar to the following is hinted at by Lee et al. (2017), citing Schoenholz et al. (2017). Their analysis, which considers biases in addition to weights (Poole et al., 2016), yields insights on the trainability of random neural networks that our analysis cannot. However, their argument does not appear to provide a complete formal proof for the case when the activation functions are unbounded, e.g., ReLU. The degeneracy of the composed kernel with more general activation functions is also proved by Daniely (2016), with the assumption that the weights are Gaussian distributed.
The normalized kernel corresponding to LReLU activations converges to a unique fixed point at $\theta^* = 0$.
Let $\theta_l \in [0, \pi]$ and define $g$ to be the map taking the angle in one layer to the angle in the next through the normalized kernel, $\theta_{l+1} = g(\theta_l)$.

The magnitude of the derivative of $g$ is bounded above by a constant less than $1$ on the domain of interest. Therefore, $g$ is a contraction mapping. By Banach's fixed point theorem there exists a unique fixed point $\theta^*$. Setting $\theta^* = 0$ verifies that it is a solution, and it is unique. ∎
Corollary 8 implies that for this deep network, the angle between any two signals at a deep layer approaches $0$. No matter what the input is, the kernel “sees” the same thing after accounting for the scaling induced by the norm of the input. Hence, it becomes increasingly difficult to train deeper networks, as much of the information is lost and the outputs will depend merely on the norm of the inputs; the signals decorrelate as they propagate through the layers.
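To illustrate (a sketch using the LReLU normalized kernel obtained from the bivariate Gaussian expectation; the slope value is arbitrary), iterating the angle map drives any initial angle toward $0$, at a roughly $O(1/\text{depth})$ rate:

```python
import numpy as np

def next_angle(theta, a=0.2):
    """Angle after one infinitely wide LReLU layer (normalized kernel map);
    a is the LReLU negative slope (a = 0 recovers the ReLU Arc-Cosine map)."""
    cos_next = ((1 + a)**2 / (2 * (1 + a**2)) * np.cos(theta)
                + (1 - a)**2 / (np.pi * (1 + a**2))
                * (np.sin(theta) + (np.pi / 2 - theta) * np.cos(theta)))
    return np.arccos(np.clip(cos_next, -1.0, 1.0))

theta = np.pi / 2            # start from orthogonal inputs
for _ in range(500):         # 500 layers
    theta = next_angle(theta)
print(theta)                 # small: deep signals become nearly collinear
```

The derivative of the map approaches $1$ at the fixed point, so the decay is slow (algebraic rather than geometric), but the angle still collapses to $0$ with depth.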
At first this may seem counter-intuitive. An appeal to intuition can be made by considering the corresponding linear network with deterministic and equal weight matrices in each layer, which amounts to the celebrated power iteration method. In this case, repeated application of the matrix transformation converges in direction to the dominant eigenvector of the matrix, regardless of the input vector.
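The analogy can be made concrete (an illustrative sketch of ours): under power iteration with a fixed symmetric matrix, two different starting vectors both align with the dominant eigenvector:

```python
import numpy as np

rng = np.random.default_rng(0)
G = rng.standard_normal((5, 5))
A = G @ G.T                      # symmetric PSD: a real dominant eigenvector exists

x = rng.standard_normal(5)
y = rng.standard_normal(5)
for _ in range(50):              # repeated application of the same matrix
    x = A @ x; x /= np.linalg.norm(x)
    y = A @ y; y /= np.linalg.norm(y)
print(abs(x @ y))                # near 1: both directions collapse onto one eigenvector
```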
Figure 3 shows that the theoretical normalized kernel for networks of increasing depth closely follows empirical samples from randomly initialized neural networks.
In addition to convergence of direction, by also requiring that the weight variance be chosen to preserve the scale of the signals, it can be shown that, after accounting for scaling, the magnitudes of the signals converge as the signals propagate through the network. This is analogous to having the dominant eigenvalue equal to $1$ in the power iteration comparison.
The quantity $k^{(L)}(\mathbf{x}, \mathbf{x}) / \|\mathbf{x}\|^2$ in an $L$-layer random (L)ReLU network of infinite width, with random uncorrelated and identically distributed rotationally-invariant weights with $\sigma_w^2 = \frac{2}{1 + a^2}$, approaches $1$ as $L \to \infty$.
Denote the output of one neuron in the $l^{th}$ layer of a network with input $\mathbf{x}$ by $h^{(l)}(\mathbf{x})$, and let $k^{(L)}$ be the kernel of the $L$-layer network. Then

$k^{(L)}(\mathbf{x}, \mathbf{x}) = \mathbb{E}\big[ h^{(L)}(\mathbf{x})^2 \big] = \frac{(1 + a^2)\,\sigma_w^2}{2}\, k^{(L-1)}(\mathbf{x}, \mathbf{x}) = \Big( \frac{(1 + a^2)\,\sigma_w^2}{2} \Big)^{L} \|\mathbf{x}\|^2,$

which equals $\|\mathbf{x}\|^2$ for every $L$ when $\sigma_w^2 = \frac{2}{1 + a^2}$, so the quantity in question approaches $1$ as $L \to \infty$. ∎
Contrary to the shattered gradients analysis, which applies to gradient based optimizers, our analysis relates to any optimizers that initialize weights from some distribution satisfying conditions in Proposition 4 or Corollary 7. Since information is lost during signal propagation, the network’s output shares little information with the input. An optimizer that tries to relate inputs, outputs and weights through a suitable cost function will be “blind” to relationships between inputs and outputs.
Our results can be used to argue against the utility of controversial Extreme Learning Machines (ELM) (Huang et al., 2004), which randomly initialize hidden layers from symmetric distributions and only learn the weights in the final layer. A single layer ELM can be replaced by kernel ridge regression using the equivalent kernel. Furthermore, a Multi-Layer ELM (Tang et al., 2016) with (L)ReLU activations utilizes a pathological kernel as shown in Figure 3. It should be noted that ELM bears resemblance to early works (Schmidt et al., 1992; Pao et al., 1994).
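To make the replacement concrete, here is a sketch (the toy data and hyperparameters are ours, not from the paper): a wide random-ReLU ELM, trained only in its output layer by ridge regression, makes nearly the same predictions as kernel ridge regression with the equivalent Arc-Cosine kernel. Scaling the random features by $1/\sqrt{n}$ makes the feature Gram matrix approximate the kernel, so the two solutions coincide up to Monte Carlo error (via the push-through identity $H(H^T H + \lambda I)^{-1} H^T = H H^T (H H^T + \lambda I)^{-1}$):

```python
import numpy as np

def arc_cosine_gram(X, Y):
    """Gram matrix of the equivalent (degree-1) Arc-Cosine kernel, sigma_w = 1."""
    nx = np.linalg.norm(X, axis=1, keepdims=True)
    ny = np.linalg.norm(Y, axis=1, keepdims=True)
    th = np.arccos(np.clip((X @ Y.T) / (nx * ny.T), -1.0, 1.0))
    return (nx * ny.T) / (2 * np.pi) * (np.sin(th) + (np.pi - th) * np.cos(th))

rng = np.random.default_rng(0)
n, d, n_feat, lam = 100, 5, 2000, 1.0
X = rng.standard_normal((n, d))
y_target = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(n)   # toy regression target

# ELM: random ReLU features (scaled by 1/sqrt(n_feat)); learn only the output layer.
W = rng.standard_normal((d, n_feat))
H = np.maximum(X @ W, 0.0) / np.sqrt(n_feat)
beta = np.linalg.solve(H.T @ H + lam * np.eye(n_feat), H.T @ y_target)
elm_pred = H @ beta

# Kernel ridge regression with the equivalent kernel.
K = arc_cosine_gram(X, X)
alpha = np.linalg.solve(K + lam * np.eye(n), y_target)
krr_pred = K @ alpha

print(np.corrcoef(elm_pred, krr_pred)[0, 1])   # close to 1
```

The random-feature solution is thus a noisy stand-in for a deterministic kernel method, which is the sense in which the ELM's random hidden layer adds little beyond its equivalent kernel.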
This applies whenever the conditions in Proposition 4 or Corollary 12 are satisfied. This agrees with the well-known case when the elements of the weight matrix are IID (He et al., 2015) and $a = 0$. For small values of $a$, (8) is well approximated by the known result $\sigma_w^2 = \frac{2}{(1 + a^2)d}$ (He et al., 2015). For larger values of $a$, this approximation breaks down, as shown in Figure 4.
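As a sketch of the practical consequence (using the He et al. (2015) LReLU scaling $\sigma_w^2 = 2/\big((1+a^2)d\big)$, which the text above describes as accurate for small $a$; the width, depth and slope here are arbitrary), this choice keeps the signal norm roughly constant through many random layers:

```python
import numpy as np

def lrelu(t, a=0.2):
    return np.where(t > 0.0, t, a * t)

def forward_norms(d=512, depth=30, a=0.2, seed=0):
    """Propagate one input through `depth` random LReLU layers with
    per-weight variance 2 / ((1 + a**2) * d); return the per-layer norms."""
    rng = np.random.default_rng(seed)
    sigma = np.sqrt(2.0 / ((1 + a**2) * d))
    x = rng.standard_normal(d)
    norms = []
    for _ in range(depth):
        W = rng.normal(0.0, sigma, size=(d, d))
        x = lrelu(W @ x, a)
        norms.append(np.linalg.norm(x))
    return norms

norms = forward_norms()
print(norms[0], norms[-1])   # comparable magnitudes: neither exploding nor vanishing
```

The scaling works because each LReLU layer multiplies the expected squared norm by $\sigma_w^2 d (1 + a^2)/2$, which this variance choice sets to $1$.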
An alternative approach to weight initialization is the data-driven approach (Mishkin & Matas, 2016), which can be applied to more complicated network structures such as the convolutional and max-pooling layers commonly used in practice. As parameter distributions change during training, batch normalization inserts layers with learnable scaling and centering parameters, at the cost of increased computation and complexity (Ioffe & Szegedy, 2015).
We have considered universal properties of MLPs with weights coming from a large class of distributions. We have theoretically and empirically shown that the equivalent kernel for networks with an infinite number of hidden ReLU neurons and all rotationally-invariant weight distributions is the Arc-Cosine Kernel. The CLT can be applied to approximate the kernel for high dimensional input data. When the activations are LReLUs, the equivalent kernel has a similar form. The kernel converges to a fixed point, showing that information is lost as signals propagate through the network.
One avenue for future work is to study the equivalent kernel for different activation functions, noting that the kernels for some activations, such as the ELU, may not be expressible in closed form (we do show in the supplementary material, however, that the ELU has an asymptotically universal kernel).
Since wide networks with centered weight distributions have approximately the same equivalent kernel, powerful trained deep and wide MLPs with (L)ReLU activations should have asymmetric, non-zero mean, non-IID parameter distributions. Future work may consider analyzing the equivalent kernels of trained networks and more complicated architectures. We should not expect the equivalent kernel to be expressible in a neat closed form in these cases. This work is a crucial first step in identifying invariant properties in neural networks and sets a foundation from which we hope to expand in future.
Appendix A Proof of Proposition 2
The kernel with weight PDF $f(\mathbf{w})$ and ReLU activations is

$k(\mathbf{x}, \mathbf{y}) = \int \Theta(\mathbf{w} \cdot \mathbf{x})\, \Theta(\mathbf{w} \cdot \mathbf{y})\, (\mathbf{w} \cdot \mathbf{x})(\mathbf{w} \cdot \mathbf{y})\, f(\mathbf{w})\, d\mathbf{w}.$
Let $\theta_0$ be the angle between $\mathbf{x}$ and $\mathbf{y}$, and let $\mathbf{e}_1$ and $\mathbf{e}_2$ denote the first two standard basis vectors, with $\mathbf{e}_1 \cdot \mathbf{e}_2 = 0$. Following Cho & Saul (2009), there exists some rotation matrix $\mathbf{R}$ such that $\mathbf{R}\mathbf{x} = \|\mathbf{x}\|\mathbf{e}_1$ and $\mathbf{R}\mathbf{y} = \|\mathbf{y}\|(\cos\theta_0\, \mathbf{e}_1 + \sin\theta_0\, \mathbf{e}_2)$. We have
Let $\mathbf{u} = \mathbf{R}\mathbf{w}$ and note that the dot product is invariant under rotations and the determinant of the Jacobian of the transformation is $1$ since $\mathbf{R}$ is orthogonal. We have

$k = \|\mathbf{x}\| \|\mathbf{y}\| \int \Theta(u_1)\, \Theta(u_1 \cos\theta_0 + u_2 \sin\theta_0)\, u_1 (u_1 \cos\theta_0 + u_2 \sin\theta_0)\, f(\mathbf{u})\, d\mathbf{u}. \qquad (9)$
One may view the integrand as a functional acting on test functions of $\theta_0$. Denote the set of infinitely differentiable, compactly supported test functions on $(0, \pi)$ by $\mathcal{D}(0, \pi)$. The linear functional acting over $\mathcal{D}(0, \pi)$ is a Generalized Function, and we may take distributional derivatives under the integral by Theorem 7.40 of Jones (1982). Differentiating twice,

$k'' = \|\mathbf{x}\| \|\mathbf{y}\| \int \Theta(u_1) \big[ \delta(u)\, (u')^2 - \Theta(u)\, u \big]\, u_1\, f(\mathbf{u})\, d\mathbf{u} = F(\theta_0) - k,$

where $u = u_1 \cos\theta_0 + u_2 \sin\theta_0$ and $u' = u_2 \cos\theta_0 - u_1 \sin\theta_0$.
The initial condition $k(\pi) = 0$ is obtained by putting $\theta_0 = \pi$ in (9) and noting that the resulting integrand contains a factor of $\Theta(u_1)\Theta(-u_1)$, which is $0$ almost everywhere. Similarly, the integrand of $k'(\pi)$ contains a factor of $\Theta(u_1)\Theta(-u_1)$.
The ODE is meant in a distributional sense: $\langle k'' + k, \phi \rangle = \langle F, \phi \rangle$ for all $\phi \in \mathcal{D}(0, \pi)$, where $k$ is a distribution with a distributional second derivative $k''$. ∎
Appendix B Proof of Proposition 3
Denote the marginal PDF of the first two coordinates of $\mathbf{w}$ by $f_2(w_1, w_2)$. Due to the rotational invariance of $f$, $f_2(w_1, w_2) = f_2\big( \mathbf{Q} (w_1, w_2)^T \big)$ for any $2 \times 2$ orthogonal matrix $\mathbf{Q}$, so $f_2$ depends on its argument only through the radius $r = \sqrt{w_1^2 + w_2^2}$; write $f_2(w_1, w_2) = g(r)$. Evaluating the Dirac delta in (5) and changing variables,

$F(\theta_0) = \frac{\|\mathbf{x}\| \|\mathbf{y}\|}{\sin^3\theta_0} \int_0^{\infty} w_1^3\, g(w_1 / \sin\theta_0)\, dw_1 = \|\mathbf{x}\| \|\mathbf{y}\| \sin\theta_0 \int_0^{\infty} r^3 g(r)\, dr = \frac{\sigma_w^2}{\pi} \|\mathbf{x}\| \|\mathbf{y}\| \sin\theta_0.$
It remains to check that $\int_0^{\infty} r^3 g(r)\, dr$ is finite. It is integrable since

$\pi \int_0^{\infty} r^3 g(r)\, dr = \mathbb{E}[w_1^2] = \sigma_w^2 < \infty.$

Therefore, $F$ is finite almost everywhere. $F$ must be a function, so the distributional and classical derivatives coincide. ∎
Appendix C Proof of Theorem 6
There exist some orthonormal $\mathbf{e}_1, \mathbf{e}_2$ such that $\mathbf{x}^{(d)} = \|\mathbf{x}\|\mathbf{e}_1$ and $\mathbf{y}^{(d)} = \|\mathbf{y}\|(\cos\theta_0\, \mathbf{e}_1 + \sin\theta_0\, \mathbf{e}_2)$. We would like to examine the asymptotic distribution of the pair

$(X_d, Y_d) = \big( \mathbf{w} \cdot \mathbf{x}^{(d)},\ \mathbf{w} \cdot \mathbf{y}^{(d)} \big).$

Note that $\mathbb{E}[X_d] = \mathbb{E}[Y_d] = 0$. Also note that $\mathbf{w} \cdot \mathbf{e}_1$ and $\mathbf{w} \cdot \mathbf{e}_2$ are uncorrelated since the coordinates of $\mathbf{w}$ are IID with mean $0$.
Let $\mathbf{V}$ be the covariance matrix of $(X_d, Y_d)$, let $\mathbf{I}$ be the identity matrix, and let $\mathbf{Z} \sim \mathcal{N}(\mathbf{0}, \mathbf{I})$. Then for any convex set $A \subseteq \mathbb{R}^2$ and some constant $C > 0$, by the Berry-Esseen Theorem,

$\Big| P\big( \mathbf{V}^{-1/2} (X_d, Y_d)^T \in A \big) - P(\mathbf{Z} \in A) \Big| \le C \gamma_d,$

where $\gamma_d$ is given by a sum of third absolute moments of the normalized summands $\mathbf{V}^{-1/2} \big( w_i x_i^{(d)}, w_i y_i^{(d)} \big)^T$. The bound is finite because the weights have finite absolute third moment.
Now $\gamma_d \to 0$ under Hypothesis 5, so $\mathbf{V}^{-1/2} (X_d, Y_d)^T$ converges in distribution to the bivariate spherical Gaussian with unit variance. Then the random vector $(X_d, Y_d)$ converges in distribution to the bivariate Gaussian random variable with covariance matrix $\mathbf{V}$. Since $\sigma$ is continuous almost everywhere, by the Continuous Mapping Theorem,
If $\theta_0 = 0$ or $\theta_0 = \pi$, the covariance matrix $\mathbf{V}$ is singular; we may treat $(X_d, Y_d)$ as a one-dimensional random variable and the above still holds. ∎
Appendix D Proof of Corollary 7
We have $k^{(d)} = \mathbb{E}\big[ \sigma(X_d)\, \sigma(Y_d) \big]$ and would like to bring the limit $d \to \infty$ inside the expected value. By Theorem 6 and Theorem 25.12 of Billingsley (1995), it suffices to show that $\sigma(X_d)\, \sigma(Y_d)$ is uniformly integrable. Define $f_d$ to be the joint PDF of $(X_d, Y_d)$. We have

$\mathbb{E}\Big[ \big| \sigma(X_d)\, \sigma(Y_d) \big|^{1+\epsilon} \Big] = \int \Theta(s)\, \Theta(t)\, (st)^{1+\epsilon}\, f_d(s, t)\, ds\, dt,$

since the integrand is $0$ whenever $s \le 0$ or $t \le 0$. So
We may raise the Heaviside functions to any power without changing the value of the integral. Squaring the Heaviside functions and applying Hölder’s inequality, we have
Examining the first of these factors,