On the Power of Shallow Learning

06/06/2021
by James B. Simon, et al.

A deluge of recent work has explored equivalences between wide neural networks and kernel methods. A central theme is that one can analytically find the kernel corresponding to a given wide network architecture. Despite its major implications for architecture design, no work to date has asked the converse question: given a kernel, can one find a network that realizes it? We answer this question affirmatively for fully-connected architectures, completely characterizing the space of achievable kernels. Furthermore, we give a surprising constructive proof that any kernel of any wide, deep, fully-connected net can also be achieved by a network with just one hidden layer and a specially designed pointwise activation function. We experimentally verify our construction and demonstrate that, just by choosing the activation function, we can design a wide shallow network that mimics the generalization performance of any wide, deep, fully-connected network.
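The forward direction mentioned in the abstract, going from an architecture to its kernel, can be illustrated with a minimal sketch. It is not the paper's construction; it only shows, under stated assumptions, how a pointwise activation determines the kernel of a wide one-hidden-layer network. The assumptions are: unit-norm inputs with cosine similarity `rho`, standard Gaussian weights, no biases, and the kernel taken to be the expected product of hidden activations. The function names `shallow_kernel_mc` and `relu_kernel_exact` are hypothetical. For ReLU the expectation has a known closed form (the first-order arc-cosine kernel), which the Monte Carlo estimate should approach.

```python
import math
import random


def relu(z):
    """Pointwise ReLU activation."""
    return max(0.0, z)


def relu_kernel_exact(rho):
    """Closed form of E[relu(u) * relu(v)] for standard Gaussians u, v
    with correlation rho (first-order arc-cosine kernel, unit-norm inputs)."""
    theta = math.acos(max(-1.0, min(1.0, rho)))
    return (math.sin(theta) + (math.pi - theta) * math.cos(theta)) / (2.0 * math.pi)


def shallow_kernel_mc(phi, rho, n_samples=200_000, seed=0):
    """Monte Carlo estimate of E[phi(u) * phi(v)] for an arbitrary
    pointwise activation phi (assumption: unit-variance preactivations)."""
    rng = random.Random(seed)
    s = math.sqrt(max(0.0, 1.0 - rho * rho))
    total = 0.0
    for _ in range(n_samples):
        u = rng.gauss(0.0, 1.0)
        # v is built to have unit variance and correlation rho with u.
        v = rho * u + s * rng.gauss(0.0, 1.0)
        total += phi(u) * phi(v)
    return total / n_samples


if __name__ == "__main__":
    for rho in (0.0, 0.5, 0.9):
        est = shallow_kernel_mc(relu, rho)
        exact = relu_kernel_exact(rho)
        print(f"rho={rho:.1f}  monte_carlo={est:.4f}  closed_form={exact:.4f}")
```

Swapping `relu` for another function changes the resulting kernel as a function of `rho`. The abstract's claim is that this single degree of freedom suffices to reproduce the kernel of any wide, deep, fully-connected network.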


