On the Connection Between Learning Two-Layers Neural Networks and Tensor Decomposition

02/20/2018
by   Marco Mondelli, et al.

We establish connections between the problem of learning a two-layers neural network with good generalization error and tensor decomposition. We consider a model with input x ∈ R^d, r hidden units with weights {w_i}_{1 ≤ i ≤ r}, and output y ∈ R, i.e., y = ∑_{i=1}^{r} σ(⟨x, w_i⟩), where σ denotes the activation function. First, we show that, if we cannot learn the weights {w_i}_{1 ≤ i ≤ r} accurately, then the neural network does not generalize well. More specifically, the generalization error is close to that of a trivial predictor with access only to the norm of the input. This result holds for any activation function, and it requires that the weights are roughly isotropic and the input distribution is Gaussian, which is a typical assumption in the theoretical literature. Then, we show that the problem of learning the weights {w_i}_{1 ≤ i ≤ r} is at least as hard as the problem of tensor decomposition. This result holds for any input distribution and assumes that the activation function is a polynomial whose degree is related to the order of the tensor to be decomposed. By putting everything together, we prove that learning a two-layers neural network that generalizes well is at least as hard as tensor decomposition. It has been observed that neural network models with more parameters than training samples often generalize well, even if the problem is highly underdetermined. This means that the learning algorithm does not estimate the weights accurately and yet is able to yield a good generalization error. This paper shows that such a phenomenon cannot occur when the input distribution is Gaussian and the weights are roughly isotropic. We also provide numerical evidence supporting our theoretical findings.
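To make the model and its moment-based link to tensor decomposition concrete, below is a minimal NumPy sketch (not the authors' code). It draws Gaussian inputs, generates labels from y = ∑_{i=1}^{r} σ(⟨x, w_i⟩) with the polynomial activation σ(t) = t³, and checks that the empirical third-order moment E[y · x⊗x⊗x], after a correction computed from E[y·x], is (up to sampling error) the tensor ∑_i w_i⊗w_i⊗w_i, whose decomposition recovers the weights. The dimensions, sample size, orthonormal weights, and the specific cubic activation are illustrative assumptions; the sketch shows the moment/tensor correspondence behind the result, not the paper's formal hardness reduction.

```python
# Minimal sketch: two-layers model y = sum_i sigma(<x, w_i>) with sigma(t) = t^3,
# Gaussian input, and the third-order moment tensor that exposes the weights.
# All sizes and the orthonormal-weight choice are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
d, r, n = 8, 3, 100_000                            # input dim, hidden units, samples

W = np.linalg.qr(rng.normal(size=(d, r)))[0].T     # r orthonormal weight vectors w_i
sigma = lambda t: t ** 3                           # polynomial activation, degree 3

X = rng.normal(size=(n, d))                        # Gaussian inputs x ~ N(0, I_d)
Y = sigma(X @ W.T).sum(axis=1)                     # y = sum_i sigma(<x, w_i>)

# Empirical moments: T_hat ≈ E[y x⊗x⊗x] and v_hat ≈ E[y x] = 3 sum_i ||w_i||^2 w_i.
T_hat = np.einsum('n,ni,nj,nk->ijk', Y, X, X, X, optimize=True) / n
v_hat = (Y[:, None] * X).mean(axis=0)

# Wick/Stein identity for sigma(t) = t^3 and Gaussian x:
#   E[y x⊗x⊗x] = 6 * sum_i w_i⊗w_i⊗w_i + sym(E[y x] ⊗ I),
# where sym(v ⊗ I) places v on each of the three modes against the identity.
I = np.eye(d)
sym_vI = (np.einsum('i,jk->ijk', v_hat, I)
          + np.einsum('j,ik->ijk', v_hat, I)
          + np.einsum('k,ij->ijk', v_hat, I))
T = (T_hat - sym_vI) / 6.0                         # ≈ sum_i w_i ⊗ w_i ⊗ w_i

T_true = np.einsum('ri,rj,rk->ijk', W, W, W)
rel_err = np.linalg.norm(T - T_true) / np.linalg.norm(T_true)
print(f"relative error of moment tensor vs sum_i w_i^⊗3: {rel_err:.2f}")

# Decomposing T recovers a weight direction: plain tensor power iteration
# (adequate here because the w_i were chosen orthonormal).
u = rng.normal(size=d)
for _ in range(50):
    u = np.einsum('ijk,j,k->i', T, u, u)
    u /= np.linalg.norm(u)
print("max |<u, w_i>| over i:", np.abs(W @ u).max())  # ≈ 1 when a w_i is recovered
```

The power-iteration step stands in for any off-the-shelf tensor decomposition routine; the point of the sketch is only that, under these assumptions, the weight tensor ∑_i w_i⊗w_i⊗w_i is exactly what the low-order moments of (x, y) expose.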



