Learning Two Layer Rectified Neural Networks in Polynomial Time

11/05/2018
by Ainesh Bakshi, et al.

Consider the following fundamental learning problem: given input examples x ∈ R^d and their vector-valued labels, as defined by an underlying generative neural network, recover the weight matrices of this network. We consider two-layer networks, mapping R^d to R^m, with k non-linear activation units f(·), where f(x) = max{x, 0} is the ReLU. Such a network is specified by two weight matrices, U^* ∈ R^{m × k} and V^* ∈ R^{k × d}, such that the label of an example x ∈ R^d is given by U^* f(V^* x), where f(·) is applied coordinate-wise. Given n samples as a matrix X ∈ R^{d × n} and the (possibly noisy) labels U^* f(V^* X) + E of the network on these samples, where E is a noise matrix, our goal is to recover the weight matrices U^* and V^*. In this work, we develop algorithms and hardness results under varying assumptions on the input and noise. Although the problem is NP-hard even for k = 2, by assuming Gaussian marginals over the input X we are able to develop polynomial-time algorithms for the approximate recovery of U^* and V^*. Perhaps surprisingly, in the noiseless case our algorithms recover U^* and V^* exactly, i.e., with no error. To the best of our knowledge, this is the first algorithm to accomplish exact recovery. For the noisy case, we give the first polynomial-time algorithm that approximately recovers the weights in the presence of mean-zero noise E. Our algorithms generalize to a larger class of rectified activation functions, where f(x) = 0 for x ≤ 0 and f(x) > 0 otherwise.
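To make the setup concrete, here is a minimal sketch of the generative model described in the abstract, assuming standard Gaussian inputs. The dimensions, random seed, and noise scale are illustrative assumptions, not values from the paper.

```python
import numpy as np

def relu(z):
    # f(x) = max{x, 0}, applied coordinate-wise
    return np.maximum(z, 0)

# Illustrative dimensions (assumptions, not from the paper)
d, k, m, n = 10, 3, 5, 1000
rng = np.random.default_rng(0)

U_star = rng.standard_normal((m, k))    # U* ∈ R^{m × k}
V_star = rng.standard_normal((k, d))    # V* ∈ R^{k × d}

X = rng.standard_normal((d, n))         # Gaussian marginals over the input
E = 0.01 * rng.standard_normal((m, n))  # mean-zero noise matrix
Y = U_star @ relu(V_star @ X) + E       # observed (possibly noisy) labels

# The learning problem: given only (X, Y), recover U* and V*,
# exactly in the noiseless case (E = 0) and approximately otherwise.
```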


