Recovery Guarantees for One-hidden-layer Neural Networks

06/10/2017
by   Kai Zhong, et al.

In this paper, we consider regression problems with one-hidden-layer neural networks (1NNs). We distill some properties of activation functions that lead to local strong convexity in the neighborhood of the ground-truth parameters for the 1NN squared-loss objective. Most popular nonlinear activation functions satisfy the distilled properties, including rectified linear units (ReLUs), leaky ReLUs, squared ReLUs, and sigmoids. For activation functions that are also smooth, we show local linear convergence guarantees of gradient descent under a resampling rule. For homogeneous activations, we show that tensor methods are able to initialize the parameters to fall into the local strong convexity region. As a result, tensor initialization followed by gradient descent is guaranteed to recover the ground truth with sample complexity d · log(1/ϵ) · poly(k, λ) and computational complexity n · d · poly(k, λ) for smooth homogeneous activations with high probability, where d is the dimension of the input, k (k ≤ d) is the number of hidden nodes, λ is a conditioning property of the ground-truth parameter matrix between the input layer and the hidden layer, ϵ is the targeted precision, and n is the number of samples. To the best of our knowledge, this is the first work that provides recovery guarantees for 1NNs with both sample complexity and computational complexity linear in the input dimension and logarithmic in the precision.
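To make the setting concrete, below is a minimal sketch (not the paper's full algorithm) of the 1NN squared-loss objective and a single gradient-descent step, using the squared ReLU as one smooth homogeneous activation. The function and variable names (phi, predict, gradient_step, W, v) are illustrative assumptions, and the tensor-method initialization and per-iteration resampling from the paper are omitted.

```python
import numpy as np

def phi(z):
    # Squared ReLU: a smooth, homogeneous activation covered by the analysis.
    return np.maximum(z, 0.0) ** 2

def phi_grad(z):
    # Derivative of the squared ReLU.
    return 2.0 * np.maximum(z, 0.0)

def predict(W, v, X):
    # One-hidden-layer network: W is (k, d) hidden weights, v is (k,) output
    # weights, X is (n, d) inputs; returns (n,) predictions sum_i v_i * phi(w_i^T x).
    return phi(X @ W.T) @ v

def squared_loss(W, v, X, y):
    # Empirical squared-loss objective over n samples.
    r = predict(W, v, X) - y
    return 0.5 * np.mean(r ** 2)

def gradient_step(W, v, X, y, lr=0.1):
    # One gradient-descent step on W with v held fixed. The paper's algorithm
    # additionally resamples fresh data each iteration and starts W from a
    # tensor-method initialization rather than an arbitrary point.
    n = X.shape[0]
    H = X @ W.T                               # (n, k) pre-activations
    r = phi(H) @ v - y                        # (n,) residuals
    grad_W = ((r[:, None] * phi_grad(H)) * v[None, :]).T @ X / n  # (k, d)
    return W - lr * grad_W
```

The local strong convexity result applies once W lies in a neighborhood of the ground-truth parameter matrix, which is why the paper pairs this kind of gradient step with a tensor-based initialization rather than a random starting point.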
