Local Geometry of One-Hidden-Layer Neural Networks for Logistic Regression

02/18/2018
by   Haoyu Fu, et al.

We study the local geometry of a one-hidden-layer fully-connected neural network where the training samples are generated from a multi-neuron logistic regression model. We prove that under Gaussian input, the empirical risk function employing quadratic loss exhibits strong convexity and smoothness uniformly in a local neighborhood of the ground truth, for a class of smooth activation functions satisfying certain properties, including sigmoid and tanh, as soon as the sample size is sufficiently large. This implies that if initialized in this neighborhood, gradient descent converges linearly to a critical point that is provably close to the ground truth, without requiring a fresh set of samples at each iteration. This significantly improves upon prior results on learning shallow neural networks with multiple neurons. To the best of our knowledge, this is the first global convergence guarantee for one-hidden-layer neural networks using gradient descent over the empirical risk function without resampling, achieved with near-optimal sample and computational complexity.
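The setting described above can be illustrated with a minimal numerical sketch: binary labels are drawn from a multi-neuron logistic model whose output probability averages sigmoid responses over the hidden neurons, and plain gradient descent on the quadratic empirical risk is run from an initialization near the ground truth. All dimensions, sample sizes, and step sizes below are illustrative choices, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

d, K, n = 5, 3, 20000              # input dim, hidden neurons, samples (illustrative)
W_star = rng.normal(size=(K, d))   # ground-truth weight matrix (hypothetical)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def model(W, X):
    # output probability: average of sigmoid responses over the K neurons
    return sigmoid(X @ W.T).mean(axis=1)

# Gaussian inputs; binary labels from the multi-neuron logistic model
X = rng.normal(size=(n, d))
y = rng.binomial(1, model(W_star, X)).astype(float)

def loss(W):
    # quadratic empirical risk
    r = model(W, X) - y
    return 0.5 * np.mean(r ** 2)

def grad(W):
    Z = sigmoid(X @ W.T)                # (n, K) neuron activations
    r = (Z.mean(axis=1) - y) / K        # shared residual, scaled by 1/K
    S = Z * (1.0 - Z)                   # sigmoid derivative
    return (r[:, None] * S).T @ X / n   # (K, d) gradient

# initialize inside a small neighborhood of the ground truth,
# mimicking the local basin where the risk is strongly convex
W0 = W_star + 0.1 * rng.normal(size=(K, d))
W = W0.copy()
for _ in range(500):
    W -= 1.0 * grad(W)                  # fixed-step gradient descent, no resampling
```

In the regime the abstract describes, local strong convexity and smoothness make this iteration contract linearly toward a point near `W_star`; the sketch only demonstrates the mechanics, not the theorem's rates.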


research · 06/10/2017
Recovery Guarantees for One-hidden-layer Neural Networks
In this paper, we consider regression problems with one-hidden-layer neu...

research · 07/04/2019
Learning One-hidden-layer Neural Networks via Provable Gradient Descent with Random Initialization
Although deep learning has shown its powerful performance in many applic...

research · 11/01/2017
Learning One-hidden-layer Neural Networks with Landscape Design
We consider the problem of learning a one-hidden-layer neural network: w...

research · 07/07/2022
Learning and Generalization of One-hidden-layer Neural Networks, Going Beyond Standard Gaussian Data
This paper analyzes the convergence and generalization of training a one...

research · 03/29/2019
A Proof of Convergence of Multi-class Logistic Regression Network
This paper revisits the special type of a neural network known under two...

research · 02/18/2018
Neural Networks with Finite Intrinsic Dimension Have No Spurious Valleys
Neural networks provide a rich class of high-dimensional, non-convex opt...

research · 07/02/2020
Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach
Structural equation models (SEMs) are widely used in sciences, ranging f...
