Learning and generalization of one-hidden-layer neural networks, going beyond standard Gaussian data

07/07/2022
by   Hongkang Li, et al.

This paper analyzes the convergence and generalization of training a one-hidden-layer neural network when the input features follow a Gaussian mixture model consisting of a finite number of Gaussian distributions. Assuming the labels are generated by a teacher model with unknown ground-truth weights, the learning problem is to estimate the underlying teacher model by minimizing a non-convex risk function over a student neural network. Given a finite number of training samples, referred to as the sample complexity, the iterations are proven to converge linearly to a critical point with a guaranteed generalization error. In addition, for the first time, this paper characterizes the impact of the input distributions on the sample complexity and the learning rate.
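To make the setup concrete, below is a minimal sketch of the teacher-student problem described in the abstract: inputs drawn from a two-component Gaussian mixture, labels generated by a fixed ReLU teacher network, and a student of the same architecture trained by gradient descent on the empirical (non-convex) squared risk. The dimensions, mixture parameters, learning rate, and iteration count are illustrative assumptions, not values or algorithms taken from the paper.

```python
# Hypothetical sketch of the teacher-student setup, assuming a two-component
# Gaussian mixture for the inputs and a ReLU one-hidden-layer teacher.
# All constants below are illustrative choices, not values from the paper.
import numpy as np

rng = np.random.default_rng(0)
d, k, n = 10, 5, 2000          # input dim, hidden width, number of samples n

# Gaussian mixture input: two components with different means/covariances.
means = [np.zeros(d), 2.0 * np.ones(d)]
covs = [np.eye(d), 0.5 * np.eye(d)]
comp = rng.integers(0, 2, size=n)
X = np.stack([rng.multivariate_normal(means[c], covs[c]) for c in comp])

# Teacher model with unknown ground-truth weights W_star; labels are the
# average ReLU output of the hidden units.
W_star = rng.standard_normal((k, d))
relu = lambda z: np.maximum(z, 0.0)
y = relu(X @ W_star.T).mean(axis=1)

# Student network of the same architecture, trained by plain gradient
# descent on the empirical squared risk 0.5 * mean((f(x) - y)^2).
W = 0.1 * rng.standard_normal((k, d))
lr = 0.05
for _ in range(500):
    H = X @ W.T                       # pre-activations, shape (n, k)
    resid = relu(H).mean(axis=1) - y  # prediction error per sample
    # Gradient of the risk w.r.t. W; (H > 0) is the ReLU derivative.
    grad = ((resid[:, None] * (H > 0)).T @ X) / (n * k)
    W -= lr * grad

print("final empirical risk:", 0.5 * np.mean(resid ** 2))
```

With enough samples relative to the problem dimension, the empirical risk in this sketch decreases toward a critical point, mirroring the linear-convergence behavior the paper establishes under its assumptions.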

Related research

Learning Two-layer Neural Networks with Symmetric Inputs (10/16/2018)
We give a new algorithm for learning a two-layer neural network under a ...

Why Lottery Ticket Wins? A Theoretical Perspective of Sample Complexity on Pruned Neural Networks (10/12/2021)
The lottery ticket hypothesis (LTH) states that learning on a properly p...

Local Geometry of One-Hidden-Layer Neural Networks for Logistic Regression (02/18/2018)
We study the local geometry of a one-hidden-layer fully-connected neural...

Fundamental limits of overparametrized shallow neural networks for supervised learning (07/11/2023)
We carry out an information-theoretical analysis of a two-layer neural n...

Learning One-hidden-layer Neural Networks with Landscape Design (11/01/2017)
We consider the problem of learning a one-hidden-layer neural network: w...

Fast Learning of Graph Neural Networks with Guaranteed Generalizability: One-hidden-layer Case (06/25/2020)
Although graph neural networks (GNNs) have made great progress recently ...

Luck Matters: Understanding Training Dynamics of Deep ReLU Networks (05/31/2019)
We analyze the dynamics of training deep ReLU networks and their implica...
