Towards Understanding the Importance of Noise in Training Neural Networks

09/07/2019
by Mo Zhou, et al.

Numerous empirical studies have corroborated that noise plays a crucial role in the effective and efficient training of neural networks. The theory behind this phenomenon, however, remains largely unknown. This paper studies the fundamental problem by training a simple two-layer convolutional neural network model. Although training such a network requires solving a nonconvex optimization problem with a spurious local optimum and a global optimum, we prove that perturbed gradient descent and perturbed mini-batch stochastic gradient algorithms, in conjunction with noise annealing, are guaranteed to converge to a global optimum in polynomial time from arbitrary initialization. This implies that the noise enables the algorithms to efficiently escape the spurious local optimum. Numerical experiments are provided to support our theory.
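The abstract does not spell out the update rule, but the mechanism it describes is gradient descent perturbed by injected noise whose magnitude is annealed over time. Below is a minimal sketch of that idea on a hypothetical one-dimensional nonconvex objective (not the paper's two-layer CNN model); the function perturbed_gd and all hyperparameters are illustrative assumptions, not the authors' algorithm.

```python
import numpy as np

def perturbed_gd(grad, x0, eta=0.01, sigma0=20.0, decay=0.999, steps=5000, seed=0):
    """Gradient descent with annealed Gaussian perturbations (illustrative sketch).

    Each step follows the negative gradient plus isotropic Gaussian noise;
    the noise standard deviation shrinks geometrically, a simple form of
    noise annealing.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    sigma = sigma0
    for _ in range(steps):
        noise = sigma * rng.standard_normal(x.shape)
        x = x - eta * (grad(x) + noise)
        sigma *= decay  # anneal the noise level toward zero
    return x

# Toy asymmetric double well: f(x) = x^4 - 4x^2 + x has a spurious local
# minimum near x ~ 1.37 and a global minimum near x ~ -1.47.
grad_f = lambda x: 4 * x**3 - 8 * x + 1

# Started inside the spurious basin, the injected noise can carry the
# iterate over the barrier into the global basin before it is annealed away.
print(perturbed_gd(grad_f, x0=np.array([1.3])))
```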


Related research

- 02/24/2021: Noisy Gradient Descent Converges to Flat Minima for Nonconvex Matrix Factorization
- 09/10/2019: Towards Understanding the Importance of Shortcut Connections in Residual Networks
- 10/14/2021: Training Neural Networks for Solving 1-D Optimal Piecewise Linear Approximation
- 05/23/2019: Parsimonious Deep Learning: A Differential Inclusion Approach with Global Convergence
- 02/07/2018: Learning One Convolutional Layer with Overlapping Patches
- 02/27/2017: Dropping Convexity for More Efficient and Scalable Online Multiview Learning
- 01/14/2022: A Kernel-Expanded Stochastic Neural Network
