Towards Understanding Generalization of Deep Learning: Perspective of Loss Landscapes

06/30/2017
by Lei Wu, et al.

It is widely observed that deep learning models generalize well, even when the number of model parameters far exceeds the number of training samples. We systematically investigate the underlying reasons why deep neural networks often generalize well, and reveal the difference between minima (with the same training error) that generalize well and those that don't. We show that it is the characteristics of the loss landscape that explain the good generalization capability. For the loss landscape of deep networks, the volume of the basin of attraction of good minima dominates over that of poor minima, which ensures that optimization methods with random initialization converge to good minima. We theoretically justify our findings by analyzing 2-layer neural networks, and show that low-complexity solutions have a small norm of the Hessian matrix with respect to the model parameters. For deeper networks, extensive numerical evidence supports these arguments.
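The abstract's flatness claim, that low-complexity solutions have a small Hessian norm, can be probed numerically. Below is a minimal sketch (not the authors' code; the toy 2-layer network, synthetic data, and iteration count are illustrative assumptions) that estimates the spectral norm of the training-loss Hessian via power iteration on Hessian-vector products:

```python
# Minimal sketch (assumed setup, not the authors' code): estimate the
# spectral norm of the training-loss Hessian for a toy 2-layer network
# via power iteration on Hessian-vector products.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Toy 2-layer network and synthetic data (illustrative assumptions).
net = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
x, y = torch.randn(256, 10), torch.randn(256, 1)

params = list(net.parameters())
loss = F.mse_loss(net(x), y)

# First backward pass with create_graph=True so we can differentiate again.
grads = torch.autograd.grad(loss, params, create_graph=True)
flat_grad = torch.cat([g.reshape(-1) for g in grads])

def hvp(v):
    """Hessian-vector product H @ v via a second backward pass."""
    hv = torch.autograd.grad(torch.dot(flat_grad, v), params,
                             retain_graph=True)
    return torch.cat([h.reshape(-1) for h in hv])

# Power iteration: v converges to the eigenvector of the Hessian with
# the largest-magnitude eigenvalue.
v = torch.randn(flat_grad.numel())
v /= v.norm()
for _ in range(50):
    v = hvp(v)
    v /= v.norm()

# Rayleigh quotient gives the top eigenvalue, i.e. the spectral norm
# for a positive semi-definite Hessian near a minimum.
print(f"estimated ||H||_2 = {torch.dot(v, hvp(v)).item():.4f}")
```

In the paper's framework, comparing such an estimate across minima reached from different random initializations is one way to check whether flatter minima (small Hessian norm) coincide with better generalization.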


