Semi-flat minima and saddle points by embedding neural networks to overparameterization

06/12/2019
by Kenji Fukumizu, et al.

We theoretically study the landscape of the training error for neural networks in the overparameterized case. We consider three basic methods for embedding a network into a wider one with more hidden units, and discuss whether a minimum of the narrower network gives a minimum or a saddle point of the wider one. Our results show that networks with smooth activation and networks with ReLU activation have differently shaped, partially flat landscapes around the embedded point. We also relate these results to differences in their generalization ability under overparameterized realization.
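To make the embedding idea concrete, here is a minimal sketch, not taken from the paper, of one standard function-preserving embedding (unit splitting) for a one-hidden-layer network: a hidden unit is duplicated and its output weight is split between the original and the copy, so the wider network computes exactly the same function while gaining parameters. The function names, the tanh activation, and the alpha split are illustrative assumptions; the paper's three specific embeddings are not reproduced here.

```python
import numpy as np

def forward(x, W1, b1, w2, b2, act=np.tanh):
    """One-hidden-layer network: y = w2 . act(W1 x + b1) + b2."""
    return w2 @ act(W1 @ x + b1) + b2

def split_unit(W1, b1, w2, j, alpha=0.5):
    """Embed the network into one with an extra hidden unit by splitting
    unit j: duplicate its incoming weights and bias, and split its output
    weight into alpha and (1 - alpha) parts.  The input-output map of the
    network is unchanged."""
    W1_new = np.vstack([W1, W1[j:j + 1]])          # copy of unit j's row
    b1_new = np.append(b1, b1[j])
    w2_new = np.append(w2, (1 - alpha) * w2[j])    # new unit's output weight
    w2_new[j] = alpha * w2[j]                      # original unit keeps the rest
    return W1_new, b1_new, w2_new

# Quick check that the embedding preserves the network function.
rng = np.random.default_rng(0)
W1 = rng.standard_normal((3, 4)); b1 = rng.standard_normal(3)
w2 = rng.standard_normal(3); b2 = 0.1
x = rng.standard_normal(4)
W1e, b1e, w2e = split_unit(W1, b1, w2, j=1, alpha=0.3)
assert np.allclose(forward(x, W1, b1, w2, b2), forward(x, W1e, b1e, w2e, b2))
```

Because an entire interval of alpha values reproduces the same minimum of the narrower network, embeddings of this kind give rise to the flat directions in the wider network's loss landscape that the abstract refers to.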

Related research

Weight Initialization without Local Minima in Deep Nonlinear Neural Networks (06/13/2018)
In this paper, we propose a new weight initialization method called even...

Essentially No Barriers in Neural Network Energy Landscape (03/02/2018)
Training neural networks involves finding minima of a high-dimensional n...

Hidden Unit Specialization in Layered Neural Networks: ReLU vs. Sigmoidal Activation (10/16/2019)
We study layered neural networks of rectified linear units (ReLU) in a m...

Embedding Principle: a hierarchical structure of loss landscape of deep neural networks (11/30/2021)
We prove a general Embedding Principle of loss landscape of deep neural ...

The Split Matters: Flat Minima Methods for Improving the Performance of GNNs (06/15/2023)
When training a Neural Network, it is optimized using the available trai...

Weight-space symmetry in deep networks gives rise to permutation saddles, connected by equal-loss valleys across the loss landscape (07/05/2019)
The permutation symmetry of neurons in each layer of a deep neural netwo...

Point Set Self-Embedding (02/28/2022)
This work presents an innovative method for point set self-embedding, th...
