Adaptively Solving the Local-Minimum Problem for Deep Neural Networks

12/25/2020
by Huachuan Wang, et al.

This paper aims to overcome a fundamental problem in the theory and application of deep neural networks (DNNs). We propose a method that directly addresses the local-minimum problem in training DNNs. The method convexifies the cross-entropy loss by transforming it into a risk-averting error (RAE) criterion; to alleviate the numerical difficulties of the RAE, a normalized RAE (NRAE) is employed. The convexity region of the RAE expands as its risk-sensitivity index (RSI) increases. To make the best use of this convexity region, training starts on the NRAE with a large RSI, gradually reduces the RSI, and switches to the RAE as soon as the RAE becomes numerically feasible. After training converges, the resulting deep learning machine is expected to lie inside the attraction basin of a global minimum of the cross-entropy loss. Numerical results are provided to show the effectiveness of the proposed method.
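As a concrete illustration, the following is a minimal sketch (not the authors' code) of the two criteria and the adaptive RSI schedule described above. The exponential form of the RAE, the log-sum-exp form of the NRAE, the helper names rae_loss and nrae_loss, and all schedule constants (initial RSI, decay factor, switch threshold) are assumptions made for this sketch, not details taken from the paper.

import math
import torch
import torch.nn.functional as F

def rae_loss(logits, targets, rsi):
    # Risk-averting error (assumed form): mean_k exp(rsi * l_k) over
    # per-sample cross-entropy losses l_k. Only feasible while
    # rsi * l_k stays well below the overflow point of exp().
    per_sample = F.cross_entropy(logits, targets, reduction="none")
    return torch.exp(rsi * per_sample).mean()

def nrae_loss(logits, targets, rsi):
    # Normalized RAE (assumed form): (1/rsi) * log(mean_k exp(rsi * l_k)),
    # computed with log-sum-exp so very large RSI values remain stable.
    per_sample = F.cross_entropy(logits, targets, reduction="none")
    return (torch.logsumexp(rsi * per_sample, dim=0)
            - math.log(per_sample.numel())) / rsi

# Toy schedule: start on the NRAE with a large RSI, shrink the RSI each
# epoch, and switch to the RAE once it is numerically safe.
torch.manual_seed(0)
x = torch.randn(256, 20)
y = (x[:, 0] > 0).long()
model = torch.nn.Sequential(torch.nn.Linear(20, 32), torch.nn.ReLU(),
                            torch.nn.Linear(32, 2))
opt = torch.optim.SGD(model.parameters(), lr=0.1)

rsi, decay, use_rae = 1000.0, 0.9, False
for epoch in range(50):
    loss = (rae_loss if use_rae else nrae_loss)(model(x), y, rsi)
    opt.zero_grad()
    loss.backward()
    opt.step()
    if not use_rae:
        with torch.no_grad():
            worst = F.cross_entropy(model(x), y, reduction="none").max().item()
        if rsi * worst < 30.0:  # exp() stays well inside float32 range
            use_rae = True
        else:
            rsi *= decay

The 30.0 threshold simply keeps rsi * l_k far from the float32 overflow point of exp() (about 88); it stands in for whatever numerical-feasibility test the paper actually uses.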


Related research

12/15/2020  Neural Collapse with Cross-Entropy Loss
We consider the variational problem of cross-entropy loss with n feature...

06/08/2015  Adaptive Normalized Risk-Averting Training For Deep Neural Networks
This paper proposes a set of new error criteria and learning approaches,...

12/13/2020  Demystifying Deep Neural Networks Through Interpretation: A Survey
Modern deep learning algorithms tend to optimize an objective metric, su...

09/27/2018  On the loss landscape of a class of deep neural networks with no bad local valleys
We identify a class of over-parameterized deep neural networks with stan...

07/25/2018  A Surprising Linear Relationship Predicts Test Performance in Deep Networks
Given two networks with the same training loss on a dataset, when would ...

11/23/2016  Tunable Sensitivity to Large Errors in Neural Network Training
When humans learn a new concept, they might ignore examples that they ca...

11/22/2016  Relaxed Earth Mover's Distances for Chain- and Tree-connected Spaces and their use as a Loss Function in Deep Learning
The Earth Mover's Distance (EMD) computes the optimal cost of transformi...
