Langevin Dynamics with Continuous Tempering for Training Deep Neural Networks

03/13/2017
by Nanyang Ye et al.

Minimizing non-convex, high-dimensional objective functions is challenging, especially when training modern deep neural networks. In this paper, a novel approach is proposed that divides the training process into two consecutive phases to obtain better generalization performance: Bayesian sampling and stochastic optimization. The first phase explores the energy landscape and captures the "fat" modes; the second fine-tunes the parameters learned in the first phase. In the Bayesian learning phase, we incorporate continuous tempering and stochastic approximation into Langevin dynamics to create an efficient and effective sampler, in which the temperature is adjusted automatically according to the designed "temperature dynamics". These strategies overcome the challenge of becoming trapped early in poor local minima and achieve remarkable improvements across various types of neural networks, as demonstrated by our theoretical analysis and empirical experiments.
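As a rough illustration of the two-phase scheme described above, the following is a minimal NumPy sketch, not the paper's algorithm: the temperature here follows a hand-picked cosine annealing schedule standing in for the adaptive "temperature dynamics", and all names and hyperparameters (grad_U, eta, T_high, T_low, switch) are illustrative assumptions.

import numpy as np

def two_phase_langevin_sketch(grad_U, theta0, n_steps=10_000, eta=1e-4,
                              T_high=10.0, T_low=1e-3, switch=0.7, seed=0):
    """Sketch of two-phase training with tempered Langevin dynamics.

    Phase 1 (Bayesian sampling): stochastic-gradient Langevin updates whose
    temperature is annealed from T_high toward T_low, so the sampler can
    escape poor local minima early in training.
    Phase 2 (stochastic optimization): temperature is set to zero, reducing
    the update to plain stochastic gradient descent for fine-tuning.

    grad_U returns a stochastic gradient of the energy (negative
    log-posterior); the cosine schedule below is an illustrative stand-in
    for the paper's adaptive temperature dynamics.
    """
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    n_sample = int(switch * n_steps)  # steps spent in the sampling phase
    for t in range(n_steps):
        if t < n_sample:
            # Phase 1: temperature decays smoothly from T_high to T_low.
            frac = t / n_sample
            T = T_low + 0.5 * (T_high - T_low) * (1 + np.cos(np.pi * frac))
            noise = rng.standard_normal(theta.shape)
            # Langevin update: gradient step plus temperature-scaled noise.
            theta = theta - eta * grad_U(theta) + np.sqrt(2 * eta * T) * noise
        else:
            # Phase 2: zero-temperature dynamics, i.e. stochastic gradient descent.
            theta = theta - eta * grad_U(theta)
    return theta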


