Mollifying Networks

08/17/2016
by Caglar Gulcehre, et al.

The optimization of deep neural networks can be more challenging than traditional convex optimization problems due to the highly non-convex nature of the loss function, which can involve pathological landscapes such as saddle surfaces that are difficult to escape for algorithms based on simple gradient descent. In this paper, we attack the optimization of highly non-convex neural networks by starting from a smoothed -- or mollified -- objective function whose energy landscape gradually becomes more non-convex over the course of training. Our approach is inspired by recent studies of continuation methods: similar to curriculum methods, we begin with an easier (possibly convex) objective function and let it evolve during training until it eventually returns to the original, difficult-to-optimize objective function. The complexity of the mollified networks is controlled by a single hyperparameter that is annealed during training. We show improvements on various difficult optimization tasks and establish a relationship with recent work on continuation methods for neural networks and mollifiers.
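
To make the continuation idea concrete, here is a minimal sketch assuming a Gaussian mollifier: the smoothed objective L_sigma(theta) = E_{eps ~ N(0, sigma^2 I)}[L(theta + eps)] is estimated with a single noise sample per step, and the smoothing scale sigma is annealed to zero so that training ends on the original, un-smoothed objective. The toy model, data, and linear annealing schedule below are illustrative assumptions, not the paper's exact construction.

```python
# A minimal sketch of a continuation method via a Gaussian-mollified loss,
# NOT the paper's exact method. We optimize
#   L_sigma(theta) = E_{eps ~ N(0, sigma^2 I)} [ L(theta + eps) ]
# with a one-sample Monte Carlo estimate and anneal sigma -> 0, so training
# starts on a smoothed landscape and ends on the original objective.
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(10, 32), nn.Tanh(), nn.Linear(32, 1))
opt = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

x = torch.randn(256, 10)            # toy regression data (assumption)
y = x.sum(dim=1, keepdim=True)

n_steps = 500
sigma0 = 0.5                        # initial smoothing scale (assumption)
for step in range(n_steps):
    sigma = sigma0 * (1.0 - step / n_steps)   # linear annealing to 0

    # Perturb the weights to get a one-sample estimate of the mollified loss.
    noises = []
    with torch.no_grad():
        for p in model.parameters():
            eps = sigma * torch.randn_like(p)
            p.add_(eps)
            noises.append(eps)

    opt.zero_grad()
    loss = loss_fn(model(x), y)     # loss evaluated at perturbed weights
    loss.backward()

    # Remove the perturbation before stepping, so the gradient of the
    # smoothed objective is applied to the un-perturbed parameters.
    with torch.no_grad():
        for p, eps in zip(model.parameters(), noises):
            p.sub_(eps)

    opt.step()
    if step % 100 == 0:
        print(f"step {step:4d}  sigma {sigma:.3f}  loss {loss.item():.4f}")
```

As sigma shrinks, the perturbation vanishes and the procedure reduces to plain gradient descent on the original loss, mirroring how the single annealed hyperparameter in the paper controls how far the mollified network is from the original one.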

Related research

- 10/18/2020: Accelerated Algorithms for Convex and Non-Convex Optimization on Manifolds
  "We propose a general scheme for solving convex and non-convex optimizati..."
- 06/25/2021: Proxy Convexity: A Unified Framework for the Analysis of Neural Networks Trained by Gradient Descent
  "Although the optimization objectives for learning neural networks are hi..."
- 11/13/2015: On the Quality of the Initial Basin in Overspecified Neural Networks
  "Deep learning, in the form of artificial neural networks, has achieved r..."
- 10/24/2021: Non-convex Distributionally Robust Optimization: Non-asymptotic Analysis
  "Distributionally robust optimization (DRO) is a widely-used approach to ..."
- 02/03/2020: A Relaxation Argument for Optimization in Neural Networks and Non-Convex Compressed Sensing
  "It has been observed in practical applications and in theoretical analys..."
- 06/15/2017: Stochastic Training of Neural Networks via Successive Convex Approximations
  "This paper proposes a new family of algorithms for training neural netwo..."
- 02/14/2023: Multilevel Objective-Function-Free Optimization with an Application to Neural Networks Training
  "A class of multi-level algorithms for unconstrained nonlinear optimizati..."
