Evolutionary Stochastic Gradient Descent for Optimization of Deep Neural Networks

10/16/2018
by Xiaodong Cui, et al.

We propose a population-based Evolutionary Stochastic Gradient Descent (ESGD) framework for optimizing deep neural networks. ESGD combines SGD with gradient-free evolutionary algorithms as complementary methods in a single framework, alternating between an SGD step and an evolution step to improve the average fitness of the population. A back-off strategy in the SGD step and an elitist strategy in the evolution step guarantee that the best fitness in the population never degrades. In addition, individuals optimized in the SGD step with various SGD-based optimizers and distinct hyper-parameters are treated as competing species in a coevolution setting, so that the complementarity of the optimizers is also exploited. The effectiveness of ESGD is demonstrated across multiple applications, including speech recognition, image recognition, and language modeling, using networks with a variety of deep architectures.
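The alternation the abstract describes can be illustrated with a minimal sketch. The example below is a hypothetical toy implementation, not the authors' code: it replaces a neural network with a simple quadratic loss, gives each individual its own learning rate (standing in for the "competing species" of optimizers), applies the back-off strategy in the SGD step, and uses elitism in the evolution step so the best fitness in the population never degrades.

```python
import random

def fitness(w):
    # Toy loss standing in for validation error: a quadratic bowl.
    return sum(x * x for x in w)

def sgd_step(w, lr):
    # The gradient of the quadratic is 2*w; one plain SGD update.
    return [x - lr * 2 * x for x in w]

def esgd(pop_size=8, dim=2, sgd_steps=5, generations=10, seed=0):
    rng = random.Random(seed)
    # Each individual pairs parameters with its own hyper-parameter
    # (here just a learning rate), so individuals act as competing species.
    population = [([rng.uniform(-1, 1) for _ in range(dim)],
                   10 ** rng.uniform(-3, -1))
                  for _ in range(pop_size)]
    for _ in range(generations):
        # SGD step with back-off: keep the updated weights only if the
        # fitness did not degrade; otherwise revert to the old weights.
        new_pop = []
        for w, lr in population:
            candidate = w
            for _ in range(sgd_steps):
                candidate = sgd_step(candidate, lr)
            if fitness(candidate) <= fitness(w):
                new_pop.append((candidate, lr))
            else:
                new_pop.append((w, lr))  # back off
        # Evolution step with elitism: keep the best half unchanged and
        # refill the population with perturbed (gradient-free mutated)
        # copies of elites, each drawing a fresh learning rate.
        new_pop.sort(key=lambda ind: fitness(ind[0]))
        elites = new_pop[: pop_size // 2]
        offspring = []
        for _ in range(pop_size - len(elites)):
            w, _lr = rng.choice(elites)
            offspring.append(([x + rng.gauss(0, 0.1) for x in w],
                              10 ** rng.uniform(-3, -1)))
        population = elites + offspring
    return min(fitness(w) for w, _ in population)
```

Because elites survive each generation unmutated and the SGD step backs off any degrading update, the best fitness in the population is monotonically non-increasing, mirroring the guarantee stated in the abstract.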

Related research

- Acoustic Model Optimization Based On Evolutionary Stochastic Gradient Descent with Anchors for Automatic Speech Recognition (07/10/2019)
- Optimizing Deep Neural Networks through Neuroevolution with Stochastic Gradient Descent (12/21/2020)
- On the Relationship Between the OpenAI Evolution Strategy and Stochastic Gradient Descent (12/18/2017)
- Deep Gradient Boosting (07/29/2019)
- Limited Evaluation Evolutionary Optimization of Large Neural Networks (06/26/2018)
- Evolving Differentiable Gene Regulatory Networks (07/16/2018)
- Evolutionary Algorithms in the Light of SGD: Limit Equivalence, Minima Flatness, and Transfer Learning (05/20/2023)
