Quasi-Newton Optimization in Deep Q-Learning for Playing ATARI Games

11/06/2018
by Jacob Rafati, et al.

Reinforcement learning (RL) algorithms allow artificial agents to improve their action selection so as to increase rewarding experiences in their environments. Learning can become intractably slow as the state space of the environment grows, which has motivated the use of deep artificial neural networks to learn state representations. Deep reinforcement learning algorithms require solving a nonlinear, non-convex unconstrained optimization problem. Methods for solving these optimization problems in deep RL are typically restricted to the class of first-order algorithms, such as stochastic gradient descent (SGD). A major drawback of SGD methods is that they can fail to escape saddle points; furthermore, they require exhaustive trial and error to fine-tune many learning parameters. Using second-derivative information can yield improved convergence properties, but computing the Hessian matrix is impractical for large-scale problems. Quasi-Newton methods, like SGD, require only first-order gradient information, yet they can achieve superlinear convergence, which makes them attractive alternatives. The limited-memory BFGS (L-BFGS) approach is one of the most popular quasi-Newton methods that construct positive-definite Hessian approximations. In this paper, we introduce an efficient optimization method, based on the limited-memory BFGS quasi-Newton method with a line-search strategy, as an alternative to SGD methods. Our method bridges the gap between first-order and second-order methods by using only gradient information to construct low-rank Hessian approximations. We provide empirical results on a variety of classic ATARI 2600 games. Our results show robust convergence with favorable generalization characteristics, as well as fast training times and no need for an experience replay mechanism.
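To make the idea concrete, the standard L-BFGS machinery computes a search direction from recent gradient differences alone, without ever forming a Hessian. The sketch below is a generic implementation of the classical two-loop recursion (Nocedal's algorithm), not the authors' exact training code; the curvature pairs `s_list`/`y_list` and the fallback scaling are illustrative assumptions.

```python
import numpy as np

def lbfgs_direction(grad, s_list, y_list):
    """Return the L-BFGS search direction -H_k @ grad via the two-loop
    recursion. s_i = x_{i+1} - x_i and y_i = g_{i+1} - g_i are the m most
    recent curvature pairs (oldest first); only gradients are needed."""
    q = grad.astype(float).copy()
    rhos = [1.0 / (y @ s) for s, y in zip(s_list, y_list)]
    alphas = []
    # First loop: walk from the newest pair back to the oldest.
    for s, y, rho in zip(reversed(s_list), reversed(y_list), reversed(rhos)):
        alpha = rho * (s @ q)
        alphas.append(alpha)
        q -= alpha * y
    # Initial Hessian scaling gamma_k = (s'y)/(y'y); identity if no pairs yet.
    if s_list:
        s, y = s_list[-1], y_list[-1]
        gamma = (s @ y) / (y @ y)
    else:
        gamma = 1.0
    r = gamma * q
    # Second loop: walk from the oldest pair forward to the newest.
    for (s, y, rho), alpha in zip(zip(s_list, y_list, rhos), reversed(alphas)):
        beta = rho * (y @ r)
        r += (alpha - beta) * s
    return -r  # descent direction when all y_i @ s_i > 0
```

A line-search step along this direction (e.g. one satisfying the Wolfe conditions) then replaces the fixed learning rate of SGD, which is the trade the paper advocates: a little extra memory for curvature pairs in exchange for far fewer hand-tuned hyperparameters.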

