Learning to Play in a Day: Faster Deep Reinforcement Learning by Optimality Tightening

11/05/2016
by Frank S. He, et al.

We propose a novel training algorithm for reinforcement learning which combines the strength of deep Q-learning with a constrained optimization approach to tighten optimality and encourage faster reward propagation. Our novel technique makes deep reinforcement learning more practical by drastically reducing the training time. We evaluate the performance of our approach on the 49 games of the challenging Arcade Learning Environment, and report significant improvements in both training time and accuracy.
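
The abstract's "constrained optimization approach" can be made concrete with a small sketch. In the paper's formulation, the k-step returns observed before and after a sampled transition induce upper and lower bounds on Q(s_j, a_j), and violations of those bounds are penalized quadratically alongside the ordinary Bellman loss. Below is a minimal, hypothetical PyTorch sketch of that idea; the names (`q_net`, `target_net`, `traj`, `lam`, `K`) and the batch-free layout are illustrative assumptions, not the authors' released code.

```python
# Hedged sketch of DQN with optimality tightening (He et al., 2016).
# All names (q_net, target_net, traj, lam, K) are illustrative
# assumptions, not the authors' implementation.
import torch
import torch.nn.functional as F

def ot_loss(q_net, target_net, traj, j, K=4, gamma=0.99, lam=4.0):
    """Penalized Bellman loss for the transition at index j of `traj`.

    `traj` is a list of (state, action, reward) tuples with states as
    torch tensors; bounds use up to K steps before and after j.
    """
    s_j, a_j, r_j = traj[j]
    q_j = q_net(s_j)[a_j]  # Q(s_j, a_j)

    with torch.no_grad():
        # Standard one-step target: y = r_j + gamma * max_a Q'(s_{j+1}, a).
        y = r_j + gamma * target_net(traj[j + 1][0]).max()

        # Lower bounds from future rewards:
        # L_{j,k} = sum_{i<k} gamma^i r_{j+i} + gamma^k max_a Q'(s_{j+k}, a)
        lower, ret = [], 0.0
        for k in range(1, K + 1):
            if j + k >= len(traj):
                break
            ret += gamma ** (k - 1) * traj[j + k - 1][2]
            lower.append(ret + gamma ** k * target_net(traj[j + k][0]).max())
        L = max(lower) if lower else y  # tightest available lower bound

        # Upper bounds from past transitions:
        # U_{j,k} = gamma^{-k} (Q'(s_{j-k}, a_{j-k}) - sum_{i<k} gamma^i r_{j-k+i})
        upper = []
        for k in range(1, K + 1):
            if j - k < 0:
                break
            s_p, a_p, _ = traj[j - k]
            past_ret = sum(gamma ** i * traj[j - k + i][2] for i in range(k))
            upper.append(gamma ** (-k) * (target_net(s_p)[a_p] - past_ret))

    # Bellman loss plus quadratic penalties where Q violates a bound;
    # the penalties propagate reward information many steps per update.
    loss = (q_j - y) ** 2
    loss = loss + lam * F.relu(L - q_j) ** 2          # Q below lower bound
    if upper:
        loss = loss + lam * F.relu(q_j - min(upper)) ** 2  # Q above upper bound
    return loss
```

Tightening Q between these bounds is what speeds up reward propagation relative to plain one-step Q-learning, which moves information back only a single transition per update.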

Related research

03/01/2019  TrojDRL: Trojan Attacks on Deep Reinforcement Learning Agents
Recent work has identified that classification models implemented as neu...

03/20/2022  MicroRacer: a didactic environment for Deep Reinforcement Learning
MicroRacer is a simple, open source environment inspired by car racing e...

09/06/2023  On Reducing Undesirable Behavior in Deep Reinforcement Learning Models
Deep reinforcement learning (DRL) has proven extremely useful in a large...

07/23/2023  Shorter and faster than Sort3AlphaDev
Arising from: Mankowitz, D.J., Michi, A., Zhernov, A. et al. Faster sort...

04/03/2009  Time Hopping technique for faster reinforcement learning in simulations
This preprint has been withdrawn by the author for revision...

08/06/2017  An Information-Theoretic Optimality Principle for Deep Reinforcement Learning
In this paper, we methodologically address the problem of cumulative rew...

11/01/2021  Human-Level Control without Server-Grade Hardware
Deep Q-Network (DQN) marked a major milestone for reinforcement learning...
