Implementing the Deep Q-Network

11/20/2017
by Melrose Roderick, et al.

The Deep Q-Network proposed by Mnih et al. [2015] has become a benchmark and starting point for much deep reinforcement learning research. However, replicating results for complex systems is often challenging, since original scientific publications are not always able to describe in detail every important parameter setting and software engineering solution. In this paper, we present results from our work reproducing the DQN paper. We highlight key areas of the implementation that the original paper did not cover in detail, including termination conditions and gradient descent algorithms, to make it easier for researchers to replicate these results. Finally, we discuss methods for improving computational performance and provide our own implementation, designed to work with a range of domains rather than just the original Arcade Learning Environment [Bellemare et al., 2013].
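To make the two implementation details named above concrete, here is a minimal NumPy sketch of where termination conditions enter DQN training. It is not taken from the paper, and the function and variable names are illustrative: at episode boundaries the bootstrap term is masked out, so the target reduces to the reward alone.

```python
import numpy as np

def dqn_targets(rewards, next_q_values, terminals, gamma=0.99):
    """One-step Q-learning targets for a batch of transitions.

    `terminals` (1.0 at episode ends, else 0.0) masks the bootstrap term:
    y = r at a terminal step, y = r + gamma * max_a Q_target(s', a) otherwise.
    """
    bootstrap = next_q_values.max(axis=1)  # max_a Q_target(s', a)
    return rewards + gamma * (1.0 - terminals) * bootstrap

# Toy batch of 3 transitions; the last one ends an episode.
rewards = np.array([0.0, 1.0, -1.0])
next_q = np.array([[0.5, 0.2], [0.1, 0.3], [0.9, 0.4]])
terminals = np.array([0.0, 0.0, 1.0])
print(dqn_targets(rewards, next_q, terminals))  # [ 0.495  1.297 -1.   ]
```

The gradient descent detail commonly cited for DQN is a centered RMSProp variant, which normalizes each gradient by an estimate of its standard deviation rather than its raw second moment. The sketch below follows that general form; the decay, learning rate, and epsilon constants are assumptions, not guaranteed to match the original code.

```python
import numpy as np

def centered_rmsprop_step(w, grad, mean_g, mean_sq, lr=0.00025,
                          decay=0.95, eps=0.01):
    """One parameter update of a centered RMSProp variant.

    Maintains running means of the gradient and its square, and divides
    the gradient by the resulting variance estimate (plus eps) before
    applying the learning rate. Hyperparameter values are illustrative.
    """
    mean_g = decay * mean_g + (1.0 - decay) * grad
    mean_sq = decay * mean_sq + (1.0 - decay) * grad ** 2
    w = w - lr * grad / np.sqrt(mean_sq - mean_g ** 2 + eps)
    return w, mean_g, mean_sq
```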

