Deep Abstract Q-Networks

10/02/2017
by Melrose Roderick, et al.

We examine the problem of learning and planning in high-dimensional domains with long horizons and sparse rewards. Recent approaches have achieved great success in many Atari 2600 domains. However, domains with long horizons and sparse rewards, such as Montezuma's Revenge and Venture, remain challenging for existing methods. Methods using abstraction (Dietterich 2000; Sutton, Precup, and Singh 1999) have been shown to be useful in tackling long-horizon problems. We combine recent techniques of deep reinforcement learning with existing model-based approaches using an expert-provided state abstraction. We construct toy domains that elucidate the problem of long horizons, sparse rewards, and high-dimensional inputs, and show that our algorithm significantly outperforms previous methods on these domains. Our abstraction-based approach outperforms Deep Q-Networks (Mnih et al. 2015) on Montezuma's Revenge and Venture, and exhibits backtracking behavior that is absent from previous methods.
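To make the idea of planning over an expert-provided state abstraction more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical abstraction that keeps only a room index and the set of keys held, records the abstract transitions an agent has observed, and plans over them with breadth-first search. The names `abstract_state`, `AbstractGraph`, `record`, and `plan`, as well as the state fields `room` and `keys`, are illustrative assumptions; the low-level deep reinforcement learning component that would carry out each abstract transition is omitted.

```python
from collections import defaultdict, deque


def abstract_state(state):
    """Hypothetical expert-provided abstraction (an assumption, not the paper's).

    Maps a full state (a dict here) to a small hashable summary:
    the current room and the set of keys held.
    """
    return (state["room"], frozenset(state["keys"]))


class AbstractGraph:
    """Graph of abstract transitions observed so far."""

    def __init__(self):
        self.edges = defaultdict(set)

    def record(self, state, next_state):
        """Store an edge whenever a step changes the abstract state."""
        a, b = abstract_state(state), abstract_state(next_state)
        if a != b:
            self.edges[a].add(b)

    def plan(self, start, goal):
        """Breadth-first search over abstract states.

        Returns the shortest known list of abstract states from start
        to goal, or None if the goal is not reachable in the graph.
        """
        parent = {start: None}
        frontier = deque([start])
        while frontier:
            node = frontier.popleft()
            if node == goal:
                path = []
                while node is not None:
                    path.append(node)
                    node = parent[node]
                return path[::-1]
            for nxt in self.edges.get(node, ()):
                if nxt not in parent:
                    parent[nxt] = node
                    frontier.append(nxt)
        return None
```

As a usage illustration under the same assumptions: after recording the steps "room 0 to room 1", "pick up a key in room 1", and "return to room 0 with the key", calling `plan` from `(0, frozenset())` to `(0, frozenset({"red"}))` yields a four-state path that leaves the starting room and comes back. This is the kind of backtracking route that is easy to express at the abstract level but hard to discover from sparse rewards alone.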


