Exploratory Gradient Boosting for Reinforcement Learning in Complex Domains

03/14/2016
by David Abel, et al.

High-dimensional observations and complex real-world dynamics present major challenges in reinforcement learning for both function approximation and exploration. We address both challenges with two complementary techniques. First, we develop a gradient-boosting-style, non-parametric function approximator that learns on Q-function residuals. Second, we propose an exploration strategy inspired by the principles of state abstraction and information acquisition under uncertainty. We demonstrate the empirical effectiveness of these techniques first, as a preliminary check, on two standard tasks (Blackjack and n-Chain), and then on two much larger and more realistic tasks with high-dimensional observation spaces. Specifically, we introduce two benchmarks built within the game Minecraft in which the observations are pixel arrays of the agent's visual field. A combination of our two algorithmic techniques performs competitively on the standard reinforcement-learning tasks while consistently and substantially outperforming baselines on the two tasks with high-dimensional observation spaces. The new function approximator, exploration strategy, and evaluation benchmarks are each of independent interest in the pursuit of reinforcement-learning methods that scale to real-world domains.
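To make the idea of boosting on Q-function residuals concrete, the following is a minimal sketch of the general technique, not the authors' algorithm. It assumes a simplified, deterministic n-Chain variant (the reward values, the `step` function, the stump learner, and the names `fit_stump` and `boost` are all illustrative assumptions). Each round computes one-step Bellman targets from the current ensemble, fits a regression stump to the residuals, and adds that stump to the ensemble with a shrinkage factor.

```python
GAMMA = 0.9
N_STATES = 5
ACTIONS = (0, 1)  # 0 = move forward, 1 = jump back to the start


def step(s, a):
    """Assumed deterministic n-Chain variant: returns (reward, next_state)."""
    if a == 1:
        return 2.0, 0
    if s == N_STATES - 1:
        return 10.0, s
    return 0.0, s + 1


def fit_stump(samples):
    """Fit one least-squares stump per action on the state index.

    samples is a list of (state, action, residual); returns h(state, action).
    """
    table = {}
    for a in ACTIONS:
        pts = [(s, res) for s, act, res in samples if act == a]
        best = None
        for thr in range(N_STATES):  # split: s <= thr versus s > thr
            left = [res for s, res in pts if s <= thr]
            right = [res for s, res in pts if s > thr]
            lm = sum(left) / len(left) if left else 0.0
            rm = sum(right) / len(right) if right else 0.0
            sse = (sum((res - lm) ** 2 for res in left)
                   + sum((res - rm) ** 2 for res in right))
            if best is None or sse < best[0]:
                best = (sse, thr, lm, rm)
        table[a] = best[1:]

    def h(s, a):
        thr, lm, rm = table[a]
        return lm if s <= thr else rm

    return h


def boost(rounds=50, lr=0.5):
    """Additive Q model: Q(s, a) = lr * sum of weak learners h_i(s, a)."""
    ensemble, history = [], []
    transitions = [(s, a) + step(s, a)
                   for s in range(N_STATES) for a in ACTIONS]

    def q(s, a):
        return lr * sum(h(s, a) for h in ensemble)

    for _round in range(rounds):
        # One-step Bellman targets under the current ensemble.
        targets = [r + GAMMA * max(q(s2, b) for b in ACTIONS)
                   for _s, _a, r, s2 in transitions]
        residuals = [(s, a, y - q(s, a))
                     for (s, a, _r, _s2), y in zip(transitions, targets)]
        before = sum(res ** 2 for _s, _a, res in residuals)
        ensemble.append(fit_stump(residuals))  # boost on the residuals
        after = sum((y - q(s, a)) ** 2
                    for (s, a, _r, _s2), y in zip(transitions, targets))
        history.append((before, after))
    return ensemble, history, q


if __name__ == "__main__":
    ensemble, history, q = boost()
    print("learners:", len(ensemble))
    print("squared residual in round 1 (before, after):", history[0])
```

Within a round, adding the fitted stump cannot increase the squared error against that round's fixed targets, which is the usual gradient-boosting guarantee for squared loss. The targets themselves move between rounds, so this sketch makes no claim about convergence to the optimal Q-function.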

research
05/31/2022

k-Means Maximum Entropy Exploration

Exploration in high-dimensional, continuous spaces with sparse rewards i...
research
02/04/2014

Safe Exploration of State and Action Spaces in Reinforcement Learning

In this paper, we consider the important problem of safe exploration in ...
research
07/03/2015

Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models

Achieving efficient and scalable exploration in complex domains poses a ...
research
06/17/2021

Adapting the Function Approximation Architecture in Online Reinforcement Learning

The performance of a reinforcement learning (RL) system depends on the c...
research
04/22/2022

Deep Reinforcement Learning Using a Low-Dimensional Observation Filter for Visual Complex Video Game Playing

Deep Reinforcement Learning (DRL) has produced great achievements since ...
research
07/30/2019

Wasserstein Robust Reinforcement Learning

Reinforcement learning algorithms, though successful, tend to over-fit t...
research
04/30/2023

Learning Achievement Structure for Structured Exploration in Domains with Sparse Reward

We propose Structured Exploration with Achievements (SEA), a multi-stage...
