Adaptive Discretization for Model-Based Reinforcement Learning

07/01/2020
by Sean R. Sinclair, et al.

We introduce the technique of adaptive discretization to design efficient model-based episodic reinforcement learning algorithms in large (potentially continuous) state-action spaces. Our algorithm is based on optimistic one-step value iteration, extended to maintain an adaptive discretization of the space. From a theoretical perspective, we provide worst-case regret bounds for our algorithm that are competitive with those of state-of-the-art model-based algorithms; moreover, our bounds are obtained via a modular proof technique, which can potentially be extended to incorporate additional structure in the problem. From an implementation standpoint, our algorithm has much lower storage and computational requirements, because it maintains a more efficient partition of the state and action spaces. We illustrate this via experiments on several canonical control problems, which show that our algorithm significantly outperforms fixed discretization empirically, both converging faster and using less memory. Interestingly, we observe empirically that while fixed-discretization model-based algorithms vastly outperform their model-free counterparts, the two achieve comparable performance with adaptive discretization.
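To make the idea of an adaptive partition concrete, here is a minimal Python sketch on the one-dimensional space [0, 1]. It covers only the partitioning step, not the optimistic value iteration or the regret analysis. The names `Cell`, `find_leaf`, `visit`, and `leaves` are hypothetical, and the split rule (halve a cell of width w once it has been visited (1/w)^2 times) is an assumption chosen for illustration, not necessarily the rule used in the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Cell:
    """One interval [lo, hi) of the partition (hypothetical helper type)."""
    lo: float
    hi: float
    visits: int = 0
    children: list = field(default_factory=list)

    @property
    def width(self) -> float:
        return self.hi - self.lo

    def is_leaf(self) -> bool:
        return not self.children


def find_leaf(root: Cell, x: float) -> Cell:
    """Descend the tree to the active (leaf) cell containing x."""
    node = root
    while not node.is_leaf():
        left, right = node.children
        node = left if x < left.hi else right
    return node


def visit(root: Cell, x: float) -> Cell:
    """Record one visit at x and refine the visited cell if needed.

    Assumed split rule: a cell of width w is halved once it has
    accumulated at least (1 / w) ** 2 visits.
    """
    leaf = find_leaf(root, x)
    leaf.visits += 1
    if leaf.visits >= (1.0 / leaf.width) ** 2:
        mid = (leaf.lo + leaf.hi) / 2
        leaf.children = [Cell(leaf.lo, mid), Cell(mid, leaf.hi)]
    return leaf


def leaves(root: Cell) -> list:
    """Return the cells of the current partition, left to right."""
    if root.is_leaf():
        return [root]
    return [leaf for child in root.children for leaf in leaves(child)]


if __name__ == "__main__":
    root = Cell(0.0, 1.0)
    for _ in range(21):
        visit(root, 0.1)  # repeatedly visit the same region
    for cell in leaves(root):
        print(f"[{cell.lo}, {cell.hi})")
```

Under this assumed rule, cells are refined only around frequently visited points, while rarely visited regions stay coarse. A fixed discretization, by contrast, would have to use the finest resolution everywhere.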

10/17/2019

Adaptive Discretization for Episodic Reinforcement Learning in Metric Spaces

We present an efficient algorithm for model-free episodic reinforcement ...
10/29/2021

Adaptive Discretization in Online Reinforcement Learning

Discretization based approaches to solving online reinforcement learning...
03/09/2020

Zooming for Efficient Model-Free Reinforcement Learning in Metric Spaces

Despite the wealth of research into provably efficient reinforcement lea...
07/16/2018

Discrete linear-complexity reinforcement learning in continuous action spaces for Q-learning algorithms

In this article, we sketch an algorithm that extends the Q-learning algo...
06/22/2020

Adaptive Discretization for Adversarial Bandits with Continuous Action Spaces

Lipschitz bandits is a prominent version of multi-armed bandits that stu...
09/13/2022

Adaptive Discretization using Voronoi Trees for Continuous-Action POMDPs

Solving Partially Observable Markov Decision Processes (POMDPs) with con...
06/26/2011

Learning to Coordinate Efficiently: A Model-based Approach

In common-interest stochastic games all players receive an identical pay...

Code Repositories

AdaptiveQLearning

Adaptive Discretization for Reinforcement Learning in Metric Spaces
