Sample-Efficient Reinforcement Learning in the Presence of Exogenous Information

06/09/2022
by   Yonathan Efroni, et al.
22

In real-world reinforcement learning applications the learner's observation space is ubiquitously high-dimensional with both relevant and irrelevant information about the task at hand. Learning from high-dimensional observations has been the subject of extensive investigation in supervised learning and statistics (e.g., via sparsity), but analogous issues in reinforcement learning are not well understood, even in finite state/action (tabular) domains. We introduce a new problem setting for reinforcement learning, the Exogenous Markov Decision Process (ExoMDP), in which the state space admits an (unknown) factorization into a small controllable (or, endogenous) component and a large irrelevant (or, exogenous) component; the exogenous component is independent of the learner's actions, but evolves in an arbitrary, temporally correlated fashion. We provide a new algorithm, ExoRL, which learns a near-optimal policy with sample complexity polynomial in the size of the endogenous component and nearly independent of the size of the exogenous component, thereby offering a doubly-exponential improvement over off-the-shelf algorithms. Our results highlight for the first time that sample-efficient reinforcement learning is possible in the presence of exogenous information, and provide a simple, user-friendly benchmark for investigation going forward.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
08/17/2020

On the Sample Complexity of Reinforcement Learning with Policy Space Generalization

We study the optimal sample complexity in large-scale Reinforcement Lear...
research
10/08/2020

Learning the Linear Quadratic Regulator from Nonlinear Observations

We introduce a new problem setting for continuous control called the LQR...
research
04/20/2022

Sample-Efficient Reinforcement Learning for POMDPs with Linear Function Approximations

Despite the success of reinforcement learning (RL) for Markov decision p...
research
10/23/2019

Sample Complexity of Reinforcement Learning using Linearly Combined Model Ensembles

Reinforcement learning (RL) methods have been shown to be capable of lea...
research
11/28/2022

Inapplicable Actions Learning for Knowledge Transfer in Reinforcement Learning

Reinforcement Learning (RL) algorithms are known to scale poorly to envi...
research
06/01/2022

Byzantine-Robust Online and Offline Distributed Reinforcement Learning

We consider a distributed reinforcement learning setting where multiple ...
research
11/01/2018

Temporal Regularization in Markov Decision Process

Several applications of Reinforcement Learning suffer from instability d...

Please sign up or login with your details

Forgot password? Click here to reset