Agnostic Reinforcement Learning with Low-Rank MDPs and Rich Observations

06/22/2021
by Christoph Dann, et al.

There have been many recent advances in provably efficient Reinforcement Learning (RL) for problems with rich observation spaces. However, all these works share a strong realizability assumption about the optimal value function of the true MDP. Such realizability assumptions are often too strong to hold in practice. In this work, we consider the more realistic setting of agnostic RL with rich observation spaces and a fixed class of policies Π that may not contain any near-optimal policy. We provide an algorithm for this setting whose error is bounded in terms of the rank d of the underlying MDP. Specifically, our algorithm enjoys a sample complexity bound of O((H^{4d} K^{3d} log |Π|)/ϵ^2), where H is the length of episodes, K is the number of actions, and ϵ > 0 is the desired sub-optimality. We also provide a nearly matching lower bound for this agnostic setting, which shows that the exponential dependence on the rank is unavoidable without further assumptions.
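To get a feel for how the stated bound scales with the rank d, here is a minimal sketch that evaluates only its leading term, H^{4d} K^{3d} log |Π| / ϵ^2. The function name `sample_bound`, the dropped constants and lower-order factors, and every numeric value below are illustrative assumptions, not quantities taken from the paper.

```python
import math


def sample_bound(H, K, d, log_policy_class, eps):
    """Leading term H^(4d) * K^(3d) * log|Pi| / eps^2.

    Constants and lower-order factors hidden by the O(.) notation are
    dropped, so only ratios between values are meaningful.
    """
    return (H ** (4 * d)) * (K ** (3 * d)) * log_policy_class / eps ** 2


if __name__ == "__main__":
    # Hypothetical problem sizes, chosen only to show the growth in d.
    H, K, eps = 5, 2, 0.1
    log_pi = math.log(1000)  # assumes a finite policy class with |Pi| = 1000
    for d in (1, 2, 3):
        print(f"d={d}: {sample_bound(H, K, d, log_pi, eps):.3e}")
```

Under these assumptions, each unit increase in d multiplies the leading term by H^4 K^3, which is the exponential dependence on the rank that the lower bound shows cannot be removed without further assumptions.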


research
03/20/2023

Improved Sample Complexity for Reward-free Reinforcement Learning under Low-rank MDPs

In reward-free reinforcement learning (RL), an agent explores the enviro...
research
06/07/2022

Overcoming the Long Horizon Barrier for Sample-Efficient Reinforcement Learning with Latent Low-Rank Structure

The practicality of reinforcement learning algorithms has been limited d...
research
02/29/2020

Learning Near Optimal Policies with Low Inherent Bellman Error

We study the exploration problem with approximate linear action-value fu...
research
10/23/2019

Sample Complexity of Reinforcement Learning using Linearly Combined Model Ensembles

Reinforcement learning (RL) methods have been shown to be capable of lea...
research
04/12/2023

Representation Learning with Multi-Step Inverse Kinematics: An Efficient and Optimal Approach to Rich-Observation RL

We study the design of sample-efficient algorithms for reinforcement lea...
research
10/11/2022

Multi-User Reinforcement Learning with Low Rank Rewards

In this work, we consider the problem of collaborative multi-user reinfo...
research
06/15/2021

Sample Efficient Reinforcement Learning In Continuous State Spaces: A Perspective Beyond Linearity

Reinforcement learning (RL) is empirically successful in complex nonline...
