PAC Reinforcement Learning without Real-World Feedback

09/23/2019
by Yuren Zhong et al.

This work studies reinforcement learning in the Sim-to-Real setting, in which an agent is first trained on a number of simulators before being deployed in the real world, with the aim of reducing the real-world sample complexity requirement. Using a dynamics model known as a rich observation Markov decision process (ROMDP), we formulate a theoretical framework for Sim-to-Real in the setting where feedback in the real world is not available. We establish real-world sample complexity guarantees that are smaller than what is currently known for directly (i.e., without access to simulators) learning an ROMDP with feedback.
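To make the dynamics model concrete, below is a minimal toy sketch of an ROMDP: a small latent MDP governs transitions and rewards, while the agent sees only observations emitted from the current latent state's disjoint observation block, never the latent state itself. The class name, the disjoint-block emission scheme, and all parameters are illustrative assumptions for exposition, not the paper's formal construction; in the paper's Sim-to-Real setting, an agent would additionally train on several simulators before facing the real ROMDP without feedback.

import random


class ROMDP:
    """Toy finite-horizon rich observation MDP (ROMDP).

    A small latent MDP drives transitions and rewards, but the agent
    only ever receives observations drawn from the current latent
    state's disjoint observation block, never the latent state itself.
    """

    def __init__(self, n_states=5, n_actions=3, horizon=10,
                 obs_per_state=100, seed=0):
        rng = random.Random(seed)
        self.n_actions, self.horizon = n_actions, horizon
        # Latent dynamics: P[s][a] is the next latent state.
        self.P = [[rng.randrange(n_states) for _ in range(n_actions)]
                  for _ in range(n_states)]
        # Rewards depend only on the latent state and the action.
        self.R = [[rng.random() for _ in range(n_actions)]
                  for _ in range(n_states)]
        # State s emits observation ids in [s*obs_per_state, (s+1)*obs_per_state),
        # so blocks are disjoint and s is decodable in principle, though
        # the agent is never told the mapping.
        self.obs_per_state = obs_per_state
        self._rng = rng

    def reset(self):
        self._s, self._t = 0, 0
        return self._emit()

    def step(self, action):
        reward = self.R[self._s][action]
        self._s = self.P[self._s][action]
        self._t += 1
        return self._emit(), reward, self._t >= self.horizon

    def _emit(self):
        # Sample an observation uniformly from the current state's block.
        return self._s * self.obs_per_state + self._rng.randrange(self.obs_per_state)


# A random-policy rollout: the agent sees only observation ids.
env = ROMDP()
obs, done = env.reset(), False
while not done:
    obs, reward, done = env.step(random.randrange(env.n_actions))

The disjoint-block emission is what makes the latent state identifiable from observations, which is the structural assumption that sample complexity analyses of ROMDP learning typically exploit.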


Related research

Sample Efficient Feature Selection for Factored MDPs (03/09/2017)
In reinforcement learning, the state of the real world is often represen...

Robust Reinforcement Learning under model misspecification (03/29/2021)
Reinforcement learning has achieved remarkable performance in a wide ran...

Sample Complexity of Episodic Fixed-Horizon Reinforcement Learning (10/29/2015)
Recently, there has been significant progress in understanding reinforce...

Model Imitation for Model-Based Reinforcement Learning (09/25/2019)
Model-based reinforcement learning (MBRL) aims to learn a dynamic model ...

The Teaching Dimension of Q-learning (06/16/2020)
In this paper, we initiate the study of sample complexity of teaching, t...

PAC Guarantees for Concurrent Reinforcement Learning with Restricted Communication (05/23/2019)
We develop model free PAC performance guarantees for multiple concurrent...

Distributed Influence-Augmented Local Simulators for Parallel MARL in Large Networked Systems (07/01/2022)
Due to its high sample complexity, simulation is, as of today, critical ...
