Provable Sim-to-real Transfer in Continuous Domain with Partial Observations

10/27/2022
by Jiachen Hu, et al.

Sim-to-real transfer trains RL agents in simulated environments and then deploys them in the real world. It is widely used in practice because collecting samples in simulation is often cheaper, safer, and much faster than in the real world. Despite its empirical success, its theoretical foundations are much less understood. In this paper, we study sim-to-real transfer in continuous domains with partial observations, where both the simulated and real-world environments are modeled by linear quadratic Gaussian (LQG) systems. We show that a popular robust adversarial training algorithm is capable of learning a policy from the simulated environment that is competitive with the optimal policy in the real-world environment. To achieve our results, we design a new algorithm for infinite-horizon average-cost LQGs and establish a regret bound that depends on the intrinsic complexity of the model class. Our algorithm crucially relies on a novel history clipping scheme, which might be of independent interest.
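
For readers unfamiliar with the setting, the following is a sketch of the standard textbook form of an LQG system with an infinite-horizon average cost. The notation here (A, B, C, Q, R, W, Z, Theta, J) is assumed for illustration and is not taken from the paper, which may use different symbols or place the cost on the hidden state rather than on the observation.

```latex
% Partially observed linear dynamics with Gaussian noise (assumed notation):
% x_t is the hidden state, u_t the control input, y_t the observation.
\begin{align*}
  x_{t+1} &= A x_t + B u_t + w_t, & w_t &\sim \mathcal{N}(0, W), \\
  y_t     &= C x_t + z_t,         & z_t &\sim \mathcal{N}(0, Z).
\end{align*}

% Quadratic per-step cost (written on the observation here, as an assumption):
\[
  c_t = y_t^\top Q\, y_t + u_t^\top R\, u_t .
\]

% Infinite-horizon average cost of a policy \pi in the system
% \Theta = (A, B, C), and the optimal cost J^\star(\Theta) in that system:
\[
  J(\pi; \Theta) = \limsup_{T \to \infty} \frac{1}{T}\,
    \mathbb{E}\!\left[\sum_{t=1}^{T} c_t\right],
  \qquad
  J^\star(\Theta) = \inf_{\pi} J(\pi; \Theta).
\]

% "Competitive with the optimal policy in the real-world environment" can then
% be read as a small gap for the real system \Theta^{\mathrm{real}}:
\[
  J(\pi; \Theta^{\mathrm{real}}) - J^\star(\Theta^{\mathrm{real}}).
\]
```

Under this reading, the policy only sees the observations y_t and never the state x_t, which is what "partial observations" refers to in the abstract.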


