Validate on Sim, Detect on Real – Model Selection for Domain Randomization

11/01/2021
by Gal Leibovich, et al.

A practical approach to learning robot skills, often termed sim2real, is to train control policies in simulation and then deploy them on a real robot. Popular techniques for improving sim2real transfer build on domain randomization (DR): training the policy on a diverse set of randomly generated domains in the hope that it generalizes better to the real world. Because both the policy learning and DR algorithms have many hyper-parameters, one often ends up with a large number of trained models, and choosing the best among them demands costly evaluation on the real robot. In this work we ask: can we rank the policies without running them in the real world? Our main idea is that a predefined set of real-world data can be used to evaluate all policies, using out-of-distribution (OOD) detection techniques. In a sense, this approach acts as a "unit test" that evaluates policies before any real-world execution. However, we find that, by itself, the OOD score can be inaccurate and very sensitive to the particular OOD method. Our main contribution is a simple yet effective policy score that combines OOD with an evaluation in simulation. We show that our score, VSDR, can significantly improve the accuracy of policy ranking without requiring additional real-world data. We evaluate the effectiveness of VSDR on sim2real transfer in a robotic grasping task with image inputs. We extensively evaluate different DR parameters and OOD methods, and show that VSDR improves policy selection across the board. More importantly, our method achieves significantly better ranking and uses significantly less data than the baselines.
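The abstract does not give the formula behind VSDR, so the following is only an illustrative sketch of the general idea of ranking policies by combining a simulation validation result with an OOD score computed on a fixed real-world dataset. Everything in it is an assumption made for illustration and not the authors' method: the `PolicyStats` fields, the min-max normalization, the convention that a higher OOD score means the real data looks less familiar, and the weighted sum with its `weight` parameter.

```python
from dataclasses import dataclass


@dataclass
class PolicyStats:
    """Hypothetical per-policy summary (not taken from the paper)."""
    name: str
    sim_success: float  # success rate on held-out simulated domains, in [0, 1]
    ood_score: float    # mean OOD score on the fixed real-world dataset;
                        # assumed convention: higher = more out-of-distribution


def min_max(values):
    """Rescale values to [0, 1]; return 0.5 for all if they are identical."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def rank_policies(stats, weight=0.5):
    """Return policy names ordered from best to worst combined score.

    The combined score is an assumed weighted sum of normalized simulation
    success and normalized "familiarity" (one minus the normalized OOD score).
    """
    sim = min_max([s.sim_success for s in stats])
    familiarity = [1.0 - v for v in min_max([s.ood_score for s in stats])]
    combined = {
        s.name: weight * a + (1.0 - weight) * b
        for s, a, b in zip(stats, sim, familiarity)
    }
    return sorted(combined, key=combined.get, reverse=True)


# Made-up numbers, purely to exercise the function.
stats = [
    PolicyStats("policy_a", sim_success=0.9, ood_score=0.8),
    PolicyStats("policy_b", sim_success=0.8, ood_score=0.2),
    PolicyStats("policy_c", sim_success=0.3, ood_score=0.3),
]
ranking = rank_policies(stats)
```

With these made-up numbers, `policy_a` has the best simulation result but the real-world images look least familiar to it, so the combined ranking places `policy_b` first. Setting `weight=1.0` ranks by simulation alone and `weight=0.0` ranks by the OOD score alone, the two single-signal options that the abstract contrasts with the combined score.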


