Transfer from Simulation to Real World through Learning Deep Inverse Dynamics Model

by Paul Christiano, et al.

Developing control policies in simulation is often more practical and safer than running experiments directly in the real world. This applies to policies obtained from planning and optimization, and even more so to policies obtained from reinforcement learning, which is often very data demanding. However, a policy that succeeds in simulation often fails when deployed on a real robot. Nevertheless, the overall gist of what the policy does in simulation often remains valid in the real world. In this paper we investigate such settings, where the sequence of states traversed in simulation remains reasonable for the real world even if the details of the controls do not, as can be the case when the key differences lie in detailed friction, contact, mass, and geometry properties. During execution, at each time step our approach computes what the simulation-based control policy would do. Rather than executing these controls on the real robot, it then computes what the simulation expects the resulting next state(s) to be, and relies on a learned deep inverse dynamics model to decide which real-world action is most suitable for achieving those next states. Deep models are only as good as their training data, so we also propose an approach for data collection to (incrementally) learn the deep inverse dynamics model. Our experiments show that our approach compares favorably with various baselines that have been developed for dealing with simulation-to-real-world model discrepancy, including output error control and Gaussian dynamics adaptation.
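
The execution loop described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the one-dimensional system, the functions `sim_step`, `real_step`, and `sim_policy`, the random data collection, and above all the inverse dynamics model, which here is a one-parameter least-squares fit standing in for the deep network.

```python
import random

def sim_step(x, a):
    # Hypothetical simulator: the action shifts the state one-for-one.
    return x + a

def real_step(x, a):
    # Hypothetical real system: same structure, but the action has only
    # half the effect (a stand-in for friction or mass mismatch).
    return x + 0.5 * a

def sim_policy(x, goal=1.0):
    # Hypothetical policy tuned in simulation: a proportional controller.
    return 0.5 * (goal - x)

def collect_real_data(n=50, seed=0):
    # Simplification: random states and actions, not the incremental
    # data collection scheme the abstract mentions.
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x, a = rng.uniform(-1, 1), rng.uniform(-1, 1)
        data.append((x, a, real_step(x, a)))
    return data

def fit_inverse_dynamics(transitions):
    # Linear stand-in for the deep model: fit g in (x_next - x) = g * a
    # by least squares, then invert it to map (state, target) -> action.
    num = sum(a * (x_next - x) for x, a, x_next in transitions)
    den = sum(a * a for _, a, _ in transitions)
    g = num / den

    def inverse_model(x, x_target):
        return (x_target - x) / g

    return inverse_model

def run_on_real_system(steps, inverse_model=None):
    x = 0.0
    for _ in range(steps):
        a_sim = sim_policy(x)              # what the simulated policy would do
        if inverse_model is None:
            a = a_sim                      # naive transfer: execute it directly
        else:
            x_target = sim_step(x, a_sim)  # next state the simulator expects
            a = inverse_model(x, x_target) # real action that should reach it
        x = real_step(x, a)
    return x

inverse_model = fit_inverse_dynamics(collect_real_data())
naive_final = run_on_real_system(5)
adapted_final = run_on_real_system(5, inverse_model)
print(naive_final, adapted_final)
```

In this toy setting the adapted rollout reproduces the state sequence the simulator expects, while executing the simulated actions directly leaves the real system further from the goal after the same number of steps.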




Learning Real-World Robot Policies by Dreaming

Learning to control robots directly based on images is a primary challen...

A New Data Source for Inverse Dynamics Learning

Modern robotics is gravitating toward increasingly collaborative human r...

Establishing Appropriate Trust via Critical States

In order to effectively interact with or supervise a robot, humans need ...

TAMPC: A Controller for Escaping Traps in Novel Environments

We propose an approach to online model adaptation and control in the cha...

Predicting Sim-to-Real Transfer with Probabilistic Dynamics Models

We propose a method to predict the sim-to-real transfer performance of R...

Feedback is All You Need: Real-World Reinforcement Learning with Approximate Physics-Based Models

We focus on developing efficient and reliable policy optimization strate...

Beyond Basins of Attraction: Evaluating Robustness of Natural Dynamics

It is commonly accepted that properly designing a system to exhibit favo...
