Log In Sign Up

Distilled Domain Randomization

by   Julien Brosseit, et al.

Deep reinforcement learning is an effective tool to learn robot control policies from scratch. However, these methods are notorious for the enormous amount of required training data which is prohibitively expensive to collect on real robots. A highly popular alternative is to learn from simulations, allowing to generate the data much faster, safer, and cheaper. Since all simulators are mere models of reality, there are inevitable differences between the simulated and the real data, often referenced as the 'reality gap'. To bridge this gap, many approaches learn one policy from a distribution over simulators. In this paper, we propose to combine reinforcement learning from randomized physics simulations with policy distillation. Our algorithm, called Distilled Domain Randomization (DiDoR), distills so-called teacher policies, which are experts on domains that have been sampled initially, into a student policy that is later deployed. This way, DiDoR learns controllers which transfer directly from simulation to reality, i.e., without requiring data from the target domain. We compare DiDoR against three baselines in three sim-to-sim as well as two sim-to-real experiments. Our results show that the target domain performance of policies trained with DiDoR is en par or better than the baselines'. Moreover, our approach neither increases the required memory capacity nor the time to compute an action, which may well be a point of failure for successfully deploying the learned controller.


page 1

page 6


Bayesian Domain Randomization for Sim-to-Real Transfer

When learning policies for robot control, the real-world data required i...

Cyclic Policy Distillation: Sample-Efficient Sim-to-Real Reinforcement Learning with Domain Randomization

Deep reinforcement learning with domain randomization learns a control p...

Robot Learning from Randomized Simulations: A Review

The rise of deep learning has caused a paradigm shift in robotics resear...

Assessing Transferability from Simulation to Reality for Reinforcement Learning

Learning robot control policies from physics simulations is of great int...

Using Deep Reinforcement Learning to Learn High-Level Policies on the ATRIAS Biped

Learning controllers for bipedal robots is a challenging problem, often ...

Policy Transfer via Kinematic Domain Randomization and Adaptation

Transferring reinforcement learning policies trained in physics simulati...