SimGAN: Hybrid Simulator Identification for Domain Adaptation via Adversarial Reinforcement Learning

by Yifeng Jiang, et al.

As learning-based approaches progress towards automating robot controller design, transferring learned policies to new domains with different dynamics (e.g. sim-to-real transfer) still demands manual effort. This paper introduces SimGAN, a framework that tackles domain adaptation by identifying a hybrid physics simulator whose simulated trajectories match those from the target domain, using a learned discriminative loss to address the limitations of manual loss design. Our hybrid simulator combines neural networks and traditional physics simulation to balance expressiveness and generalizability, and alleviates the need for a carefully selected parameter set in System ID. Once the hybrid simulator is identified via adversarial reinforcement learning, it can be used to refine policies for the target domain without collecting more data. We show that our approach outperforms multiple strong baselines on six robotic locomotion tasks for domain adaptation.
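To make the idea of a hybrid simulator concrete, the following is a minimal sketch, not the authors' implementation. It assumes one possible reading of "hybrid": a small learned function maps the current state and action to physical parameters, which are then passed to an ordinary analytic integration step. The 1D point-mass system, the names `param_net`, `physics_step`, `hybrid_step` and `rollout`, and the parameter ranges are all hypothetical choices made for illustration. The discriminator and the adversarial reinforcement learning loop that would tune `weights` are not shown.

```python
import numpy as np


def param_net(state, action, weights):
    """Hypothetical learned component: maps (state, action) to physical parameters."""
    x = np.concatenate([state, [action]])   # input vector of length 3
    raw = np.tanh(weights @ x)              # two outputs, each in (-1, 1)
    mass = 1.0 + 0.5 * raw[0]               # assumed range (0.5, 1.5)
    friction = 0.2 + 0.2 * raw[1]           # assumed range (0.0, 0.4)
    return mass, friction


def physics_step(state, action, mass, friction, dt=0.01):
    """Analytic component: semi-implicit Euler for a 1D point mass with viscous friction."""
    pos, vel = state
    acc = (action - friction * vel) / mass
    vel = vel + dt * acc
    pos = pos + dt * vel
    return np.array([pos, vel])


def hybrid_step(state, action, weights):
    """One hybrid step: the learned parameters feed the analytic physics update."""
    mass, friction = param_net(state, action, weights)
    return physics_step(state, action, mass, friction)


def rollout(weights, actions, state0=(0.0, 0.0)):
    """Simulate a trajectory; returns an array of shape (len(actions) + 1, 2)."""
    state = np.array(state0, dtype=float)
    traj = [state]
    for a in actions:
        state = hybrid_step(state, a, weights)
        traj.append(state)
    return np.stack(traj)


if __name__ == "__main__":
    # Zero weights give constant parameters (mass 1.0, friction 0.2).
    print(rollout(np.zeros((2, 3)), [1.0, 1.0, 1.0]))
```

In this sketch the physics update keeps the simulator grounded in known dynamics, while the learned function only adjusts the parameters it is given. In a full system, trajectories produced by `rollout` would be compared with target-domain trajectories by a learned discriminator, and its output would drive the updates to `weights`.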




Learning Sampling Policies for Domain Adaptation

We address the problem of semi-supervised domain adaptation of classific...

EPOpt: Learning Robust Neural Network Policies Using Model Ensembles

Sample complexity and safety are major challenges when learning policies...

Domain Adversarial Reinforcement Learning for Partial Domain Adaptation

Partial domain adaptation aims to transfer knowledge from a label-rich s...

Fisher Deep Domain Adaptation

Deep domain adaptation models learn a neural network in an unlabeled tar...

Compound Domain Adaptation in an Open World

Existing works on domain adaptation often assume clear boundaries betwee...

Policy Transfer via Kinematic Domain Randomization and Adaptation

Transferring reinforcement learning policies trained in physics simulati...

Automatic Domain Adaptation Outperforms Manual Domain Adaptation for Predicting Financial Outcomes

In this paper, we automatically create sentiment dictionaries for predic...
