Imitation Learning via Differentiable Physics

06/10/2022
by   Siwei Chen, et al.

Existing imitation learning (IL) methods such as inverse reinforcement learning (IRL) usually have a double-loop training process that alternates between learning a reward function and learning a policy, and they tend to suffer from long training times and high variance. In this work, we identify the benefits of differentiable physics simulators and propose a new IL method, Imitation Learning via Differentiable Physics (ILD), which removes the double-loop design and achieves significant improvements in final performance, convergence speed, and stability. ILD incorporates the differentiable physics simulator as a physics prior into its computational graph for policy learning. It unrolls the dynamics by sampling actions from a parameterized policy, minimizes the distance between the expert trajectory and the agent trajectory, and back-propagates the gradient into the policy through the temporal physics operators. With the physics prior, ILD policies not only transfer to unseen environment specifications but also yield higher final performance on a variety of tasks. In addition, ILD naturally forms a single-loop structure, which significantly improves stability and training speed. To simplify the complex optimization landscape induced by temporal physics operations, ILD dynamically selects the learning objectives for each state during optimization. In our experiments, we show that ILD outperforms state-of-the-art methods on a variety of continuous control tasks in Brax while requiring only one expert demonstration. In addition, ILD can be applied to challenging deformable object manipulation tasks and generalizes to unseen configurations.
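The core idea in the abstract (unroll the dynamics under a policy, measure the distance to the expert trajectory, and push the gradient back through the physics steps in a single loop) can be illustrated with a toy sketch. This is not the authors' implementation: the 1-D point mass, the deterministic linear policy, the time-aligned squared-error loss, the hand-written reverse pass, the step-halving rule, and every name and constant below (`rollout`, `loss_and_grad`, `train`, `DT`, `HORIZON`, `EXPERT_THETA`) are assumptions made for illustration. In particular, the sketch does not sample actions and does not include ILD's dynamic selection of learning objectives.

```python
# Toy sketch only: imitation by back-propagating a trajectory loss through
# hand-written differentiable dynamics. All names and constants are assumed.

DT = 0.05      # integration step (assumed)
HORIZON = 50   # rollout length (assumed)


def rollout(theta, x0=0.0, v0=0.0):
    """Unroll a 1-D point mass under the linear policy a = kx*x + kv*v + b."""
    kx, kv, b = theta
    xs, vs = [x0], [v0]
    for _ in range(HORIZON):
        a = kx * xs[-1] + kv * vs[-1] + b
        v = vs[-1] + DT * a          # semi-implicit Euler: velocity first
        x = xs[-1] + DT * v          # then position, using the new velocity
        xs.append(x)
        vs.append(v)
    return xs, vs


def loss_and_grad(theta, expert):
    """Mean squared state distance to the expert, with its exact gradient."""
    kx, kv, b = theta
    xs, vs = rollout(theta)
    exs, evs = expert
    loss = sum((xs[t] - exs[t]) ** 2 + (vs[t] - evs[t]) ** 2
               for t in range(1, HORIZON + 1)) / HORIZON
    gx = gv = 0.0                    # adjoints of the state at step t
    g_kx = g_kv = g_b = 0.0
    for t in range(HORIZON, 0, -1):  # reverse pass through the physics steps
        gx += 2.0 * (xs[t] - exs[t]) / HORIZON
        gv += 2.0 * (vs[t] - evs[t]) / HORIZON
        gv_total = gv + DT * gx      # x_t depends on v_t
        ga = DT * gv_total           # v_t depends on the action a_{t-1}
        g_kx += ga * xs[t - 1]
        g_kv += ga * vs[t - 1]
        g_b += ga
        gx = gx + ga * kx            # adjoint of x_{t-1}
        gv = gv_total + ga * kv      # adjoint of v_{t-1}
    return loss, [g_kx, g_kv, g_b]


def train(expert, steps=300, lr=0.1):
    """Single-loop training: no inner reward-learning loop."""
    theta = [0.0, 0.0, 0.0]
    history = []
    for _ in range(steps):
        loss, grad = loss_and_grad(theta, expert)
        history.append(loss)
        candidate = [p - lr * g for p, g in zip(theta, grad)]
        if loss_and_grad(candidate, expert)[0] < loss:
            theta = candidate        # accept the gradient step
        else:
            lr *= 0.5                # shrink the step if the loss did not drop
    return theta, history


EXPERT_THETA = [-2.0, -1.0, 2.0]     # hypothetical expert controller
expert = rollout(EXPERT_THETA)       # one expert demonstration
theta, history = train(expert)
print(f"loss: {history[0]:.4f} -> {history[-1]:.6f}")
```

In a simulator such as Brax the reverse pass would come from automatic differentiation rather than a hand-derived adjoint; the manual version is used here only to keep the sketch dependency-free and to make the gradient path through each physics step explicit.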

research
10/24/2022

Benchmarking Deformable Object Manipulation with Differentiable Physics

Deformable Object Manipulation (DOM) is of significant importance to bot...
research
06/05/2020

Wasserstein Distance guided Adversarial Imitation Learning with Reward Shape Exploration

The generative adversarial imitation learning (GAIL) has provided an adv...
research
03/24/2021

On Imitation Learning of Linear Control Policies: Enforcing Stability and Robustness Constraints via LMI Conditions

When applying imitation learning techniques to fit a policy from expert ...
research
06/30/2022

Watch and Match: Supercharging Imitation with Regularized Optimal Transport

Imitation learning holds tremendous promise in learning policies efficie...
research
12/23/2020

Augmenting Policy Learning with Routines Discovered from a Single Demonstration

Humans can abstract prior knowledge from very little data and use it to ...
research
04/06/2023

DiffMimic: Efficient Motion Mimicking with Differentiable Physics

Motion mimicking is a foundational task in physics-based character anima...
research
10/14/2022

αQBoost: An Iteratively Weighted Adiabatic Trained Classifier

A new implementation of an adiabatically-trained ensemble model is deriv...
