Genetic Imitation Learning by Reward Extrapolation

01/03/2023
by Boyuan Zheng, et al.

Imitation learning demonstrates remarkable performance in various domains, but it is constrained by many prerequisites. The research community has worked intensively to alleviate these constraints, for example by adding a stochastic policy to avoid unseen states, eliminating the need for action labels, and learning from suboptimal demonstrations. Inspired by the natural reproduction process, we propose GenIL, a method that integrates the Genetic Algorithm with imitation learning. The Genetic Algorithm improves data efficiency by reproducing trajectories with various returns, and it helps the model estimate more accurate and compact reward function parameters. We tested GenIL in both the Atari and MuJoCo domains, and the results show that it outperforms previous extrapolation methods in extrapolation accuracy, robustness, and overall policy performance when input data is limited.
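As a rough illustration of what "reproducing trajectories" with genetic operators can look like, the sketch below applies generic one-point crossover and mutation to toy trajectories. It is not GenIL's actual procedure, which the abstract does not specify. The function names `crossover` and `mutate`, the (state, action) step representation, and the mutation rule are all assumptions made for this example.

```python
import random


def crossover(parent_a, parent_b, rng):
    """One-point crossover: a prefix of parent_a joined to a suffix of parent_b.

    Hypothetical helper for illustration; requires both parents to have
    at least two steps.
    """
    cut = rng.randint(1, min(len(parent_a), len(parent_b)) - 1)
    return parent_a[:cut] + parent_b[cut:]


def mutate(trajectory, pool, rng, rate=0.1):
    """With probability `rate`, replace a step by a random step drawn from `pool`."""
    return [rng.choice(pool) if rng.random() < rate else step
            for step in trajectory]


rng = random.Random(0)

# Toy data (assumed): each step is a (state, action) tuple.
better = [("s%d" % t, "a_good") for t in range(6)]
worse = [("s%d" % t, "a_bad") for t in range(6)]

# The offspring mixes steps from a higher-return and a lower-return parent,
# so its return would plausibly fall between the two.
child = crossover(better, worse, rng)
mutant = mutate(child, better + worse, rng, rate=0.2)
```

In this toy setup the offspring mixes steps from both parents, which is one plausible way to obtain trajectories with intermediate returns from a small set of demonstrations.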

