Error Bounds of Imitating Policies and Environments

10/22/2020
by Tian Xu, et al.

Imitation learning trains a policy by mimicking expert demonstrations. Various imitation methods have been proposed and empirically evaluated, but their theoretical understanding needs further study. In this paper, we first analyze the value gap between the expert policy and the policies imitated by two methods, behavioral cloning and generative adversarial imitation. The results support that generative adversarial imitation can reduce compounding errors compared to behavioral cloning, and thus has a better sample complexity. Noticing that imitation learning can also be used to learn the environment model when the environment transition model is treated as a dual agent, we further analyze the performance of imitating environments, based on the bounds for imitating policies. The results show that environment models can be imitated more effectively by generative adversarial imitation than by behavioral cloning, suggesting a novel application of adversarial imitation to model-based reinforcement learning. We hope these results will inspire future advances in imitation learning and model-based reinforcement learning.
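
To make the notion of compounding errors concrete, the following is a minimal toy sketch, not the paper's analysis and not its bounds. It assumes a hypothetical chain task in which the expert earns reward 1 at every step, while an imitator deviates with probability `eps` at each step and, after the first deviation, earns nothing. The function names and the task itself are illustrative assumptions. In this toy setting the value gap works out to `eps / ((1 - gamma) * (1 - gamma + gamma * eps))`, which is at most `eps / (1 - gamma)**2`, so a small per-step error is amplified roughly quadratically in the effective horizon `1 / (1 - gamma)`.

```python
def expert_value(gamma: float) -> float:
    """Discounted return of the toy expert: reward 1 at every step."""
    return 1.0 / (1.0 - gamma)


def imitator_value(gamma: float, eps: float) -> float:
    """Discounted return of the toy imitator.

    Assumption: at each step the imitator deviates with probability eps
    and then stays in a zero-reward absorbing state. Reward 1 is earned
    at step t only if no deviation happened in steps 0..t, so the value
    is sum_t gamma^t * (1 - eps)^(t + 1).
    """
    return (1.0 - eps) / (1.0 - gamma * (1.0 - eps))


def value_gap(gamma: float, eps: float) -> float:
    """Expert value minus imitator value in the toy chain task."""
    return expert_value(gamma) - imitator_value(gamma, eps)


def truncated_imitator_value(gamma: float, eps: float, steps: int) -> float:
    """Numerical check of imitator_value by summing the series directly."""
    total = 0.0
    for t in range(steps):
        total += (gamma ** t) * ((1.0 - eps) ** (t + 1))
    return total


if __name__ == "__main__":
    eps = 0.01
    for gamma in (0.9, 0.99, 0.999):
        gap = value_gap(gamma, eps)
        quadratic_bound = eps / (1.0 - gamma) ** 2
        print(f"gamma={gamma}: gap={gap:.4f}, eps/(1-gamma)^2={quadratic_bound:.4f}")
```

The sketch only illustrates why per-step imitation error can accumulate over a long horizon; the comparison between behavioral cloning and generative adversarial imitation is the subject of the paper's own bounds.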

research
11/16/2019

On Value Discrepancy of Imitation Learning

Imitation learning trains a policy from expert demonstrations. Imitation...
research
03/19/2019

Hindsight Generative Adversarial Imitation Learning

Compared to reinforcement learning, imitation learning (IL) is a powerfu...
research
06/21/2022

Model-Based Imitation Learning Using Entropy Regularization of Model and Policy

Approaches based on generative adversarial networks for imitation learni...
research
03/08/2019

Dyna-AIL : Adversarial Imitation Learning by Planning

Adversarial methods for imitation learning have been shown to perform we...
research
07/03/2019

Integration of Imitation Learning using GAIL and Reinforcement Learning using Task-achievement Rewards via Probabilistic Generative Model

Integration of reinforcement learning and imitation learning is an impor...
research
07/23/2020

Bridging the Imitation Gap by Adaptive Insubordination

Why do agents often obtain better reinforcement learning policies when i...
research
03/31/2021

DEALIO: Data-Efficient Adversarial Learning for Imitation from Observation

In imitation learning from observation IfO, a learning agent seeks to im...
