Provably Efficient Generative Adversarial Imitation Learning for Online and Offline Setting with Linear Function Approximation

08/19/2021
by   Zhihan Liu, et al.

In generative adversarial imitation learning (GAIL), the agent aims to learn a policy from expert demonstrations such that its performance cannot be discriminated from that of the expert policy on a certain predefined reward set. In this paper, we study GAIL in both online and offline settings with linear function approximation, where both the transition and the reward function are linear in the feature maps. Besides the expert demonstration, in the online setting the agent can interact with the environment, while in the offline setting the agent only has access to an additional dataset collected a priori. For online GAIL, we propose an optimistic generative adversarial policy optimization algorithm (OGAP) and prove that OGAP achieves 𝒪(H^2 d^{3/2} K^{1/2} + K H^{3/2} d N_1^{-1/2}) regret. Here N_1 is the number of trajectories in the expert demonstration, d is the feature dimension, H is the horizon, and K is the number of episodes. For offline GAIL, we propose a pessimistic generative adversarial policy optimization algorithm (PGAP). For an arbitrary additional dataset, we establish the optimality gap of PGAP, which matches the minimax lower bound in its utilization of the additional dataset. Assuming sufficient coverage of the additional dataset, we show that PGAP achieves an 𝒪(H^2 d K^{-1/2} + H^2 d^{3/2} N_2^{-1/2} + H^{3/2} d N_1^{-1/2}) optimality gap. Here N_2 is the number of trajectories in the additional dataset with sufficient coverage.
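The "optimistic" and "pessimistic" qualifiers refer to the standard elliptical confidence bonus used with linear function approximation: value estimates are inflated (online) or deflated (offline) by a term proportional to sqrt(φᵀΛ⁻¹φ), where Λ is the ridge-regularized Gram matrix of observed features. The sketch below is illustrative only, assuming this standard linear-MDP machinery; the function and variable names are not from the paper. It shows that the bonus is large along feature directions the data has not covered and small along well-covered ones.

```python
import numpy as np

def elliptical_bonus(phi, Lam, beta):
    """UCB-style bonus beta * sqrt(phi^T Lam^{-1} phi).

    Added to value estimates for optimism (online OGAP-style updates),
    subtracted for pessimism (offline PGAP-style updates).
    """
    return beta * np.sqrt(phi @ np.linalg.solve(Lam, phi))

d = 3
Lam = np.eye(d)  # ridge regularization: Lam = I + sum of phi phi^T
# Suppose the collected data only covers the first two feature directions.
for phi in [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])]:
    Lam += np.outer(phi, phi)

phi_covered = np.array([1.0, 0.0, 0.0])    # seen in the data
phi_uncovered = np.array([0.0, 0.0, 1.0])  # never seen in the data
b_covered = elliptical_bonus(phi_covered, Lam, beta=1.0)
b_uncovered = elliptical_bonus(phi_uncovered, Lam, beta=1.0)
```

Here `b_uncovered > b_covered`, which is exactly why "sufficient coverage" of the additional dataset matters in the offline bound: pessimism penalizes state-action directions the dataset leaves uncovered.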


