Generative Adversarial Imitation Learning with Neural Networks: Global Optimality and Convergence Rate

03/08/2020
by   Yufeng Zhang, et al.
0

Generative adversarial imitation learning (GAIL) demonstrates tremendous success in practice, especially when combined with neural networks. Different from reinforcement learning, GAIL learns both policy and reward function from expert (human) demonstration. Despite its empirical success, it remains unclear whether GAIL with neural networks converges to the globally optimal solution. The major difficulty comes from the nonconvex-nonconcave minimax optimization structure. To bridge the gap between practice and theory, we analyze a gradient-based algorithm with alternating updates and establish its sublinear convergence to the globally optimal solution. To the best of our knowledge, our analysis establishes the global optimality and convergence rate of GAIL with neural networks for the first time.

READ FULL TEXT
POST COMMENT

Comments

There are no comments yet.

Authors

page 1

page 2

page 3

page 4

01/09/2020

On Computation and Generalization of Generative Adversarial Imitation Learning

Generative Adversarial Imitation Learning (GAIL) is a powerful and pract...
01/11/2019

On the Global Convergence of Imitation Learning: A Case for Linear Quadratic Regulator

We study the global convergence of generative adversarial imitation lear...
06/24/2020

When Will Generative Adversarial Imitation Learning Algorithms Attain Global Convergence

Generative adversarial imitation learning (GAIL) is a popular inverse re...
08/19/2021

Provably Efficient Generative Adversarial Imitation Learning for Online and Offline Setting with Linear Function Approximation

In generative adversarial imitation learning (GAIL), the agent aims to l...
02/22/2020

Global Convergence and Variance-Reduced Optimization for a Class of Nonconvex-Nonconcave Minimax Problems

Nonconvex minimax problems appear frequently in emerging machine learnin...
05/24/2019

Neural Temporal-Difference Learning Converges to Global Optima

Temporal-difference learning (TD), coupled with neural networks, is amon...
05/31/2020

Global Convergence of MAML for LQR

The paper studies the performance of the Model-Agnostic Meta-Learning (M...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.