Generative Adversarial Imitation Learning with Neural Networks: Global Optimality and Convergence Rate

03/08/2020
by   Yufeng Zhang, et al.

Generative adversarial imitation learning (GAIL) demonstrates tremendous success in practice, especially when combined with neural networks. Unlike reinforcement learning, GAIL learns both the policy and the reward function from expert (human) demonstrations. Despite its empirical success, it remains unclear whether GAIL with neural networks converges to the globally optimal solution. The major difficulty comes from its nonconvex-nonconcave minimax optimization structure. To bridge the gap between practice and theory, we analyze a gradient-based algorithm with alternating updates and establish its sublinear convergence to the globally optimal solution. To the best of our knowledge, our analysis establishes the global optimality and convergence rate of GAIL with neural networks for the first time.
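To make the "alternating updates" concrete, below is a minimal PyTorch-style sketch of one GAIL-style iteration: a gradient ascent step on a neural reward (discriminator) followed by a gradient step on a neural policy. The network sizes, the particular surrogate losses, the REINFORCE-style policy step, and the helper `alternating_update` are illustrative assumptions for exposition only, not the exact objective or update rule analyzed in the paper.

```python
# Hypothetical sketch of GAIL-style alternating updates (not the paper's exact algorithm).
import torch
import torch.nn as nn
import torch.nn.functional as F

obs_dim, act_dim, hidden = 4, 2, 64  # toy dimensions, chosen for illustration

# Policy network: maps states to action logits.
policy = nn.Sequential(nn.Linear(obs_dim, hidden), nn.Tanh(), nn.Linear(hidden, act_dim))
# Reward/discriminator network: scores state-action pairs.
reward = nn.Sequential(nn.Linear(obs_dim + act_dim, hidden), nn.Tanh(), nn.Linear(hidden, 1))

policy_opt = torch.optim.SGD(policy.parameters(), lr=1e-3)
reward_opt = torch.optim.SGD(reward.parameters(), lr=1e-3)

def alternating_update(expert_sa, agent_sa, agent_logp):
    """One alternating step: ascend in the reward, then descend in the policy.

    expert_sa, agent_sa: (batch, obs_dim + act_dim) state-action pairs.
    agent_logp: (batch,) log-probabilities of the agent's actions under `policy`.
    """
    # 1) Reward step: push expert scores up and agent scores down (inner maximization).
    reward_loss = -(reward(expert_sa).mean() - reward(agent_sa).mean())
    reward_opt.zero_grad()
    reward_loss.backward()
    reward_opt.step()

    # 2) Policy step: REINFORCE-style surrogate using the learned reward as the signal.
    with torch.no_grad():
        r = reward(agent_sa).squeeze(-1)
    policy_loss = -(agent_logp * r).mean()
    policy_opt.zero_grad()
    policy_loss.backward()
    policy_opt.step()

# Example usage with random toy data (real expert demonstrations would replace the placeholder).
states = torch.randn(32, obs_dim)
dist = torch.distributions.Categorical(logits=policy(states))
actions = dist.sample()
agent_logp = dist.log_prob(actions)
agent_sa = torch.cat([states, F.one_hot(actions, act_dim).float()], dim=-1)
expert_sa = torch.randn(32, obs_dim + act_dim)  # placeholder for expert state-action pairs
alternating_update(expert_sa, agent_sa, agent_logp)
```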


Related research

01/09/2020: On Computation and Generalization of Generative Adversarial Imitation Learning
Generative Adversarial Imitation Learning (GAIL) is a powerful and pract...

01/11/2019: On the Global Convergence of Imitation Learning: A Case for Linear Quadratic Regulator
We study the global convergence of generative adversarial imitation lear...

06/24/2020: When Will Generative Adversarial Imitation Learning Algorithms Attain Global Convergence
Generative adversarial imitation learning (GAIL) is a popular inverse re...

08/19/2021: Provably Efficient Generative Adversarial Imitation Learning for Online and Offline Setting with Linear Function Approximation
In generative adversarial imitation learning (GAIL), the agent aims to l...

06/25/2019: Neural Proximal/Trust Region Policy Optimization Attains Globally Optimal Policy
Proximal policy optimization and trust region policy optimization (PPO a...

05/24/2019: Neural Temporal-Difference Learning Converges to Global Optima
Temporal-difference learning (TD), coupled with neural networks, is amon...

08/09/2022: Training Overparametrized Neural Networks in Sublinear Time
The success of deep learning comes at a tremendous computational and ene...
