Generative Adversarial Imitation Learning

by Jonathan Ho, et al.
Stanford University

Consider learning a policy from example expert behavior, without interaction with the expert or access to a reinforcement signal. One approach is to recover the expert's cost function with inverse reinforcement learning, then extract a policy from that cost function with reinforcement learning. This approach is indirect and can be slow. We propose a new general framework for directly extracting a policy from data, as if it were obtained by reinforcement learning following inverse reinforcement learning. We show that a certain instantiation of our framework draws an analogy between imitation learning and generative adversarial networks, from which we derive a model-free imitation learning algorithm that obtains significant performance gains over existing model-free methods in imitating complex behaviors in large, high-dimensional environments.
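
To make the adversarial analogy concrete, the following is a minimal sketch of the discriminator side only, not the authors' implementation. It assumes a linear logistic discriminator, synthetic Gaussian state-action features standing in for real rollouts, and the convention that the discriminator outputs values near 1 on policy samples and near 0 on expert samples, so that log D(s, a) can serve as a surrogate cost for the policy. The names `train_discriminator` and `surrogate_cost` are hypothetical, and the policy-optimization step is omitted.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_discriminator(policy_sa, expert_sa, steps=500, lr=0.1):
    """Fit D(s, a) = sigmoid(w . x + b) with label 1 for policy, 0 for expert.

    Assumption: a linear discriminator trained by plain gradient descent on
    binary cross-entropy; a real system would typically use a neural network.
    """
    x = np.vstack([policy_sa, expert_sa])
    y = np.concatenate([np.ones(len(policy_sa)), np.zeros(len(expert_sa))])
    w = np.zeros(x.shape[1])
    b = 0.0
    for _ in range(steps):
        p = sigmoid(x @ w + b)
        grad_logits = p - y  # per-sample gradient of cross-entropy w.r.t. logits
        w -= lr * (x.T @ grad_logits) / len(y)
        b -= lr * grad_logits.mean()
    return w, b


def surrogate_cost(sa, w, b):
    """Cost log D(s, a): high where samples look like the policy, low where
    they look like the expert (small constant added for numerical safety)."""
    return np.log(sigmoid(sa @ w + b) + 1e-8)


# Synthetic stand-ins for state-action features (an assumption, not real data).
rng = np.random.default_rng(0)
expert_sa = rng.normal(loc=1.0, scale=0.5, size=(256, 3))
policy_sa = rng.normal(loc=-1.0, scale=0.5, size=(256, 3))

w, b = train_discriminator(policy_sa, expert_sa)
policy_cost = surrogate_cost(policy_sa, w, b).mean()
expert_cost = surrogate_cost(expert_sa, w, b).mean()
print("mean cost on policy samples:", policy_cost)
print("mean cost on expert samples:", expert_cost)
```

In a full imitation loop, a reinforcement learning step would then update the policy to lower this cost, moving its state-action distribution toward the expert's, and the two steps would alternate.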



Random Expert Distillation: Imitation Learning via Expert Policy Support Estimation

We consider the problem of imitation learning from a finite set of exper...

Event Extraction with Generative Adversarial Imitation Learning

We propose a new method for event extraction (EE) task based on an imita...

Generative Adversarial Imitation Learning for Empathy-based AI

Generative adversarial imitation learning (GAIL) is a model-free algorit...

Quantum Imitation Learning

Despite remarkable successes in solving various complex decision-making ...

Safe Trajectory Planning Using Reinforcement Learning for Self Driving

Self-driving vehicles must be able to act intelligently in diverse and d...

Energy-Based Imitation Learning

We tackle a common scenario in imitation learning (IL), where agents try...

A Connection between Generative Adversarial Networks, Inverse Reinforcement Learning, and Energy-Based Models

Generative adversarial networks (GANs) are a recently proposed class of ...

Code Repositories


Extension of Generative Adversarial Imitation Learning
