Generative Adversarial Imitation Learning

06/10/2016 ∙ by Jonathan Ho, et al.

Consider learning a policy from example expert behavior, without interaction with the expert or access to a reinforcement signal. One approach is to recover the expert's cost function with inverse reinforcement learning, then extract a policy from that cost function with reinforcement learning. This approach is indirect and can be slow. We propose a new general framework for directly extracting a policy from data, as if it were obtained by reinforcement learning following inverse reinforcement learning. We show that a certain instantiation of our framework draws an analogy between imitation learning and generative adversarial networks, from which we derive a model-free imitation learning algorithm that obtains significant performance gains over existing model-free methods in imitating complex behaviors in large, high-dimensional environments.
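To make the adversarial analogy concrete, below is a minimal sketch of the two pieces such an algorithm alternates between: a discriminator trained to separate expert state-action pairs from the learner's, and the surrogate reward -log D(s, a) that the policy then maximizes with an ordinary RL step (TRPO in the paper; that step is omitted here). The use of PyTorch, the network sizes, and all names and hyperparameters are illustrative assumptions, not details from the paper.

```python
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    """D(s, a) -> logit; sigmoid(logit) is the probability that the
    state-action pair was generated by the learner's policy."""
    def __init__(self, obs_dim, act_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, 1),
        )

    def forward(self, obs, act):
        return self.net(torch.cat([obs, act], dim=-1))

def discriminator_step(disc, opt, expert_obs, expert_act, policy_obs, policy_act):
    """One adversarial update: label expert pairs 0 and policy pairs 1,
    then train D with binary cross-entropy to tell them apart."""
    bce = nn.BCEWithLogitsLoss()
    policy_logits = disc(policy_obs, policy_act)
    expert_logits = disc(expert_obs, expert_act)
    loss = (bce(policy_logits, torch.ones_like(policy_logits))
            + bce(expert_logits, torch.zeros_like(expert_logits)))
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

def imitation_reward(disc, obs, act, eps=1e-8):
    """Surrogate reward -log D(s, a) for the policy's RL step: the policy
    is rewarded when D mistakes its state-action pairs for the expert's."""
    with torch.no_grad():
        return -torch.log(torch.sigmoid(disc(obs, act)) + eps)

if __name__ == "__main__":
    obs_dim, act_dim = 4, 2
    disc = Discriminator(obs_dim, act_dim)
    opt = torch.optim.Adam(disc.parameters(), lr=3e-4)
    # Random stand-ins for sampled expert and policy rollouts.
    e_obs, e_act = torch.randn(32, obs_dim), torch.randn(32, act_dim)
    p_obs, p_act = torch.randn(32, obs_dim), torch.randn(32, act_dim)
    print("D loss:", discriminator_step(disc, opt, e_obs, e_act, p_obs, p_act))
    print("mean reward:", imitation_reward(disc, p_obs, p_act).mean().item())
```

At the saddle point of this game the discriminator can no longer distinguish the two occupancy measures, which is exactly the sense in which the learned policy imitates the expert without an explicit recovered cost function.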


Code Repositories

anirban-imitation

Extension of [Generative Adversarial Imitation Learning](https://arxiv.org/abs/1606.03476)

