Integration of Imitation Learning using GAIL and Reinforcement Learning using Task-achievement Rewards via Probabilistic Generative Model

07/03/2019
by Akira Kinose, et al.

Integrating reinforcement learning and imitation learning is an important, long-studied problem in intelligent robotics. Reinforcement learning optimizes a policy to maximize cumulative reward, whereas imitation learning attempts to extract general knowledge from trajectories demonstrated by experts, i.e., demonstrators. Because each approach has its own drawbacks, methods that combine the two so that each compensates for the other's weaknesses have been explored. However, many of these methods are heuristic and lack a solid theoretical basis. In this paper, we present a new theory for integrating reinforcement and imitation learning by extending plan by inference, the probabilistic generative model framework for reinforcement learning. We develop a new probabilistic graphical model for reinforcement learning with multiple types of rewards, and a probabilistic graphical model for Markov decision processes with multiple optimality emissions (pMDP-MO). Furthermore, we demonstrate that integrated reinforcement and imitation learning can be formulated as probabilistic inference of policies on pMDP-MO by treating the output of the discriminator in generative adversarial imitation learning (GAIL) as an additional optimality emission observation. We adapt GAIL and a task-achievement reward to the proposed framework and achieve significantly better performance than agents trained with reinforcement learning or imitation learning alone. Experiments demonstrate that our framework successfully integrates imitation and reinforcement learning even when only a few demonstrators are available.
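As a rough illustration of the idea of treating the discriminator output as an additional optimality emission, the sketch below makes several assumptions that are not stated in the abstract. It assumes the common convention that the task optimality variable has likelihood proportional to exp(r(s, a)), that the discriminator output D(s, a) lies in (0, 1) with higher values meaning "more expert-like", and that the two emissions are conditionally independent given (s, a). Under those assumptions the joint log-likelihood is r(s, a) + log D(s, a), which can play the role of a combined reward. The function name and the clipping constant are hypothetical.

```python
import math


def combined_log_optimality(task_reward: float, disc_prob: float,
                            eps: float = 1e-8) -> float:
    """Joint log-likelihood of two optimality emissions (illustrative only).

    Assumptions (not from the paper's abstract):
      * log p(O_task = 1 | s, a) = task_reward   (up to a constant)
      * p(O_imit = 1 | s, a)     = disc_prob     (discriminator output)
      * the two emissions are conditionally independent given (s, a)
    """
    # Clip the discriminator output so that log() stays finite.
    d = min(max(disc_prob, eps), 1.0 - eps)
    return task_reward + math.log(d)


if __name__ == "__main__":
    # A task-achieving step that the discriminator is unsure about.
    print(combined_log_optimality(1.0, 0.5))
    # The same step, judged clearly non-expert-like.
    print(combined_log_optimality(1.0, 0.01))
```

With a task reward of 1.0 and a discriminator output of 0.5, the combined value is 1 + log 0.5, roughly 0.307. Lower discriminator outputs reduce the value further, so under these assumptions both signals shape the policy at once.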

