
Model-Based Imitation Learning Using Entropy Regularization of Model and Policy

by Eiji Uchibe, et al.

Imitation learning approaches based on generative adversarial networks are promising because they are sample-efficient with respect to expert demonstrations. However, training the generator requires many interactions with the actual environment because the policy is updated with model-free reinforcement learning. To reduce the number of these interactions, we propose model-based Entropy-Regularized Imitation Learning (MB-ERIL), which applies model-based reinforcement learning under the entropy-regularized Markov decision process. MB-ERIL uses two discriminators. A policy discriminator distinguishes the actions generated by a robot from expert ones, and a model discriminator distinguishes the counterfactual state transitions generated by the model from the actual ones. We derive structured discriminators so that both the policy and the model are learned efficiently. Computer simulations and real robot experiments show that MB-ERIL achieves competitive performance and significantly improves sample efficiency compared to baseline methods.
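
The two-discriminator setup described above can be illustrated with a minimal sketch. It is not the structured discriminator derived in the paper and it omits entropy regularization: it assumes plain linear logistic discriminators trained with binary cross-entropy, and all names (`discriminator_loss`, `expert_sa`, `model_sas`, and so on) and the toy Gaussian data are hypothetical. The policy discriminator sees state-action pairs, while the model discriminator sees state-action-next-state transitions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def discriminator_loss(w, real, fake, eps=1e-12):
    """Binary cross-entropy of a linear logistic discriminator.

    Rows of `real` are labeled 1 and rows of `fake` are labeled 0.
    """
    p_real = sigmoid(real @ w)
    p_fake = sigmoid(fake @ w)
    return -(np.mean(np.log(p_real + eps)) + np.mean(np.log(1.0 - p_fake + eps)))

def discriminator_grad(w, real, fake):
    """Gradient of discriminator_loss with respect to w (ignoring eps)."""
    p_real = sigmoid(real @ w)
    p_fake = sigmoid(fake @ w)
    return -(real.T @ (1.0 - p_real)) / len(real) + (fake.T @ p_fake) / len(fake)

rng = np.random.default_rng(0)
state_dim, action_dim, n = 3, 2, 64  # assumed toy dimensions

# Policy discriminator inputs: (state, action) pairs.
expert_sa = rng.normal(loc=1.0, size=(n, state_dim + action_dim))    # expert data
learner_sa = rng.normal(loc=-1.0, size=(n, state_dim + action_dim))  # robot data

# Model discriminator inputs: (state, action, next_state) transitions.
actual_sas = rng.normal(loc=1.0, size=(n, 2 * state_dim + action_dim))  # environment
model_sas = rng.normal(loc=-1.0, size=(n, 2 * state_dim + action_dim))  # learned model

w_policy = np.zeros(state_dim + action_dim)
w_model = np.zeros(2 * state_dim + action_dim)

policy_loss_before = discriminator_loss(w_policy, expert_sa, learner_sa)
model_loss_before = discriminator_loss(w_model, actual_sas, model_sas)

# One gradient-descent step on each discriminator (step size is an assumption).
lr = 0.1
w_policy = w_policy - lr * discriminator_grad(w_policy, expert_sa, learner_sa)
w_model = w_model - lr * discriminator_grad(w_model, actual_sas, model_sas)

policy_loss_after = discriminator_loss(w_policy, expert_sa, learner_sa)
model_loss_after = discriminator_loss(w_model, actual_sas, model_sas)
```

In an adversarial imitation loop, the outputs of such discriminators would typically drive the policy and model updates; how MB-ERIL does this is specified in the paper, not in this sketch.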



Error Bounds of Imitating Policies and Environments

Imitation learning trains a policy by mimicking expert demonstrations. V...

Imitation learning based on entropy-regularized forward and inverse reinforcement learning

This paper proposes Entropy-Regularized Imitation Learning (ERIL), which...

Backward Imitation and Forward Reinforcement Learning via Bi-directional Model Rollouts

Traditional model-based reinforcement learning (RL) methods generate for...

No Need for Interactions: Robust Model-Based Imitation Learning using Neural ODE

Interactions with either environments or expert policies during training...

PAC Bounds for Imitation and Model-based Batch Learning of Contextual Markov Decision Processes

We consider the problem of batch multi-task reinforcement learning with ...

Generative Adversarial Imitation Learning for Empathy-based AI

Generative adversarial imitation learning (GAIL) is a model-free algorit...

A Contraction Approach to Model-based Reinforcement Learning

Model-based Reinforcement Learning has shown considerable experimental s...