Learning Belief Representations for Imitation Learning in POMDPs

06/22/2019
by   Tanmay Gangwani, et al.

We consider the problem of imitation learning from expert demonstrations in partially observable Markov decision processes (POMDPs). Belief representations, which characterize the distribution over the latent states in a POMDP, have been modeled using recurrent neural networks and probabilistic latent variable models, and have been shown to be effective for reinforcement learning in POMDPs. In this work, we investigate the belief representation learning problem for generative adversarial imitation learning in POMDPs. Instead of training the belief module and the policy separately, as suggested in prior work, we learn the belief module jointly with the policy, using a task-aware imitation loss so that the representation is better aligned with the policy's objective. To improve the robustness of the representation, we introduce several informative belief regularization techniques, including multi-step prediction of dynamics and action sequences. Evaluated on various partially observable continuous-control locomotion tasks, our belief-module imitation learning approach (BMIL) substantially outperforms several baselines, including the original GAIL algorithm and a task-agnostic belief learning algorithm. An extensive ablation analysis indicates the effectiveness of both task-aware belief learning and belief regularization.
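The idea of a recurrent belief module trained with a task loss plus a multi-step prediction regularizer can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the tanh recurrence, the linear observation decoder, the function names, the horizons, and the `reg_weight` value are all assumptions made for illustration. The adversarial imitation loss is stood in for by a plain scalar, and the decoder ignores intermediate actions, which a fuller dynamics-prediction regularizer would likely condition on.

```python
import numpy as np


def init_params(obs_dim, act_dim, belief_dim, seed=0):
    """Hypothetical parameter layout: one recurrent layer and one linear decoder."""
    rng = np.random.default_rng(seed)
    in_dim = belief_dim + obs_dim + act_dim
    return {
        "W_b": rng.normal(scale=0.1, size=(belief_dim, in_dim)),
        "b_b": np.zeros(belief_dim),
        "W_dec": rng.normal(scale=0.1, size=(obs_dim, belief_dim)),
    }


def belief_update(params, belief, obs, prev_action):
    """One recurrent step: fold the new observation and previous action into the belief."""
    x = np.concatenate([belief, obs, prev_action])
    return np.tanh(params["W_b"] @ x + params["b_b"])


def rollout_beliefs(params, observations, actions):
    """Run the belief update over a trajectory; returns an array of shape (T, belief_dim)."""
    belief = np.zeros(params["b_b"].shape[0])
    prev_action = np.zeros(actions.shape[1])
    beliefs = []
    for obs, act in zip(observations, actions):
        belief = belief_update(params, belief, obs, prev_action)
        beliefs.append(belief)
        prev_action = act
    return np.stack(beliefs)


def multi_step_prediction_loss(params, beliefs, observations, k):
    """Mean squared error of predicting o_{t+k} from b_t with a linear decoder (assumed form)."""
    T = len(observations)
    preds = beliefs[: T - k] @ params["W_dec"].T
    targets = observations[k:]
    return float(np.mean((preds - targets) ** 2))


def joint_loss(imitation_loss, params, beliefs, observations,
               horizons=(1, 2, 3), reg_weight=0.1):
    """Task loss plus weighted multi-step regularizer; the weight and horizons are assumptions."""
    reg = sum(
        multi_step_prediction_loss(params, beliefs, observations, k)
        for k in horizons
    )
    return imitation_loss + reg_weight * reg
```

In a joint training setup, gradients of `joint_loss` would flow into the belief parameters as well as the policy, whereas a task-agnostic variant would update the belief module from the prediction term alone.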

research
11/04/2022

Deconfounded Imitation Learning

Standard imitation learning can fail when the expert demonstrators have ...
research
01/31/2020

Domain-Adversarial and -Conditional State Space Model for Imitation Learning

State representation learning (SRL) in partially observable Markov decis...
research
02/24/2020

Provable Representation Learning for Imitation Learning via Bi-level Optimization

A common strategy in modern learning systems is to learn a representatio...
research
12/09/2020

Neural Rate Control for Video Encoding using Imitation Learning

In modern video encoders, rate control is a critical component and has b...
research
12/18/2019

Relational Mimic for Visual Adversarial Imitation Learning

In this work, we introduce a new method for imitation learning from vide...
research
10/07/2020

Provable Hierarchical Imitation Learning via EM

Due to recent empirical successes, the options framework for hierarchica...
research
10/13/2021

On Covariate Shift of Latent Confounders in Imitation and Reinforcement Learning

We consider the problem of using expert data with unobserved confounders...
