Hierarchical Adversarial Inverse Reinforcement Learning

10/05/2022
by Jiayu Chen, et al.

Hierarchical Imitation Learning (HIL) has been proposed to recover highly complex behaviors in long-horizon tasks from expert demonstrations by modeling the task hierarchy with the option framework. Existing methods either overlook the causal relationship between a subtask and its corresponding policy or fail to learn the policy in an end-to-end fashion, leading to suboptimality. In this work, we develop a novel HIL algorithm based on Adversarial Inverse Reinforcement Learning (AIRL) and adapt it with the Expectation-Maximization algorithm to directly recover a hierarchical policy from unannotated demonstrations. Further, we introduce a directed information term into the objective function to enhance causality and propose a Variational Autoencoder framework for learning with our objectives in an end-to-end fashion. Theoretical justifications and evaluations on challenging robotic control tasks demonstrate the superiority of our algorithm. The code is available at https://github.com/LucasCJYSDL/HierAIRL.
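For readers unfamiliar with the AIRL building block the abstract refers to, the sketch below shows the standard AIRL discriminator form and the reward it recovers. This is a minimal illustration of the general technique, not the authors' hierarchical implementation; the scalar inputs and function names are assumptions for clarity.

```python
import math

def airl_discriminator(f_value, policy_prob):
    """Standard AIRL discriminator: D(s, a) = exp(f(s, a)) / (exp(f(s, a)) + pi(a|s)).

    f_value is the learned reward/advantage estimate f(s, a);
    policy_prob is the current policy's probability pi(a|s).
    """
    ef = math.exp(f_value)
    return ef / (ef + policy_prob)

def airl_reward(f_value, policy_prob):
    """Reward recovered by AIRL: log D - log(1 - D), which simplifies to
    f(s, a) - log pi(a|s)."""
    d = airl_discriminator(f_value, policy_prob)
    return math.log(d) - math.log(1.0 - d)
```

The hierarchical variant described in the abstract extends this adversarial objective over option-augmented trajectories and learns the latent subtask assignments with Expectation-Maximization; that machinery is specific to the paper and is not reproduced here.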

Related research

06/10/2021  Adversarial Option-Aware Hierarchical Imitation Learning
It has been a challenge to learning skills for an agent from long-horizo...

05/22/2023  Multi-task Hierarchical Adversarial Inverse Reinforcement Learning
Multi-task Imitation Learning (MIL) aims to train a policy capable of pe...

04/13/2020  Imitation Learning for Fashion Style Based on Hierarchical Multimodal Representation
Fashion is a complex social phenomenon. People follow fashion styles fro...

03/22/2021  Online Baum-Welch algorithm for Hierarchical Imitation Learning
The options framework for hierarchical reinforcement learning has increa...

12/16/2021  Learning from Guided Play: A Scheduled Hierarchical Approach for Improving Exploration in Adversarial Imitation Learning
Effective exploration continues to be a significant challenge that preve...

10/27/2021  Object-Aware Regularization for Addressing Causal Confusion in Imitation Learning
Behavioral cloning has proven to be effective for learning sequential de...

08/10/2023  RLSAC: Reinforcement Learning enhanced Sample Consensus for End-to-End Robust Estimation
Robust estimation is a crucial and still challenging task, which involve...
