Online Baum-Welch algorithm for Hierarchical Imitation Learning

03/22/2021
by   Vittorio Giammarino, et al.
The options framework for hierarchical reinforcement learning has grown in popularity in recent years and has made progress on the scalability problem in reinforcement learning. Yet most of these recent successes depend on proper option initialization or discovery. When an expert is available, the option discovery problem can be addressed by learning an options-type hierarchical policy directly from expert demonstrations. This problem is referred to as hierarchical imitation learning and can be treated as an inference problem in a Hidden Markov Model, solved with an Expectation-Maximization type algorithm. In this work, we propose a novel online algorithm to perform hierarchical imitation learning in the options framework. Further, we discuss the benefits of such an algorithm and compare it with its batch version on classical reinforcement learning benchmarks. We show that this approach works well in both discrete and continuous environments and that, under certain conditions, it outperforms the batch version.
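
To make the Hidden Markov Model view concrete, the sketch below shows one forward-filtering step over latent options, the kind of recursion that Baum-Welch style methods are built on. It is an illustration under stated assumptions, not the authors' algorithm: it assumes discrete states, actions, and options, a standard options-style generative model (the previous option either terminates and is re-drawn, or persists), and hypothetical lookup tables `pi_hi`, `pi_lo`, and `pi_b` that do not come from the paper.

```python
def forward_step(alpha_prev, s, a, pi_hi, pi_lo, pi_b):
    """One filtering step over latent options (illustrative sketch).

    All table names and layouts below are assumptions for this example:
    alpha_prev[o]  : belief over the previous option given past data
    pi_hi[s][o]    : probability of selecting option o in state s
    pi_lo[s][o][a] : probability of action a in state s under option o
    pi_b[s][o]     : probability that option o terminates in state s
    """
    n = len(alpha_prev)
    # Belief mass whose option terminates in s and is re-drawn from pi_hi.
    terminated = sum(alpha_prev[o] * pi_b[s][o] for o in range(n))
    unnorm = []
    for o in range(n):
        # Option o is active if it was re-drawn after a termination,
        # or if it was already active and did not terminate.
        prior = terminated * pi_hi[s][o] + alpha_prev[o] * (1.0 - pi_b[s][o])
        # Weight by how well option o explains the observed expert action.
        unnorm.append(prior * pi_lo[s][o][a])
    z = sum(unnorm)  # likelihood of the observed action given the past
    if z == 0.0:
        raise ValueError("observed action has zero probability under the model")
    return [u / z for u in unnorm], z


# Toy tables (hypothetical): 2 states, 2 options, 2 actions.
pi_hi = [[0.5, 0.5], [0.5, 0.5]]
pi_lo = [[[0.9, 0.1], [0.2, 0.8]], [[0.9, 0.1], [0.2, 0.8]]]
pi_b = [[0.1, 0.1], [0.1, 0.1]]

# Start from a uniform belief and observe action 1 in state 0.
alpha, z = forward_step([0.5, 0.5], 0, 1, pi_hi, pi_lo, pi_b)
```

Because each step uses only the previous belief and the newest state-action pair, the belief can be updated as demonstrations stream in. A batch method would additionally run a backward pass over the full trajectory before updating parameters.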

research
10/07/2020

Provable Hierarchical Imitation Learning via EM

Due to recent empirical successes, the options framework for hierarchica...
research
06/10/2021

Adversarial Option-Aware Hierarchical Imitation Learning

It has been a challenge to learning skills for an agent from long-horizo...
research
10/15/2017

DDCO: Discovery of Deep Continuous Options for Robot Learning from Demonstrations

An option is a short-term skill consisting of a control policy for a spe...
research
12/04/2018

Hyperbolic Embeddings for Learning Options in Hierarchical Reinforcement Learning

Hierarchical reinforcement learning deals with the problem of breaking d...
research
10/05/2022

Hierarchical Adversarial Inverse Reinforcement Learning

Hierarchical Imitation Learning (HIL) has been proposed to recover highl...
research
03/01/2018

Hierarchical Imitation and Reinforcement Learning

We study the problem of learning policies over long time horizons. We pr...
research
06/07/2023

Divide and Repair: Using Options to Improve Performance of Imitation Learning Against Adversarial Demonstrations

We consider the problem of learning to perform a task from demonstration...
