Sample-Efficient On-Policy Imitation Learning from Observations

06/16/2023
by Joao A. Candido Ramos, et al.

Imitation learning from demonstrations (ILD) aims to alleviate numerous shortcomings of reinforcement learning through the use of demonstrations. However, in most real-world applications, expert action guidance is absent, which rules out ILD. Instead, we consider imitation learning from observations (ILO), where no expert actions are provided, a significantly more challenging problem. Existing methods often employ on-policy learning, which is known to be sample-costly. This paper presents SEILO, a novel sample-efficient on-policy algorithm for ILO that combines standard adversarial imitation learning with inverse dynamics modeling. This approach enables the agent to receive feedback from both the adversarial procedure and a behavior cloning loss. We empirically demonstrate that the proposed algorithm requires fewer interactions with the environment to achieve expert performance than other state-of-the-art on-policy ILO and ILD methods.
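To make the two feedback signals concrete, here is a minimal sketch of how an inverse dynamics model can supply pseudo-actions for a behavior cloning loss while a discriminator over state transitions supplies an adversarial reward. It is not the SEILO implementation: the abstract does not specify architectures, losses, or how the two signals are weighted, so the linear models, the toy dynamics s' = s + a, and every function name below are assumptions chosen for illustration only.

```python
import numpy as np

def fit_inverse_dynamics(states, next_states, actions):
    """Hypothetical linear inverse dynamics model: a ~ [s, s', 1] @ W."""
    X = np.hstack([states, next_states, np.ones((len(states), 1))])
    W, *_ = np.linalg.lstsq(X, actions, rcond=None)
    return W

def infer_actions(W, states, next_states):
    """Label state-only transitions with actions predicted by the model."""
    X = np.hstack([states, next_states, np.ones((len(states), 1))])
    return X @ W

def bc_loss(policy_W, states, pseudo_actions):
    """Mean squared behavior cloning loss of a linear policy a = s @ policy_W."""
    return float(np.mean((states @ policy_W - pseudo_actions) ** 2))

def adversarial_reward(disc_w, states, next_states):
    """Reward -log(1 - D(s, s')) from a logistic discriminator on transitions."""
    logits = np.hstack([states, next_states]) @ disc_w
    d = 1.0 / (1.0 + np.exp(-logits))
    return -np.log(1.0 - d + 1e-8)

rng = np.random.default_rng(0)

# Agent transitions (actions known), from the assumed toy dynamics s' = s + a.
agent_s = rng.normal(size=(256, 2))
agent_a = rng.normal(size=(256, 2))
agent_s_next = agent_s + agent_a
W = fit_inverse_dynamics(agent_s, agent_s_next, agent_a)

# Expert transitions: only (s, s') is visible to the learner.
expert_s = rng.normal(size=(64, 2))
expert_a_true = -0.5 * expert_s          # hidden; kept only to check the sketch
expert_s_next = expert_s + expert_a_true
pseudo_a = infer_actions(W, expert_s, expert_s_next)

# Behavior cloning on the pseudo-labelled expert data (closed-form linear fit).
loss_before = bc_loss(np.zeros((2, 2)), expert_s, pseudo_a)
policy_W, *_ = np.linalg.lstsq(expert_s, pseudo_a, rcond=None)
loss_after = bc_loss(policy_W, expert_s, pseudo_a)

# Adversarial signal: an untrained, randomly initialised discriminator
# scoring the agent's transitions.
disc_w = rng.normal(size=4)
rewards = adversarial_reward(disc_w, agent_s, agent_s_next)
```

In this toy setting the inverse dynamics model is exact, so the pseudo-actions match the hidden expert actions; with real environments the model is learned from the agent's own rollouts and is only approximate, and the discriminator would be trained rather than random.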

research
09/09/2018

Addressing Sample Inefficiency and Reward Bias in Inverse Reinforcement Learning

The Generative Adversarial Imitation Learning (GAIL) framework from Ho &...
research
02/25/2021

Off-Policy Imitation Learning from Observations

Learning from Observations (LfO) is a practical reinforcement learning s...
research
01/21/2020

Loss-annealed GAIL for sample efficient and stable Imitation Learning

Imitation learning is the problem of learning a policy from an expert po...
research
06/12/2018

Model-Based Imitation Learning with Accelerated Convergence

Sample efficiency is critical in solving real-world reinforcement learni...
research
06/22/2022

Imitation Learning for Generalizable Self-driving Policy with Sim-to-real Transfer

Imitation Learning uses the demonstrations of an expert to uncover the o...
research
03/04/2022

Plan Your Target and Learn Your Skills: Transferable State-Only Imitation Learning via Decoupled Policy Optimization

Recent progress in state-only imitation learning extends the scope of ap...
research
09/18/2020

Compressed imitation learning

In analogy to compressed sensing, which allows sample-efficient signal r...
