A Coupled Flow Approach to Imitation Learning

04/29/2023
by Gideon Freund, et al.

In reinforcement learning and imitation learning, an object of central importance is the state distribution induced by the policy. It plays a crucial role in the policy gradient theorem, and references to it, along with the related state-action distribution, can be found all across the literature. Despite its importance, the state distribution is mostly discussed indirectly and theoretically, rather than being modeled explicitly. The reason is an absence of appropriate density estimation tools. In this work, we investigate applications of a normalizing flow-based model for the aforementioned distributions. In particular, we use a pair of flows coupled through the optimality point of the Donsker-Varadhan representation of the Kullback-Leibler (KL) divergence, for distribution-matching-based imitation learning. Our algorithm, Coupled Flow Imitation Learning (CFIL), achieves state-of-the-art performance on benchmark tasks with a single expert trajectory and extends naturally to a variety of other settings, including the subsampled and state-only regimes.
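
For readers unfamiliar with the Donsker-Varadhan representation mentioned above, it states that KL(p || q) is the supremum over functions f of E_p[f] - log E_q[exp(f)], and that the supremum is attained at f = log(p/q) plus any constant. The following is a minimal sketch that checks this numerically on a toy discrete example. It is not the authors' implementation: the distributions `p` and `q`, the helper names `kl_divergence` and `dv_objective`, and the test functions are all assumptions chosen for illustration, and the abstract does not specify how CFIL's flows are coupled beyond naming this optimality point.

```python
import math

def kl_divergence(p, q):
    """Exact KL(p || q) for two discrete distributions given as lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def dv_objective(f, p, q):
    """Donsker-Varadhan objective E_p[f] - log E_q[exp(f)]."""
    expect_p = sum(pi * fi for pi, fi in zip(p, f))
    log_expect_q = math.log(sum(qi * math.exp(fi) for qi, fi in zip(q, f)))
    return expect_p - log_expect_q

# Hypothetical toy distributions over three outcomes (illustration only).
p = [0.5, 0.3, 0.2]
q = [0.2, 0.3, 0.5]

# The optimality point: the log density ratio, unique up to a constant.
f_star = [math.log(pi / qi) for pi, qi in zip(p, q)]
f_shifted = [fi + 1.7 for fi in f_star]   # same optimum, shifted by a constant
f_other = [0.0, 1.0, -1.0]                # an arbitrary non-optimal function

kl = kl_divergence(p, q)
dv_at_opt = dv_objective(f_star, p, q)
dv_shifted = dv_objective(f_shifted, p, q)
dv_other = dv_objective(f_other, p, q)

print(f"KL(p||q)          = {kl:.6f}")
print(f"DV at log(p/q)    = {dv_at_opt:.6f}")
print(f"DV at log(p/q)+c  = {dv_shifted:.6f}")
print(f"DV at other f     = {dv_other:.6f}")
```

The objective matches the exact KL divergence at the log-ratio (and at any constant shift of it), and falls below it for the arbitrary function, which is what makes the log density ratio a natural quantity to model when two density estimators are available.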

research
06/06/2021

SoftDICE for Imitation Learning: Rethinking Off-policy Distribution Matching

We present SoftDICE, which achieves state-of-the-art performance for imi...
research
08/04/2021

A Pragmatic Look at Deep Imitation Learning

The introduction of the generative adversarial imitation learning (GAIL)...
research
02/15/2020

Universal Value Density Estimation for Imitation Learning and Goal-Conditioned Reinforcement Learning

This work considers two distinct settings: imitation learning and goal-c...
research
08/31/2018

Imitation Learning for Neural Morphological String Transduction

We employ imitation learning to train a neural transition-based string t...
research
08/03/2022

Understanding Adversarial Imitation Learning in Small Sample Regime: A Stage-coupled Analysis

Imitation learning learns a policy from expert trajectories. While the e...
research
05/15/2019

Simitate: A Hybrid Imitation Learning Benchmark

We present Simitate --- a hybrid benchmarking suite targeting the evalua...
research
06/25/2020

Strictly Batch Imitation Learning by Energy-based Distribution Matching

Consider learning a policy purely on the basis of demonstrated behavior—...
