Nearly Minimax Optimal Adversarial Imitation Learning with Known and Unknown Transitions

06/19/2021
by   Tian Xu, et al.

This paper is dedicated to designing provably efficient adversarial imitation learning (AIL) algorithms that directly optimize policies from expert demonstrations. First, we develop a transition-aware AIL algorithm named TAIL with an expert sample complexity of Õ(H^{3/2} |S| / ε) under the known transition setting, where H is the planning horizon, |S| is the state space size, and ε is the desired policy value gap. This improves upon the previous best bound of Õ(H^2 |S| / ε^2) for AIL methods and matches the lower bound of Ω̃(H^{3/2} |S| / ε) in [Rajaraman et al., 2021] up to a logarithmic factor. The key ingredient of TAIL is a fine-grained estimator for the expert state-action distribution, which explicitly utilizes the transition function information. Second, considering practical settings where the transition functions are usually unknown but environment interaction is allowed, we develop a model-based transition-aware AIL algorithm named MB-TAIL. In particular, MB-TAIL builds an empirical transition model by interacting with the environment and performs imitation under the recovered empirical model. The interaction complexity of MB-TAIL is Õ(H^3 |S|^2 |A| / ε^2), which improves on the best known result of Õ(H^4 |S|^2 |A| / ε^2) in [Shani et al., 2021]. Finally, our theoretical results are supported by numerical evaluation and detailed analysis on two challenging MDPs.
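To give a rough sense of the size of these improvements, the sketch below plugs illustrative values into the four rates quoted in the abstract. It is not taken from the paper: the function names and the values of H, |S|, |A| and ε are assumptions chosen for illustration, and constants and logarithmic factors hidden by Õ are dropped, so only the ratios are indicative.

```python
# Illustrative comparison of the rates quoted in the abstract.
# Constants and log factors hidden by the O-tilde notation are ignored,
# and all function names here are hypothetical.

def tail_expert_samples(H, S, eps):
    """TAIL expert sample rate: H^{3/2} |S| / eps."""
    return H ** 1.5 * S / eps

def prior_ail_expert_samples(H, S, eps):
    """Previous best AIL expert sample rate: H^2 |S| / eps^2."""
    return H ** 2 * S / eps ** 2

def mbtail_interactions(H, S, A, eps):
    """MB-TAIL interaction rate: H^3 |S|^2 |A| / eps^2."""
    return H ** 3 * S ** 2 * A / eps ** 2

def prior_interactions(H, S, A, eps):
    """Previous best interaction rate: H^4 |S|^2 |A| / eps^2."""
    return H ** 4 * S ** 2 * A / eps ** 2

# Assumed example problem size (not from the paper).
H, S, A, eps = 100, 20, 5, 0.1

# Ratio of old to new rate; algebraically sqrt(H) / eps.
expert_ratio = prior_ail_expert_samples(H, S, eps) / tail_expert_samples(H, S, eps)

# Ratio of old to new rate; algebraically H.
interaction_ratio = prior_interactions(H, S, A, eps) / mbtail_interactions(H, S, A, eps)

print(f"expert sample rate ratio (old / TAIL): {expert_ratio:.1f}")
print(f"interaction rate ratio (old / MB-TAIL): {interaction_ratio:.1f}")
```

Under these assumptions the expert sample rate shrinks by a factor of √H / ε and the interaction rate by a factor of H, which is the gap between the quoted bounds.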


research
06/11/2023

Provably Efficient Adversarial Imitation Learning with Unknown Transitions

Imitation learning (IL) has proven to be an effective method for learnin...
research
02/25/2021

Provably Breaking the Quadratic Error Compounding Barrier in Imitation Learning, Optimally

We study the statistical limits of Imitation Learning (IL) in episodic M...
research
09/13/2020

Toward the Fundamental Limits of Imitation Learning

Imitation learning (IL) aims to mimic the behavior of an expert policy i...
research
04/25/2022

Imitation Learning from Observations under Transition Model Disparity

Learning to perform tasks by leveraging a dataset of expert observations...
research
07/20/2017

RAIL: Risk-Averse Imitation Learning

Imitation learning algorithms learn viable policies by imitating an expe...
research
02/27/2020

State-only Imitation with Transition Dynamics Mismatch

Imitation Learning (IL) is a popular paradigm for training agents to ach...
research
02/07/2023

Layered State Discovery for Incremental Autonomous Exploration

We study the autonomous exploration (AX) problem proposed by Lim Aue...
