Theoretical Analysis of Offline Imitation With Supplementary Dataset

01/27/2023
by Ziniu Li, et al.

Behavioral cloning (BC) can recover a good policy from abundant expert data, but may fail when expert data is insufficient. This paper considers a setting where, besides a small amount of expert data, a supplementary dataset is available that can be collected cheaply from sub-optimal policies. Imitation learning with a supplementary dataset is an emerging practical framework, but its theoretical foundation remains under-developed. To advance understanding, we first investigate a direct extension of BC, called NBCU, that learns from the union of all available data. Our analysis shows that, although NBCU suffers an imitation gap that is larger than that of BC in the worst case, there exist special cases where NBCU performs at least as well as BC. This finding implies that noisy data can also be helpful if used carefully. We therefore introduce a discriminator-based importance sampling technique to re-weight the supplementary data, yielding the WBCU method. With our newly developed landscape-based analysis, we prove that WBCU can outperform BC under mild conditions. Empirical studies show that WBCU achieves the best performance on both of two challenging tasks where prior state-of-the-art methods fail.
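The contrast between learning from the plain union (NBCU) and learning from a re-weighted union (WBCU) can be illustrated with a minimal tabular sketch. Everything below is an assumption made for illustration and is not taken from the paper: the helper names `bc_policy` and `density_ratio_weights`, the toy data, and the choice to replace a trained discriminator with exact empirical frequencies, converting its output c into a weight via w = c / (1 - c).

```python
import numpy as np

def bc_policy(states, actions, n_states, n_actions, weights=None):
    """Tabular (optionally weighted) behavioral cloning: action frequencies per state."""
    if weights is None:
        weights = np.ones(len(states))
    counts = np.zeros((n_states, n_actions))
    np.add.at(counts, (states, actions), weights)
    totals = counts.sum(axis=1, keepdims=True)
    # States with no (weighted) data fall back to a uniform policy.
    uniform = np.full_like(counts, 1.0 / n_actions)
    return np.where(totals > 0, counts / np.maximum(totals, 1e-12), uniform)

def density_ratio_weights(exp_s, exp_a, uni_s, uni_a, n_states, n_actions):
    """Hypothetical tabular stand-in for a discriminator.

    c = d_E / (d_E + d_U) is computed from empirical frequencies, and each
    union sample receives the weight w = c / (1 - c), i.e. d_E / d_U.
    """
    d_e = np.zeros((n_states, n_actions))
    d_u = np.zeros((n_states, n_actions))
    np.add.at(d_e, (exp_s, exp_a), 1.0 / len(exp_s))
    np.add.at(d_u, (uni_s, uni_a), 1.0 / len(uni_s))
    c = d_e / np.maximum(d_e + d_u, 1e-12)
    w = c / np.maximum(1.0 - c, 1e-12)
    return w[uni_s, uni_a]

n_states, n_actions = 2, 2

# Toy expert data: the expert always takes action 0.
exp_s = np.array([0, 1])
exp_a = np.array([0, 0])

# Toy supplementary data: mostly the sub-optimal action 1.
sup_s = np.array([0, 0, 0, 1, 1, 1])
sup_a = np.array([1, 1, 1, 1, 1, 0])

# Union of expert and supplementary data.
uni_s = np.concatenate([exp_s, sup_s])
uni_a = np.concatenate([exp_a, sup_a])

# NBCU-style: unweighted BC on the union.
nbcu = bc_policy(uni_s, uni_a, n_states, n_actions)

# WBCU-style: BC on the union with importance weights.
w = density_ratio_weights(exp_s, exp_a, uni_s, uni_a, n_states, n_actions)
wbcu = bc_policy(uni_s, uni_a, n_states, n_actions, weights=w)

print("NBCU-style policy:\n", nbcu)
print("WBCU-style policy:\n", wbcu)
```

In this toy the unweighted policy is pulled toward the sub-optimal action in state 0, while the weighted policy puts all mass on the expert's action. Because the weights here are exact empirical ratios, the weighted estimate simply reproduces the expert's empirical distribution; the sketch shows only the re-weighting mechanics, not the conditions under which a learned discriminator would help beyond BC.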


research
06/11/2022

Model-based Offline Imitation Learning with Non-expert Data

Although Behavioral Cloning (BC) in theory suffers compounding errors, i...
research
03/03/2023

How To Guide Your Learner: Imitation Learning with Active Adaptive Expert Involvement

Imitation learning aims to mimic the behavior of experts without explici...
research
07/01/2022

Discriminator-Guided Model-Based Offline Imitation Learning

Offline imitation learning (IL) is a powerful method to solve decision-m...
research
08/03/2022

Understanding Adversarial Imitation Learning in Small Sample Regime: A Stage-coupled Analysis

Imitation learning learns a policy from expert trajectories. While the e...
research
06/11/2023

Provably Efficient Adversarial Imitation Learning with Unknown Transitions

Imitation learning (IL) has proven to be an effective method for learnin...
research
05/30/2022

Minimax Optimal Online Imitation Learning via Replay Estimation

Online imitation learning is the problem of how best to mimic expert dem...
