Active Imitation Learning with Noisy Guidance

05/26/2020 ∙ by Kianté Brantley, et al. ∙ University of Maryland

Imitation learning algorithms provide state-of-the-art results on many structured prediction tasks by learning near-optimal search policies. Such algorithms assume training-time access to an expert that can provide the optimal action at any queried state; unfortunately, the number of such queries is often prohibitive, frequently rendering these approaches impractical. To combat this query complexity, we consider an active learning setting in which the learning algorithm has additional access to a much cheaper heuristic that provides noisy guidance. Our algorithm, LEAQI, learns a difference classifier that predicts when the expert is likely to disagree with the heuristic, and queries the expert only when necessary. We apply LEAQI to three sequence labeling tasks, demonstrating significantly fewer queries to the expert with accuracy comparable to (or better than) that of a passive approach.
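
The query rule described above can be illustrated with a minimal sketch. This is not the LEAQI implementation: the function names (`choose_label`, `label_sequence`) and the callables `heuristic`, `expert`, and `predicts_disagreement` are hypothetical stand-ins, and the sketch assumes the difference classifier is already trained, leaving out how it is learned and how the policy is updated.

```python
from typing import Callable, Hashable, List, Sequence, Tuple

Action = Hashable
State = Sequence[float]


def choose_label(
    state: State,
    heuristic: Callable[[State], Action],
    expert: Callable[[State], Action],
    predicts_disagreement: Callable[[State, Action], bool],
) -> Tuple[Action, bool]:
    """Return (action, queried_expert) for one state.

    The cheap heuristic is always consulted. The expensive expert is
    queried only if the (hypothetical) difference classifier predicts
    that the expert would disagree with the heuristic's action.
    """
    guess = heuristic(state)
    if predicts_disagreement(state, guess):
        return expert(state), True
    return guess, False


def label_sequence(
    states: Sequence[State],
    heuristic: Callable[[State], Action],
    expert: Callable[[State], Action],
    predicts_disagreement: Callable[[State, Action], bool],
) -> Tuple[List[Action], int]:
    """Label every state and count how many expert queries were spent."""
    labels: List[Action] = []
    n_queries = 0
    for state in states:
        action, queried = choose_label(
            state, heuristic, expert, predicts_disagreement
        )
        labels.append(action)
        n_queries += int(queried)
    return labels, n_queries
```

With a difference classifier that fires on only some states, `label_sequence` spends fewer expert queries than a passive approach that asks the expert at every state; the price is that any disagreement the classifier fails to predict leaves the noisy heuristic label in place.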



We thank Rob Schapire, Chicheng Zhang, and the anonymous ACL reviewers for very helpful comments and insights. This material is based upon work supported by the National Science Foundation under Grant No. 1618193 and an ACM SIGHPC/Intel Computational and Data Science Fellowship to KB. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation nor of the ACM.


