Semi-Supervised Imitation Learning of Team Policies from Suboptimal Demonstrations

05/05/2022
by Sangwon Seo, et al.

We present Bayesian Team Imitation Learner (BTIL), an imitation learning algorithm to model the behavior of teams performing sequential tasks in Markovian domains. In contrast to existing multi-agent imitation learning techniques, BTIL explicitly models and infers the time-varying mental states of team members, thereby enabling learning of decentralized team policies from demonstrations of suboptimal teamwork. Further, to allow for sample- and label-efficient policy learning from small datasets, BTIL employs a Bayesian perspective and is capable of learning from semi-supervised demonstrations. We demonstrate and benchmark the performance of BTIL on synthetic multi-agent tasks as well as a novel dataset of human-agent teamwork. Our experiments show that BTIL can successfully learn team policies from demonstrations despite the influence of team members' (time-varying and potentially misaligned) mental states on their behavior.
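
To make the idea of inferring a time-varying mental state from observed behavior concrete, here is a minimal sketch of forward filtering over a discrete latent variable. It is not the BTIL algorithm: it assumes the policy and the mental-state transition model are already known, whereas the abstract describes learning the policies from demonstrations. The function name `filter_mental_states`, the table layouts, and the toy numbers are all hypothetical and chosen only for illustration.

```python
def filter_mental_states(policy, transition, prior, states, actions):
    """Forward-filter a discrete latent "mental state" for one team member.

    Assumed (hypothetical) table layouts:
      policy[x][s][a]   = P(action a | task state s, mental state x)
      transition[x][x2] = P(next mental state x2 | current mental state x)
      prior[x]          = P(initial mental state x)
    Returns one belief vector P(x_t | observations up to t) per time step.
    """
    n = len(prior)
    belief = list(prior)
    history = []
    for t, (s, a) in enumerate(zip(states, actions)):
        if t > 0:
            # Predict: push the previous belief through the transition model.
            belief = [
                sum(belief[x] * transition[x][x2] for x in range(n))
                for x2 in range(n)
            ]
        # Update: weight each mental state by how well it explains the action.
        belief = [belief[x] * policy[x][s][a] for x in range(n)]
        total = sum(belief)
        if total == 0:
            raise ValueError("observed action has zero probability under every mental state")
        belief = [b / total for b in belief]
        history.append(belief)
    return history


# Toy example: two mental states, one task state, two actions.
# Mental state 0 prefers action 0; mental state 1 prefers action 1.
policy = [
    [[0.9, 0.1]],
    [[0.1, 0.9]],
]
transition = [
    [0.9, 0.1],
    [0.1, 0.9],
]
prior = [0.5, 0.5]

beliefs = filter_mental_states(
    policy, transition, prior, states=[0, 0, 0], actions=[0, 0, 1]
)
for t, b in enumerate(beliefs):
    print(t, [round(p, 3) for p in b])
```

In this toy run the belief favors mental state 0 after the first two actions and shifts toward mental state 1 once the third action is observed, which is the kind of time-varying inference the abstract refers to.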

research
04/01/2019

Generative predecessor models for sample-efficient imitation learning

We propose Generative Predecessor Models for Imitation Learning (GPRIL),...
research
03/01/2023

Automated Task-Time Interventions to Improve Teamwork using Imitation Learning

Effective human-human and human-autonomy teamwork is critical but often ...
research
09/20/2022

LEMURS: Learning Distributed Multi-Robot Interactions

This paper presents LEMURS, an algorithm for learning scalable multi-rob...
research
09/23/2021

Semi-Supervised Imitation Learning with Mixed Qualities of Demonstrations for Autonomous Driving

In this paper, we consider the problem of autonomous driving using imita...
research
02/17/2021

Towards an AI Coach to Infer Team Mental Model Alignment in Healthcare

Shared mental models are critical to team success; however, in practice,...
research
04/12/2019

Few-Shot Bayesian Imitation Learning with Logic over Programs

We describe an expressive class of policies that can be efficiently lear...
research
07/28/2020

Team Deep Mixture of Experts for Distributed Power Control

In the context of wireless networking, it was recently shown that multip...
