Adversarially Guided Self-Play for Adopting Social Conventions

01/16/2020
by Mycal Tucker, et al.

Robotic agents must adopt existing social conventions in order to be effective teammates. These social conventions, such as driving on the right or left side of the road, are arbitrary choices among optimal policies, but all agents on a successful team must use the same convention. Prior work has identified a method of combining self-play with paired input-output data gathered from existing agents in order to learn their social convention without interacting with them. We build upon this work by introducing a technique called Adversarial Self-Play (ASP) that uses adversarial training to shape the space of possible learned policies and substantially improves learning efficiency. ASP only requires the addition of unpaired data: a dataset of outputs produced by the social convention without associated inputs. Theoretical analysis reveals how ASP shapes the policy space and the circumstances (when behaviors are clustered or exhibit some other structure) under which it offers the greatest benefits. Empirical results across three domains confirm ASP's advantages: it produces models that more closely match the desired social convention when given as few as two paired datapoints.
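The way unpaired outputs can narrow the space of candidate policies may be easier to see in a toy counting example. The sketch below is not the authors' algorithm: it replaces adversarial training with a hard filter on which outputs are allowed, and the states, messages, and convention are all hypothetical. It assumes a signaling task in which any one-to-one mapping from states to messages is equally optimal under self-play, and counts how many such mappings survive each source of data.

```python
from itertools import permutations

# Hypothetical toy task: every injective state -> message map is an optimal
# self-play policy, so the convention is an arbitrary choice among them.
STATES = ["s0", "s1", "s2"]
MESSAGES = ["a", "b", "c", "d", "e", "f"]

# Hypothetical convention used by the existing agents.
CONVENTION = {"s0": "d", "s1": "a", "s2": "f"}


def candidate_policies(allowed_messages):
    """Enumerate all injective maps from STATES into allowed_messages."""
    return [
        dict(zip(STATES, chosen))
        for chosen in permutations(allowed_messages, len(STATES))
    ]


def consistent(policy, paired):
    """True if the policy reproduces every paired (input, output) example."""
    return all(policy[state] == message for state, message in paired)


# Two paired datapoints, plus unpaired data: outputs with no inputs attached.
paired = [("s0", "d"), ("s1", "a")]
unpaired = sorted(set(CONVENTION.values()))

# Self-play alone leaves every injective map as a candidate.
self_play_only = candidate_policies(MESSAGES)

# Paired data removes maps that contradict the observed examples.
with_paired = [p for p in self_play_only if consistent(p, paired)]

# Stand-in for adversarial shaping: only outputs seen in the unpaired data
# are allowed, and the paired examples are then applied on top.
with_unpaired = [p for p in candidate_policies(unpaired) if consistent(p, paired)]

print(len(self_play_only), len(with_paired), len(with_unpaired))
```

In this toy setting the two paired datapoints alone leave several candidates, while restricting outputs to those seen in the unpaired data leaves only the convention itself. A learned discriminator would impose this kind of restriction softly rather than as a hard filter.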


research
09/18/2018

Towards Abstraction in ASP with an Application on Reasoning about Agent Policies

ASP programs are a convenient tool for problem solving, whereas with lar...
research
12/11/2018

plasp 3: Towards Effective ASP Planning

We describe the new version of the PDDL-to-ASP translator plasp. First, ...
research
03/06/2021

Off-Belief Learning

The standard problem setting in Dec-POMDPs is self-play, where the goal ...
research
01/28/2020

Towards Learning Multi-agent Negotiations via Self-Play

Making sophisticated, robust, and safe sequential decisions is at the he...
research
06/08/2020

A Comparison of Self-Play Algorithms Under a Generalized Framework

Throughout scientific history, overarching theoretical frameworks have a...
research
12/06/2021

Invitation in Crowdsourcing Contests

In a crowdsourcing contest, a requester holding a task posts it to a cro...
