Stackelberg Games for Learning Emergent Behaviors During Competitive Autocurricula

05/04/2023
by Boling Yang, et al.

Autocurricular training is an important sub-area of multi-agent reinforcement learning (MARL) in which multiple agents learn emergent skills through an unsupervised co-evolving scheme. The robotics community has experimented with autocurricular training on physically grounded problems, such as robust control and interactive manipulation tasks. However, the asymmetric nature of these tasks makes it challenging to generate sophisticated policies. Indeed, asymmetry in the environment may implicitly or explicitly give an advantage to a subset of agents, which could in turn lead to a low-quality equilibrium. This paper proposes a novel game-theoretic algorithm, Stackelberg Multi-Agent Deep Deterministic Policy Gradient (ST-MADDPG), which formulates a two-player MARL problem as a Stackelberg game with one player as the 'leader' and the other as the 'follower' in a hierarchical interaction structure wherein the leader has an advantage. We first demonstrate that the leader's advantage from ST-MADDPG can be used to alleviate the inherent asymmetry in the environment. By exploiting the leader's advantage, ST-MADDPG improves the quality of a co-evolution process and results in more sophisticated and complex strategies that work well even against an unseen strong opponent.
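
The abstract does not spell out the update rule, so the following is only a minimal sketch of the general Stackelberg idea, not the authors' ST-MADDPG implementation. It assumes a hypothetical two-player game with scalar actions and hand-picked quadratic costs, and it uses plain gradient descent instead of deep policy gradients. The leader differentiates through the follower's best response (via the implicit function theorem), while the follower takes an ordinary gradient step. All function names and cost functions below are illustrative assumptions.

```python
# Toy Stackelberg vs. simultaneous gradient play on a hypothetical quadratic game.
# Both players minimize their own cost; x is the leader's action, y the follower's.

def leader_cost(x, y):
    # Hypothetical leader cost, chosen only for illustration.
    return 0.5 * (x - 1.0) ** 2 + 0.5 * y ** 2

def follower_cost(x, y):
    # Hypothetical follower cost; its best response to x is y = x.
    return 0.5 * y ** 2 - x * y

# Analytic partial derivatives of the costs above.
def d_leader_dx(x, y):
    return x - 1.0

def d_leader_dy(x, y):
    return y

def d_follower_dy(x, y):
    return y - x

D2_FOLLOWER_DYDY = 1.0   # second derivative of follower_cost w.r.t. y
D2_FOLLOWER_DYDX = -1.0  # mixed derivative of follower_cost w.r.t. y then x

def stackelberg_leader_grad(x, y):
    # Implicit function theorem on d_follower_dy(x, y*(x)) = 0 gives
    # dy*/dx = -(mixed derivative) / (second derivative in y).
    dy_dx = -D2_FOLLOWER_DYDX / D2_FOLLOWER_DYDY
    # Total derivative of the leader's cost along the follower's response.
    return d_leader_dx(x, y) + dy_dx * d_leader_dy(x, y)

def train(leader_is_stackelberg, steps=2000, lr=0.05):
    x, y = 0.0, 0.0
    for _ in range(steps):
        if leader_is_stackelberg:
            gx = stackelberg_leader_grad(x, y)
        else:
            gx = d_leader_dx(x, y)  # leader ignores the follower's reaction
        gy = d_follower_dy(x, y)
        x, y = x - lr * gx, y - lr * gy  # both players update simultaneously
    return x, y

if __name__ == "__main__":
    for mode in (True, False):
        x, y = train(leader_is_stackelberg=mode)
        name = "stackelberg" if mode else "simultaneous"
        print(name, round(x, 4), round(y, 4), round(leader_cost(x, y), 4))
```

On this toy game the Stackelberg run settles near x = y = 0.5 and the simultaneous run near x = y = 1, so the leader ends with a lower cost when it anticipates the follower's response. This only illustrates the leader's structural advantage that the abstract refers to; it makes no claim about the paper's experimental results.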


