Coach-Player Multi-Agent Reinforcement Learning for Dynamic Team Composition

05/18/2021
by Bo Liu, et al.

In real-world multiagent systems, agents with different capabilities may join or leave without altering the team's overarching goals. Coordinating teams with such dynamic composition is challenging: the optimal team strategy varies with the composition. We propose COPA, a coach-player framework to tackle this problem. We assume the coach has a global view of the environment and coordinates the players, who only have partial views, by distributing individual strategies. Specifically, we 1) adopt the attention mechanism for both the coach and the players; 2) propose a variational objective to regularize learning; and 3) design an adaptive communication method to let the coach decide when to communicate with the players. We validate our methods on a resource collection task, a rescue game, and the StarCraft micromanagement tasks. We demonstrate zero-shot generalization to new team compositions. Our method achieves comparable or better performance than the setting where all players have a full view of the environment. Moreover, we see that the performance remains high even when the coach communicates as little as 13% of the time using the adaptive communication strategy.
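The adaptive communication idea can be illustrated with a small sketch. The abstract does not say how the coach decides when to communicate, so the rule below is an assumption made for illustration: the coach sends a player a new strategy vector only when it differs from the one the player already holds by more than a threshold in L2 distance. The names `adaptive_broadcast`, `l2_distance`, and `threshold` are hypothetical and are not taken from the paper.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def adaptive_broadcast(new_strategies, held_strategies, threshold):
    """Decide, per player, whether the coach sends a new strategy.

    Assumed rule (not confirmed by the abstract): a message is sent only
    when the proposed strategy is farther than `threshold` from the
    strategy the player currently holds. Otherwise the player keeps
    acting on its old strategy and no communication happens.
    Returns (updated_strategies, sent_flags).
    """
    updated, sent = [], []
    for new, old in zip(new_strategies, held_strategies):
        if l2_distance(new, old) > threshold:
            updated.append(list(new))  # coach communicates
            sent.append(True)
        else:
            updated.append(list(old))  # coach stays silent
            sent.append(False)
    return updated, sent


# Two players with toy 2-dimensional strategy vectors.
held = [[0.0, 0.0], [1.0, 1.0]]
proposed = [[0.1, 0.0], [3.0, 1.0]]

held, sent = adaptive_broadcast(proposed, held, threshold=0.5)
communication_rate = sum(sent) / len(sent)
```

In this toy run only the second player's strategy changes enough to trigger a message. Under this assumed rule, raising `threshold` lowers the fraction of steps on which the coach communicates, at the cost of players acting on staler strategies.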
