Model-Based Opponent Modeling

08/04/2021
by   Xiaopeng Yu, et al.

When an agent interacts with a multi-agent environment, dealing with previously unseen opponents is challenging. Modeling the behaviors, goals, or beliefs of opponents can help the agent adapt its policy to different opponents. It is also important to consider opponents that are learning simultaneously or are capable of reasoning. However, existing work usually tackles only one of these types of opponent. In this paper, we propose model-based opponent modeling (MBOM), which employs the environment model to adapt to all kinds of opponents. MBOM simulates the recursive reasoning process in the environment model and imagines a set of improving opponent policies. To represent the opponent policy effectively and accurately, MBOM further mixes the imagined opponent policies according to their similarity with the real behaviors of opponents. Empirically, we show that MBOM achieves more effective adaptation than existing methods in both competitive and cooperative environments, against different types of opponents: fixed policy, naïve learner, and reasoning learner.
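
The mixing step described above can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the similarity measure, so the example assumes similarity is the likelihood of the observed opponent actions under each imagined policy, normalized into mixture weights. The function names, the tabular policy representation, and the `temperature` parameter are all hypothetical.

```python
import math


def mixture_weights(candidate_policies, observed, temperature=1.0):
    """Weight each imagined opponent policy by how well it explains observed play.

    candidate_policies: list of dicts mapping state -> list of action probabilities.
    observed: list of (state, action_index) pairs taken by the real opponent.
    Assumption: similarity is the log-likelihood of the observations,
    turned into weights with a softmax.
    """
    log_liks = []
    for policy in candidate_policies:
        ll = 0.0
        for state, action in observed:
            # Clamp to avoid log(0) when a policy assigns zero probability.
            ll += math.log(max(policy[state][action], 1e-12))
        log_liks.append(ll / temperature)
    # Numerically stable softmax over the log-likelihoods.
    peak = max(log_liks)
    exps = [math.exp(l - peak) for l in log_liks]
    total = sum(exps)
    return [e / total for e in exps]


def mixed_policy(candidate_policies, weights, state):
    """Action distribution of the weighted mixture of candidate policies."""
    n_actions = len(candidate_policies[0][state])
    return [
        sum(w * policy[state][a] for w, policy in zip(weights, candidate_policies))
        for a in range(n_actions)
    ]


# Two hypothetical imagined policies over a single state "s" with two actions.
policy_a = {"s": [0.9, 0.1]}
policy_b = {"s": [0.2, 0.8]}
candidates = [policy_a, policy_b]

# The real opponent mostly chose action 1, so policy_b should dominate.
observations = [("s", 1), ("s", 1), ("s", 0)]
weights = mixture_weights(candidates, observations)
mixed = mixed_policy(candidates, weights, "s")
```

In this sketch a lower `temperature` concentrates the weight on the best-matching imagined policy, while a higher one keeps the mixture closer to uniform.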
