AWESOME: A General Multiagent Learning Algorithm that Converges in Self-Play and Learns a Best Response Against Stationary Opponents

07/01/2003
by Vincent Conitzer, et al.

A satisfactory multiagent learning algorithm should, at a minimum, learn to play optimally against stationary opponents and converge to a Nash equilibrium in self-play. The algorithm that has come closest, WoLF-IGA, has been proven to have these two properties in 2-player 2-action repeated games, assuming that the opponent's (mixed) strategy is observable. In this paper we present AWESOME, the first algorithm that is guaranteed to have these two properties in all repeated (finite) games. It requires only that the other players' actual actions (not their strategies) can be observed at each step. It also learns to play optimally against opponents that eventually become stationary. The basic idea behind AWESOME (Adapt When Everybody is Stationary, Otherwise Move to Equilibrium) is to try to adapt to the others' strategies when they appear stationary, but otherwise to retreat to a precomputed equilibrium strategy. The techniques used to prove the properties of AWESOME are fundamentally different from those used for previous algorithms, and may also help in analyzing other multiagent learning algorithms.
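To make the "adapt when stationary, otherwise retreat" idea concrete, here is a minimal Python sketch of that high-level loop. It is not the paper's actual algorithm: the function names, the windowed stationarity test, and the threshold and window parameters are illustrative assumptions, and the paper's schedule of epochs and shrinking tolerances is omitted. The callables `best_response`, `observe_opponent_action`, and `play` are hypothetical hooks the caller would supply.

```python
import random
from collections import Counter

def awesome_sketch(num_actions, equilibrium_strategy, best_response, num_rounds,
                   observe_opponent_action, play,
                   stationarity_threshold=0.1, window=50):
    """Illustrative loop only: play a precomputed equilibrium strategy, switch to
    a best response once the opponent's recent empirical play looks stationary,
    and retreat to the equilibrium whenever it does not."""
    history = []                        # opponent actions observed so far
    mode = "equilibrium"                # current hypothesis about the opponent
    my_strategy = equilibrium_strategy  # mixed strategy as a list of probabilities

    for _ in range(num_rounds):
        action = random.choices(range(num_actions), weights=my_strategy)[0]
        play(action)
        history.append(observe_opponent_action())

        if len(history) >= 2 * window:
            old = Counter(history[-2 * window:-window])
            new = Counter(history[-window:])
            drift = max(abs(old[a] - new[a]) / window for a in range(num_actions))
            if drift <= stationarity_threshold:
                # Opponent looks stationary: adapt by best-responding to the
                # empirical distribution of its recent actions.
                empirical = [new[a] / window for a in range(num_actions)]
                my_strategy = best_response(empirical)
                mode = "adapt"
            else:
                # Opponent appears to be changing: retreat to the equilibrium.
                my_strategy = equilibrium_strategy
                mode = "equilibrium"
    return mode


if __name__ == "__main__":
    # Toy check: a 2-action game against an opponent that always plays action 1.
    # The stand-in best-response oracle simply copies whichever action the
    # opponent plays more often (as in a matching-style payoff).
    opponent_actions = iter(1 for _ in range(10_000))
    final_mode = awesome_sketch(
        num_actions=2,
        equilibrium_strategy=[0.5, 0.5],   # assumed precomputed equilibrium
        best_response=lambda emp: [1.0, 0.0] if emp[0] >= emp[1] else [0.0, 1.0],
        num_rounds=500,
        observe_opponent_action=lambda: next(opponent_actions),
        play=lambda a: None,               # stand-in for submitting a move
    )
    print(final_mode)                      # expected: "adapt"
```

Against the always-plays-1 opponent in the demo, the empirical distribution stabilizes, the drift test passes, and the sketch locks onto the best response; a genuinely non-stationary opponent would keep tripping the test and send the player back to the equilibrium strategy.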


Related research

09/05/2022 - Random Initialization Solves Shapley's Fictitious Play Counterexample
In 1964 Shapley devised a family of games for which fictitious play fail...

06/20/2022 - On the Impossibility of Learning to Cooperate with Adaptive Partner Strategies in Repeated Games
Learning to cooperate with other agents is challenging when those agents...

09/12/2018 - Bayes-ToMoP: A Fast Detection and Best Response Algorithm Towards Sophisticated Opponents
Multiagent algorithms often aim to accurately predict the behaviors of o...

01/15/2013 - Multi-agent learning using Fictitious Play and Extended Kalman Filter
Decentralised optimisation tasks are important components of multi-agent...

02/14/2012 - Filtered Fictitious Play for Perturbed Observation Potential Games and Decentralised POMDPs
Potential games and decentralised partially observable MDPs (Dec-POMDPs)...

02/08/2021 - Last-iterate Convergence of Decentralized Optimistic Gradient Descent/Ascent in Infinite-horizon Competitive Markov Games
We study infinite-horizon discounted two-player zero-sum Markov games, a...

03/21/2022 - Fictitious Play with Maximin Initialization
Fictitious play has recently emerged as the most accurate scalable algor...
