Reinforcement Learning In Two Player Zero Sum Simultaneous Action Games

10/10/2021
by Patrick Phillips, et al.

Two-player zero-sum simultaneous-action games are common in video games, financial markets, war, business competition, and many other settings. We first introduce the fundamental concepts of reinforcement learning in two-player zero-sum simultaneous-action games and discuss the unique challenges this type of game poses. We then introduce two novel agents that attempt to handle these challenges using joint-action Deep Q-Networks (DQN). The first agent, the Best Response AgenT (BRAT), builds an explicit model of its opponent's policy using imitation learning and then uses this model to find the best response that exploits the opponent's strategy. The second agent, Meta-Nash DQN, builds an implicit model of its opponent's policy in order to produce a context variable that is used as part of the Q-value calculation. An explicit minimax over Q-values is used to find actions close to Nash equilibrium. We find empirically that both agents converge to Nash equilibrium in a self-play setting for simple matrix games, while also performing well in games with larger state and action spaces. These novel algorithms are evaluated against vanilla RL algorithms as well as recent state-of-the-art multi-agent and two-agent algorithms. This work combines ideas from traditional reinforcement learning, game theory, and meta-learning.
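To make the two ideas in the abstract concrete, here is a minimal sketch of a best response and a minimax over a joint-action Q-table for a single state. It is not the authors' implementation: the function names `best_response` and `pure_maximin`, the table layout (rows are our actions, columns are the opponent's), and the example payoffs are all assumptions made for illustration. The minimax here is restricted to pure strategies, so it only gives a lower bound on the game value when the equilibrium is mixed.

```python
def best_response(q, opponent_policy):
    """Return the action with the highest expected Q-value against a
    modeled opponent policy (a probability vector over opponent actions).

    q[a][b] is the Q-value when we play a and the opponent plays b.
    """
    expected = [sum(v * p for v, p in zip(row, opponent_policy)) for row in q]
    return max(range(len(q)), key=lambda a: expected[a])


def pure_maximin(q):
    """Return (action, value) maximizing the worst-case Q-value over the
    opponent's actions. Pure strategies only; a mixed equilibrium would
    need a linear program or an iterative solver instead.
    """
    worst_case = [min(row) for row in q]
    action = max(range(len(q)), key=lambda a: worst_case[a])
    return action, worst_case[action]


# Illustrative rock-paper-scissors payoffs for the row player
# (order: rock, paper, scissors).
RPS = [
    [0.0, -1.0, 1.0],
    [1.0, 0.0, -1.0],
    [-1.0, 1.0, 0.0],
]

# Hypothetical opponent model that over-plays rock.
rock_heavy = [0.6, 0.2, 0.2]
exploit_action = best_response(RPS, rock_heavy)  # paper (index 1)

# A hypothetical 2x2 table with a pure saddle point at (1, 1).
SADDLE = [
    [3.0, 1.0],
    [4.0, 2.0],
]
safe_action, safe_value = pure_maximin(SADDLE)
```

The best response exploits a specific opponent model and can itself be exploited if that model is wrong, while the minimax action guards against the worst case regardless of what the opponent does.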

research
12/10/2019

ColosseumRL: A Framework for Multiagent Reinforcement Learning in N-Player Games

Much of recent success in multiagent reinforcement learning has been in ...
research
06/12/2022

A Unified Approach to Reinforcement Learning, Quantal Response Equilibria, and Two-Player Zero-Sum Games

Algorithms designed for single-agent reinforcement learning (RL) general...
research
06/08/2020

Learning to Play No-Press Diplomacy with Best Response Policy Iteration

Recent advances in deep reinforcement learning (RL) have led to consider...
research
09/13/2020

Efficient Competitive Self-Play Policy Optimization

Reinforcement learning from self-play has recently reported many success...
research
03/15/2012

Automated Planning in Repeated Adversarial Games

Game theory's prescriptive power typically relies on full rationality an...
research
12/08/2020

Resolving Implicit Coordination in Multi-Agent Deep Reinforcement Learning with Deep Q-Networks Game Theory

We address two major challenges of implicit coordination in multi-agent ...
research
02/27/2020

Learning to Resolve Alliance Dilemmas in Many-Player Zero-Sum Games

Zero-sum games have long guided artificial intelligence research, since ...
