Resolving Implicit Coordination in Multi-Agent Deep Reinforcement Learning with Deep Q-Networks & Game Theory

12/08/2020
by Griffin Adams, et al.

We address two major challenges of implicit coordination in multi-agent deep reinforcement learning (MADRL), non-stationarity and exponential growth of the state-action space, by combining Deep Q-Networks for policy learning with Nash equilibrium for action selection. Q-values serve as proxies for payoffs in Nash settings, and mutual best responses define joint action selection. Coordination is implicit because the cases of multiple or no Nash equilibria are resolved deterministically. We demonstrate that knowledge of the game type leads to an assumption of mirrored best responses and faster convergence than Nash-Q. Specifically, the Friend-or-Foe algorithm demonstrates signs of convergence to a Set Controller which jointly chooses actions for two agents. This is encouraging given the highly unstable nature of decentralized coordination over joint actions. Inspired by the dueling network architecture, which decouples the Q-function into state and advantage streams, as well as residual networks, we learn both a single-agent and a joint-agent representation, and merge them via element-wise addition. This simplifies coordination by recasting it as learning a residual function. We also draw high-level comparative insights on key MADRL and game-theoretic variables: competitive vs. cooperative, asynchronous vs. parallel learning, greedy vs. socially optimal Nash equilibria tie-breaking, and strategies for the no-Nash-equilibrium case. We evaluate on three custom environments written in Python using OpenAI Gym: a Predator Prey environment, an alternating Warehouse environment, and a Synchronization environment. Each environment requires successively more coordination to achieve positive rewards.
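
The action-selection idea in the abstract (Q-values as payoffs, joint actions chosen as mutual best responses, deterministic tie-breaking) can be illustrated with a minimal two-agent sketch. This is not the authors' implementation. The function names, the reading of "greedy" as agent 1 maximizing its own Q-value, and the fallback used when no pure Nash equilibrium exists are all assumptions made for illustration; the abstract does not specify them.

```python
from itertools import product
from typing import List, Tuple

Matrix = List[List[float]]  # q[a1][a2]: Q-value of joint action (a1, a2) in one state


def pure_nash_equilibria(q1: Matrix, q2: Matrix) -> List[Tuple[int, int]]:
    """Return every joint action where each agent best-responds to the other."""
    n1, n2 = len(q1), len(q1[0])
    equilibria = []
    for a1, a2 in product(range(n1), range(n2)):
        # Agent 1 cannot gain by deviating while agent 2 keeps a2.
        best_for_1 = all(q1[a1][a2] >= q1[b][a2] for b in range(n1))
        # Agent 2 cannot gain by deviating while agent 1 keeps a1.
        best_for_2 = all(q2[a1][a2] >= q2[a1][b] for b in range(n2))
        if best_for_1 and best_for_2:
            equilibria.append((a1, a2))
    return equilibria


def select_joint_action(q1: Matrix, q2: Matrix, tie_break: str = "social") -> Tuple[int, int]:
    """Pick one joint action deterministically.

    tie_break="social": maximize the summed Q-values (hypothetical reading of
    "socially optimal"). tie_break="greedy": maximize agent 1's own Q-value
    (hypothetical reading of "greedy").
    """
    candidates = pure_nash_equilibria(q1, q2)
    if not candidates:
        # Assumed fallback for the no-equilibrium case: consider all joint
        # actions and maximize the summed Q-values.
        candidates = list(product(range(len(q1)), range(len(q1[0]))))
        tie_break = "social"
    if tie_break == "social":
        def score(a: Tuple[int, int]) -> float:
            return q1[a[0]][a[1]] + q2[a[0]][a[1]]
    else:
        def score(a: Tuple[int, int]) -> float:
            return q1[a[0]][a[1]]
    # max() keeps the first maximal candidate, and candidates are generated in
    # lexicographic order, so remaining ties resolve the same way every time.
    return max(candidates, key=score)
```

Because both agents can run the same deterministic rule on the same Q-values, they arrive at the same joint action without communicating, which is the sense in which coordination is implicit here.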


09/08/2019

Bi-level Actor-Critic for Multi-agent Coordination

Coordination is one of the essential problems in multi-agent systems. Ty...
10/21/2020

On Information Asymmetry in Competitive Multi-Agent Reinforcement Learning: Convergence and Optimality

In this work, we study the system of interacting non-cooperative two Q-l...
07/13/2022

Approximate Nash Equilibrium Learning for n-Player Markov Games in Dynamic Pricing

We investigate Nash equilibrium learning in a competitive Markov Game (M...
07/01/2022

Average submodularity of maximizing anticoordination in network games

We consider the control of decentralized learning dynamics for agents in...
02/18/2021

Strategic bidding in freight transport using deep reinforcement learning

This paper presents a multi-agent reinforcement learning algorithm to re...
09/13/2018

Simulation-based Distributed Coordination Maximization over Networks

In various online/offline multi-agent networked environments, it is very...
07/18/2021

Distributed Planning for Serving Cooperative Tasks with Time Windows: A Game Theoretic Approach

We study distributed planning for multi-robot systems to provide optimal...