Inducing Stackelberg Equilibrium through Spatio-Temporal Sequential Decision-Making in Multi-Agent Reinforcement Learning

04/20/2023
by Bin Zhang, et al.

In multi-agent reinforcement learning (MARL), self-interested agents attempt to establish equilibrium and achieve coordination depending on the game structure. However, existing MARL approaches are mostly bound by the simultaneous actions of all agents in the Markov game (MG) framework, and few works consider the formation of equilibrium strategies via asynchronous action coordination. In view of the advantages of Stackelberg equilibrium (SE) over Nash equilibrium, we construct a spatio-temporal sequential decision-making structure derived from the MG and propose an N-level policy model based on a conditional hypernetwork shared by all agents. This approach allows for asymmetric training with symmetric execution, with each agent responding optimally conditioned on the decisions made by superior agents. Agents can learn heterogeneous SE policies while still sharing parameters, which reduces learning and storage costs and improves scalability as the number of agents increases. Experiments demonstrate that our method converges to the SE policies in repeated matrix game scenarios and performs well in more complex settings, including cooperative and mixed tasks.
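To make the notion of "responding optimally conditioned on the decisions made by superior agents" concrete, the following is a minimal sketch of a pure-strategy Stackelberg solution in a two-player matrix game. It is not the N-level hypernetwork method described in the abstract. The function names (`best_response`, `stackelberg_pure`) and the payoff values are hypothetical and chosen only for illustration, and follower ties are assumed to be broken by the lowest action index.

```python
# Illustrative only: a two-player, pure-strategy Stackelberg solver.
# Payoff tables are indexed as payoff[leader_action][follower_action].

def best_response(follower_payoff, leader_action):
    """Follower's best action given the leader's committed action.

    Assumption: ties are broken by the lowest action index
    (max() returns the first maximal element).
    """
    row = follower_payoff[leader_action]
    return max(range(len(row)), key=lambda a: row[a])


def stackelberg_pure(leader_payoff, follower_payoff):
    """Return (leader_action, follower_action, leader_value).

    The leader evaluates each of its actions under the follower's
    best response and commits to the one with the highest payoff.
    """
    best = None
    for a_l in range(len(leader_payoff)):
        a_f = best_response(follower_payoff, a_l)
        value = leader_payoff[a_l][a_f]
        if best is None or value > best[2]:
            best = (a_l, a_f, value)
    return best


# Hypothetical payoffs. Leader action 0 strictly dominates action 1,
# so simultaneous play ends at (0, 0) with leader payoff 2.
leader_payoff = [[2, 4],
                 [1, 3]]
follower_payoff = [[1, 0],
                   [0, 1]]

result = stackelberg_pure(leader_payoff, follower_payoff)
print(result)
```

In this made-up game the leader does better by committing to its dominated action 1, because the follower then switches to action 1 and the leader receives 3 instead of 2. This is the kind of advantage of SE over Nash equilibrium that the abstract refers to.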

research
05/13/2023

Stackelberg Decision Transformer for Asynchronous Action Coordination in Multi-Agent Systems

Asynchronous action coordination presents a pervasive challenge in Multi...
research
09/08/2019

Bi-level Actor-Critic for Multi-agent Coordination

Coordination is one of the essential problems in multi-agent systems. Ty...
research
07/11/2023

On Imperfect Recall in Multi-Agent Influence Diagrams

Multi-agent influence diagrams (MAIDs) are a popular game-theoretic mode...
research
10/18/2020

Model-free conventions in multi-agent reinforcement learning with heterogeneous preferences

Game theoretic views of convention generally rest on notions of common k...
research
03/27/2021

Dynamic Information Sharing and Punishment Strategies

In this paper we study the problem of information sharing among rational...
research
09/26/2022

Multi-Agent Sequential Decision-Making via Communication

Communication helps agents to obtain information about others so that be...
research
07/05/2020

Decentralized Reinforcement Learning: Global Decision-Making via Local Economic Transactions

This paper seeks to establish a framework for directing a society of sim...
