SA-IGA: A Multiagent Reinforcement Learning Method Towards Socially Optimal Outcomes

03/08/2018
by Chengwei Zhang, et al.

In multiagent environments, the ability to learn is important for an agent to behave appropriately in the face of unknown opponents and a dynamic environment. From the system designer's perspective, it is desirable for agents to learn to coordinate towards socially optimal outcomes while avoiding exploitation by selfish opponents. To this end, we propose a novel gradient-ascent-based algorithm (SA-IGA) that augments the basic gradient-ascent algorithm by incorporating social awareness into the policy update process. We theoretically analyze the learning dynamics of SA-IGA using dynamical systems theory and show that SA-IGA has linear dynamics for a wide range of games, including symmetric games. The learning dynamics of two representative games (the prisoner's dilemma and the coordination game) are analyzed in detail. Building on the idea of SA-IGA, we further propose a practical multiagent learning algorithm, SA-PGA, based on the Q-learning update rule. Simulation results show that an SA-PGA agent achieves higher social welfare than the earlier social-optimality-oriented Conditional Joint Action Learner (CJAL) and is also robust against individually rational opponents, reaching Nash equilibrium solutions.
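The idea of folding social awareness into a gradient-ascent policy update can be illustrated with a small sketch. The abstract does not state the SA-IGA update rule, so everything below is an assumption made for illustration: a fixed weight `w` that blends each agent's own payoff with the average payoff of both agents, the payoff values of the prisoner's dilemma, the step size, and the function names. It is not the authors' algorithm.

```python
# Illustrative sketch only: gradient ascent on a blended objective in a
# 2x2 game. The fixed weight w and all names here are assumptions; the
# abstract does not specify how SA-IGA sets or adapts social awareness.

# Prisoner's dilemma payoffs, action 0 = cooperate, action 1 = defect.
# ROW[i][j] / COL[i][j]: payoff when row plays i and column plays j.
ROW = [[3.0, 0.0], [5.0, 1.0]]
COL = [[3.0, 5.0], [0.0, 1.0]]


def blend(own, other, w):
    """Return (1 - w) * own + w * average(own, other), entry by entry."""
    return [
        [(1 - w) * own[i][j] + w * 0.5 * (own[i][j] + other[i][j])
         for j in range(2)]
        for i in range(2)
    ]


def clip(x):
    """Project a probability back onto [0, 1]."""
    return max(0.0, min(1.0, x))


def run_blended_gradient_ascent(w, steps=2000, lr=0.01, p=0.5, q=0.5):
    """Simultaneous gradient ascent for both players.

    p: probability that the row player cooperates.
    q: probability that the column player cooperates.
    """
    u_row = blend(ROW, COL, w)
    u_col = blend(COL, ROW, w)
    for _ in range(steps):
        # Partial derivative of the row player's expected utility w.r.t. p.
        grad_p = q * (u_row[0][0] - u_row[1][0]) + (1 - q) * (u_row[0][1] - u_row[1][1])
        # Partial derivative of the column player's expected utility w.r.t. q.
        grad_q = p * (u_col[0][0] - u_col[0][1]) + (1 - p) * (u_col[1][0] - u_col[1][1])
        p, q = clip(p + lr * grad_p), clip(q + lr * grad_q)
    return p, q


if __name__ == "__main__":
    for w in (0.0, 1.0):
        print(w, run_blended_gradient_ascent(w))
```

With these assumed payoffs, `w = 0` reduces to plain gradient ascent and both players drift to mutual defection, while `w = 1` makes both players climb the average payoff and settle on mutual cooperation.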

