Towards Cooperation in Sequential Prisoner's Dilemmas: a Deep Multiagent Reinforcement Learning Approach

03/01/2018
by Weixun Wang, et al.

The Iterated Prisoner's Dilemma has guided research on social dilemmas for decades. However, it distinguishes between only two atomic actions: cooperate and defect. In real-world prisoner's dilemmas, these choices are temporally extended and different strategies may correspond to sequences of actions, reflecting grades of cooperation. We introduce a Sequential Prisoner's Dilemma (SPD) game to better capture the aforementioned characteristics. In this work, we propose a deep multiagent reinforcement learning approach that investigates the evolution of mutual cooperation in SPD games. Our approach consists of two phases. The first phase is offline: it synthesizes policies with different cooperation degrees and then trains a cooperation degree detection network. The second phase is online: an agent adaptively selects its policy based on the detected degree of opponent cooperation. The effectiveness of our approach is demonstrated in two representative SPD 2D games: the Apple-Pear game and the Fruit Gathering game. Experimental results show that our strategy can avoid being exploited by exploitative opponents and achieve cooperation with cooperative opponents.
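
The online phase described in the abstract, in which an agent picks its policy according to the detected degree of opponent cooperation, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the policy bank `POLICY_BANK`, the `detect_degree` function (a simple counting stub standing in for the trained cooperation degree detection network), the "C"/"D" action encoding, and the nearest-degree matching rule in `select_policy` are all hypothetical names and choices introduced for this example.

```python
# Hypothetical policy bank mapping a cooperation degree in [0, 1] to a policy label.
# In the paper, policies with different cooperation degrees are synthesized offline;
# here plain strings stand in for them.
POLICY_BANK = {
    0.0: "defect",
    0.25: "mostly_defect",
    0.5: "mixed",
    0.75: "mostly_cooperate",
    1.0: "cooperate",
}


def detect_degree(opponent_actions):
    """Stand-in for the detection network: fraction of cooperative actions.

    Assumption: each observed action is encoded as "C" (cooperative)
    or "D" (defecting).
    """
    if not opponent_actions:
        # Assumption: with no history, start out fully cooperative.
        return 1.0
    cooperative = sum(1 for action in opponent_actions if action == "C")
    return cooperative / len(opponent_actions)


def select_policy(degree, bank=POLICY_BANK):
    """Return the policy whose cooperation degree is closest to `degree`."""
    closest = min(bank, key=lambda d: abs(d - degree))
    return bank[closest]


if __name__ == "__main__":
    history = ["C", "C", "D", "D"]
    print(select_policy(detect_degree(history)))
```

Matching the opponent's detected degree is one simple reciprocity rule in the spirit of tit-for-tat: a fully defecting history maps to the defecting policy, so the agent is not exploited, while a cooperative history maps to a cooperative policy.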

research
02/10/2017

Multi-agent Reinforcement Learning in Sequential Social Dilemmas

Matrix games like Prisoner's Dilemma have guided research on social dile...
research
03/23/2018

Inequity aversion improves cooperation in intertemporal social dilemmas

Groups of humans are often able to find ways to cooperate with one anoth...
research
06/26/2022

Tackling Asymmetric and Circular Sequential Social Dilemmas with Reinforcement Learning and Graph-based Tit-for-Tat

In many societal and industrial interactions, participants generally pre...
research
12/23/2021

Should transparency be (in-)transparent? On monitoring aversion and cooperation in teams

Many modern organisations employ methods which involve monitoring of emp...
research
03/23/2018

Inequity aversion resolves intertemporal social dilemmas

Groups of humans are often able to find ways to cooperate with one anoth...
research
01/24/2020

Cooperative versus decentralized strategies in three-pursuer single-evader games

The value of cooperation in pursuit-evasion games is investigated. The c...
research
10/19/2017

Consequentialist conditional cooperation in social dilemmas with imperfect information

Social dilemmas, where mutual cooperation can lead to high payoffs but p...
