Inducing Cooperative behaviour in Sequential-Social dilemmas through Multi-Agent Reinforcement Learning using Status-Quo Loss

01/15/2020
by Pinkesh Badjatiya, et al.

In social dilemma situations, individual rationality leads to sub-optimal group outcomes. Several human engagements can be modeled as sequential (multi-step) social dilemmas. However, in contrast to humans, Deep Reinforcement Learning agents trained to optimize individual rewards in sequential social dilemmas converge to selfish, mutually harmful behavior. We introduce a status-quo loss (SQLoss) that encourages an agent to stick to the status quo rather than repeatedly changing its policy. We show how agents trained with SQLoss evolve cooperative behavior in several social dilemma matrix games. To work with social dilemma games that have visual input, we propose GameDistill. GameDistill uses self-supervision and clustering to automatically extract cooperative and selfish policies from a social dilemma game. We combine GameDistill and SQLoss to show how agents evolve socially desirable cooperative behavior in the Coin Game.
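
The opening claim, that individual rationality leads to sub-optimal group outcomes, can be illustrated with a one-shot matrix game. The sketch below is not taken from the paper: the payoff values are a hypothetical Prisoner's Dilemma chosen for illustration, and the names `PAYOFF`, `best_response`, and `joint_payoff` are assumptions introduced here. It does not implement SQLoss or GameDistill; it only shows why independently optimizing agents end up at mutual defection.

```python
# Hypothetical Prisoner's Dilemma payoffs (illustrative values, not from the paper).
# Actions: 0 = cooperate, 1 = defect.
# PAYOFF[my_action][other_action] is the reward to "me".
PAYOFF = [
    [-1, -3],  # I cooperate: other cooperates / other defects
    [0, -2],   # I defect:    other cooperates / other defects
]

ACTIONS = (0, 1)


def best_response(other_action):
    """Return the action that maximizes my own payoff against a fixed opponent action."""
    return max(ACTIONS, key=lambda a: PAYOFF[a][other_action])


def joint_payoff(a, b):
    """Return the sum of both players' payoffs in the symmetric game."""
    return PAYOFF[a][b] + PAYOFF[b][a]


# Defection is the best response whatever the other player does...
selfish = (best_response(0), best_response(1))
# ...yet mutual defection yields a lower total than mutual cooperation.
defect_total = joint_payoff(1, 1)
cooperate_total = joint_payoff(0, 0)
```

With these assumed payoffs, defecting is individually optimal against either opponent action, while the joint payoff of mutual cooperation exceeds that of mutual defection. That gap is the dilemma the abstract refers to.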

