Do deep reinforcement learning agents model intentions?

05/15/2018
by Tambet Matiisen, et al.

Inferring other agents' mental states, such as their knowledge, beliefs and intentions, is thought to be essential for effective interactions with other agents. Recently, multi-agent systems trained via deep reinforcement learning have been shown to succeed at solving different tasks, but it remains unclear how each agent models or represents the other agents in its environment. In this work we test whether deep reinforcement learning agents explicitly represent other agents' intentions (their specific aims or goals) during a task in which the agents have to coordinate the covering of different spots in a 2D environment. In particular, we track over time the performance of a linear decoder trained to predict the final goal of all agents from the hidden state of each agent's neural network controller. We observe that the hidden layers of the agents represent explicit information about other agents' goals, i.e. the target landmarks they end up covering. We also perform a series of experiments, in which some agents are replaced by others with fixed goals, to test how well the trained agents generalize. We find that during training the agents develop a differential preference for each goal, which hinders generalization. To alleviate this problem, we propose simple changes to the MADDPG training algorithm that lead to better generalization against unseen agents. We believe that training protocols which promote more active intention-reading mechanisms, e.g. by preventing simple symmetry-breaking solutions, are a promising direction towards achieving more robust generalization in different cooperative and competitive tasks.
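
The core analysis described above is a linear decoding (probing) study: a linear classifier is fit from an agent's hidden activations to the landmark each agent eventually covers, and its accuracy is tracked over the course of an episode. Below is a minimal sketch of such a probe in Python with scikit-learn; the array names, shapes and placeholder data are illustrative assumptions, not taken from the paper's code.

    # Minimal linear-decoder ("probe") sketch: predict an agent's final goal
    # from another agent's hidden state at a given timestep.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    n_episodes, hidden_dim, n_landmarks = 500, 64, 3   # illustrative sizes

    # Placeholder stand-ins for logged rollouts: one hidden-state vector per
    # episode (taken at a fixed timestep) and the index of the landmark that
    # the observed agent ended up covering.
    hidden_states = rng.normal(size=(n_episodes, hidden_dim))
    final_goals = rng.integers(0, n_landmarks, size=n_episodes)

    X_train, X_test, y_train, y_test = train_test_split(
        hidden_states, final_goals, test_size=0.2, random_state=0)

    # The decoder is linear: multinomial logistic regression from hidden
    # activations to the eventual goal.
    decoder = LogisticRegression(max_iter=1000)
    decoder.fit(X_train, y_train)
    print("goal decoding accuracy:", decoder.score(X_test, y_test))

    # Refitting the probe on hidden states taken at successive timesteps
    # traces how early in an episode the other agents' goals become decodable.
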
