Behaviour-conditioned policies for cooperative reinforcement learning tasks

10/04/2021
by Antti Keurulainen, et al.

Cooperation among AI systems, and between AI systems and humans, is becoming increasingly important. In many real-world tasks, an agent needs to cooperate with unknown partner agent types. This requires the agent to assess the behaviour of the partner agent during a cooperative task and to adjust its own policy to support the cooperation. Deep reinforcement learning models can be trained to deliver the required functionality but are known to suffer from sample inefficiency and slow learning. Adapting to a partner agent's behaviour during an ongoing task, however, requires the ability to assess the partner agent type quickly. We propose a method in which we synthetically produce populations of agents with different behavioural patterns, together with ground-truth data on their behaviour, and use this data to train a meta-learner. We additionally propose an agent architecture that can use the generated data efficiently and gain the meta-learning capability. An agent equipped with such a meta-learner can quickly adapt to cooperating with unknown partner agent types in new situations. The method can also be used to automatically form a task distribution for meta-training from emerging behaviours that arise, for example, through self-play.
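The data-generation idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `PartnerAgent`, `prefers_left`, `make_population`, and `make_meta_dataset` are hypothetical, and a single scalar behaviour parameter with two discrete actions is an assumption made only for illustration. The sketch shows how a synthetic population can be paired with ground-truth behaviour labels so that a meta-learner could later be trained to infer the partner type from observed actions.

```python
import random
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PartnerAgent:
    """Hypothetical scripted partner; one parameter defines its behaviour."""
    prefers_left: float  # ground-truth probability of choosing action 0

    def act(self, rng: random.Random) -> int:
        # Action 0 with probability prefers_left, otherwise action 1.
        return 0 if rng.random() < self.prefers_left else 1


def make_population(n_agents: int, rng: random.Random) -> List[PartnerAgent]:
    # Sample behaviour parameters uniformly to cover a range of partner types.
    return [PartnerAgent(prefers_left=rng.random()) for _ in range(n_agents)]


def make_meta_dataset(
    population: List[PartnerAgent], episode_len: int, rng: random.Random
) -> List[Tuple[List[int], float]]:
    # Pair each observed action sequence with the ground-truth parameter
    # that generated it; these pairs would be the meta-training examples.
    dataset = []
    for agent in population:
        actions = [agent.act(rng) for _ in range(episode_len)]
        dataset.append((actions, agent.prefers_left))
    return dataset


rng = random.Random(0)  # fixed seed so the sketch is reproducible
population = make_population(n_agents=5, rng=rng)
dataset = make_meta_dataset(population, episode_len=20, rng=rng)
```

Each entry of `dataset` is an (action sequence, behaviour label) pair. In a fuller setting the scripted agents would presumably be replaced by learned policies, for instance ones arising from self-play as the abstract mentions, and the label by whatever ground-truth behaviour data is available.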


research
03/25/2021

A Meta-Reinforcement Learning Approach to Process Control

Meta-learning is a branch of machine learning which aims to quickly adap...
research
10/07/2022

Robotic Control Using Model Based Meta Adaption

In machine learning, meta-learning methods aim for fast adaptability to ...
research
05/06/2020

Safe Reinforcement Learning through Meta-learned Instincts

An important goal in reinforcement learning is to create agents that can...
research
11/05/2021

Learning to Cooperate with Unseen Agent via Meta-Reinforcement Learning

Ad hoc teamwork problem describes situations where an agent has to coope...
research
02/25/2016

Meta-learning within Projective Simulation

Learning models of artificial intelligence can nowadays perform very wel...
research
02/13/2020

Sequential Cooperative Bayesian Inference

Cooperation is often implicitly assumed when learning from other agents....
research
07/27/2021

Open-Ended Learning Leads to Generally Capable Agents

In this work we create agents that can perform well beyond a single, ind...
