Developing cooperative policies for multi-stage reinforcement learning tasks

05/11/2022
by Jordan Erskine, et al.

Many hierarchical reinforcement learning (HRL) algorithms use a series of independent skills as a basis for solving tasks at a higher level of reasoning. These algorithms do not consider the value of using skills that are cooperative rather than independent. This paper proposes the Cooperative Consecutive Policies (CCP) method, which enables consecutive agents to cooperatively solve long-horizon, multi-stage tasks. The method modifies the policy of each agent so that it maximises both its own critic and the critic of the next agent. Cooperatively maximising critics allows each agent to take actions that benefit its own task as well as subsequent tasks. In a multi-room maze domain and a peg-in-hole manipulation domain, the cooperative policies outperformed a set of naive policies, a single agent trained across the entire domain, and another sequential HRL algorithm.
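
The idea of a policy that maximises both its own critic and the next agent's critic can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the paper: the names `cooperative_score` and `select_action`, the `weight` parameter, the convex combination of the two critic values, the toy quadratic critics, and the choice to evaluate both critics on the same state and action.

```python
from typing import Callable, Sequence

# Hypothetical critic signature: (state, action) -> estimated value.
Critic = Callable[[float, float], float]


def cooperative_score(q_current: Critic, q_next: Critic,
                      state: float, action: float,
                      weight: float = 0.5) -> float:
    """Mix the current and next agent's critic values.

    `weight` is an assumed trade-off parameter: 0.0 ignores the next
    agent (a naive policy), 1.0 ignores the current agent's own task.
    """
    return (1.0 - weight) * q_current(state, action) + weight * q_next(state, action)


def select_action(q_current: Critic, q_next: Critic, state: float,
                  candidates: Sequence[float], weight: float = 0.5) -> float:
    """Pick the candidate action with the highest mixed critic value."""
    return max(
        candidates,
        key=lambda a: cooperative_score(q_current, q_next, state, a, weight),
    )


# Toy critics (assumptions): each stage prefers a different action.
def q_stage1(state: float, action: float) -> float:
    return -(action - 1.0) ** 2  # stage 1 alone prefers action 1.0


def q_stage2(state: float, action: float) -> float:
    return -(action - 3.0) ** 2  # stage 2 would prefer action 3.0


candidates = [0.0, 1.0, 2.0, 3.0, 4.0]
naive_action = select_action(q_stage1, q_stage2, 0.0, candidates, weight=0.0)
cooperative_action = select_action(q_stage1, q_stage2, 0.0, candidates, weight=0.5)
print(naive_action, cooperative_action)
```

With `weight=0.0` the agent picks the action that is best for its own stage only, while with `weight=0.5` it settles on a compromise that also scores well under the next stage's critic. In an actor-critic setting the same mixed value would presumably enter the policy loss rather than a discrete search over candidates.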

research
07/01/2020

Developing cooperative policies for multi-stage tasks

This paper proposes the Cooperative Soft Actor Critic (CSAC) method of e...
research
11/09/2022

Solving Collaborative Dec-POMDPs with Deep Reinforcement Learning Heuristics

WQMIX, QMIX, QTRAN, and VDN are SOTA algorithms for Dec-POMDP. All of th...
research
02/01/2022

Planner-Reasoner Inside a Multi-task Reasoning Agent

We consider the problem of multi-task reasoning (MTR), where an agent ca...
research
11/05/2021

Learning to Cooperate with Unseen Agent via Meta-Reinforcement Learning

Ad hoc teamwork problem describes situations where an agent has to coope...
research
01/12/2012

Sparse Reward Processes

We introduce a class of learning problems where the agent is presented w...
research
09/18/2018

SCC-rFMQ Learning in Cooperative Markov Games with Continuous Actions

Although many reinforcement learning methods have been proposed for lear...
research
12/05/2019

Inter-Level Cooperation in Hierarchical Reinforcement Learning

This article presents a novel algorithm for promoting cooperation betwee...