Multi-agent Hierarchical Reinforcement Learning with Dynamic Termination

10/21/2019
by Dongge Han, et al.

In a multi-agent system, an agent's optimal policy will typically depend on the policies chosen by others. Therefore, a key issue in multi-agent systems research is that of predicting the behaviours of others, and responding promptly to changes in such behaviours. One obvious possibility is for each agent to broadcast their current intention, for example, the currently executed option in a hierarchical reinforcement learning framework. However, this approach results in inflexibility of agents if options have an extended duration and are dynamic. While adjusting the executed option at each step improves flexibility from a single-agent perspective, frequent changes in options can induce inconsistency between an agent's actual behaviour and its broadcast intention. In order to balance flexibility and predictability, we propose a dynamic termination Bellman equation that allows the agents to flexibly terminate their options. We evaluate our model empirically on a set of multi-agent pursuit and taxi tasks, and show that our agents learn to adapt flexibly across scenarios that require different termination behaviours.
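
The abstract does not state the dynamic termination Bellman equation itself, so the following is only a minimal sketch of the trade-off it describes, not the authors' method. It assumes a hypothetical table of option values `q_values` and a hypothetical switching penalty `delta`: the agent keeps its currently broadcast option unless another option is better by more than the penalty, which discourages frequent changes while still allowing termination.

```python
from typing import Dict


def choose_option(q_values: Dict[str, float], current: str, delta: float) -> str:
    """Illustrative rule: keep the current option unless switching pays off.

    q_values: hypothetical estimated value of each option in the current state.
    current:  the option the agent is executing (and has broadcast).
    delta:    hypothetical penalty for terminating the current option.
    """
    # Best alternative according to the (assumed) option-value estimates.
    best = max(q_values, key=q_values.get)
    # Terminate only if the gain from switching exceeds the penalty.
    if q_values[best] - delta > q_values[current]:
        return best
    return current


if __name__ == "__main__":
    values = {"chase": 1.0, "wait": 1.2}
    # Large penalty: the agent stays predictable and keeps its option.
    print(choose_option(values, "chase", delta=0.5))
    # Small penalty: the agent flexibly switches to the better option.
    print(choose_option(values, "chase", delta=0.1))
```

With `delta = 0` the rule reduces to re-selecting the greedy option at every step (maximal flexibility); a very large `delta` means the current option is almost never interrupted (maximal predictability).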

