Model-based Multi-Agent Reinforcement Learning with Cooperative Prioritized Sweeping

01/15/2020
by   Eugenio Bargiacchi, et al.
0

We present a new model-based reinforcement learning algorithm, Cooperative Prioritized Sweeping, for efficient learning in multi-agent Markov decision processes. The algorithm allows for sample-efficient learning on large problems by exploiting a factorization to approximate the value function. Our approach only requires knowledge about the structure of the problem in the form of a dynamic decision network. Using this information, our method learns a model of the environment and performs temporal difference updates which affect multiple joint states and actions at once. Batch updates are additionally performed which efficiently back-propagate knowledge throughout the factored Q-function. Our method outperforms the state-of-the-art algorithm sparse cooperative Q-learning algorithm, both on the well-known SysAdmin benchmark and randomized environments.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
04/20/2022

Mingling Foresight with Imagination: Model-Based Cooperative Multi-Agent Reinforcement Learning

Recently, model-based agents have achieved better performance than model...
research
01/23/2019

Robust temporal difference learning for critical domains

We present a new Q-function operator for temporal difference (TD) learni...
research
02/14/2022

Provably Efficient Causal Model-Based Reinforcement Learning for Systematic Generalization

In the sequential decision making setting, an agent aims to achieve syst...
research
03/14/2022

Efficient Model-based Multi-agent Reinforcement Learning via Optimistic Equilibrium Computation

We consider model-based multi-agent reinforcement learning, where the en...
research
04/04/2023

Risk-Aware Distributed Multi-Agent Reinforcement Learning

Autonomous cyber and cyber-physical systems need to perform decision-mak...
research
08/16/2020

The reinforcement learning-based multi-agent cooperative approach for the adaptive speed regulation on a metallurgical pickling line

We present a holistic data-driven approach to the problem of productivit...
research
02/16/2018

Reactive Reinforcement Learning in Asynchronous Environments

The relationship between a reinforcement learning (RL) agent and an asyn...

Please sign up or login with your details

Forgot password? Click here to reset