Bi-level Latent Variable Model for Sample-Efficient Multi-Agent Reinforcement Learning

04/12/2023
by Aravind Venugopal, et al.

Despite their potential in real-world applications, multi-agent reinforcement learning (MARL) algorithms often suffer from high sample complexity. To address this issue, we present a novel model-based MARL algorithm, BiLL (Bi-Level Latent Variable Model-based Learning), that learns a bi-level latent variable model from high-dimensional inputs. At the top level, the model learns latent representations of the global state, which encode global information relevant to behavior learning. At the bottom level, it learns latent representations for each agent, given the global latent representations from the top level. The model generates latent trajectories to use for policy learning. We evaluate our algorithm on complex multi-agent tasks in the challenging SMAC and Flatland environments. Our algorithm outperforms state-of-the-art model-free and model-based baselines in sample efficiency, including on two extremely challenging Super Hard SMAC maps.
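The two-level conditioning described above can be illustrated with a minimal sketch. It is not the BiLL implementation: the class name `BiLevelLatentSketch`, the linear-Gaussian mappings, the fixed noise scale, and all dimensions are hypothetical choices made only to show the structure, in which a global latent is sampled from the global state and each agent's latent is then sampled given that global latent and the agent's own observation. Training, actions, rewards, and latent trajectory generation are left out.

```python
import random


def linear(weights, x):
    """Multiply a weight matrix (a list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def random_matrix(rows, cols, rng):
    """Create a small random weight matrix (stand-in for a learned network)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def sample_gaussian(mean, std, rng):
    """Sample each coordinate from a Gaussian with the given mean and std."""
    return [rng.gauss(m, std) for m in mean]


class BiLevelLatentSketch:
    """Hypothetical two-level latent encoder (illustration only)."""

    def __init__(self, state_dim, obs_dim, global_dim, agent_dim, n_agents, seed=0):
        self.rng = random.Random(seed)
        self.std = 0.1  # assumed fixed noise scale
        # Top level: global state -> mean of the global latent.
        self.w_global = random_matrix(global_dim, state_dim, self.rng)
        # Bottom level: [global latent ; agent observation] -> mean of agent latent.
        self.w_agent = [
            random_matrix(agent_dim, global_dim + obs_dim, self.rng)
            for _ in range(n_agents)
        ]

    def encode(self, global_state, observations):
        # Top level: sample the global latent from the global state.
        z_global = sample_gaussian(
            linear(self.w_global, global_state), self.std, self.rng
        )
        # Bottom level: sample one latent per agent, conditioned on z_global.
        z_agents = [
            sample_gaussian(linear(w, z_global + obs), self.std, self.rng)
            for w, obs in zip(self.w_agent, observations)
        ]
        return z_global, z_agents


model = BiLevelLatentSketch(
    state_dim=4, obs_dim=3, global_dim=2, agent_dim=2, n_agents=3
)
z_global, z_agents = model.encode(
    [0.1, 0.2, 0.3, 0.4],
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
)
print(len(z_global), len(z_agents))
```

In this sketch every agent shares the same sampled global latent, so information about the global state reaches each agent-level latent through a single top-level variable rather than through separate per-agent encodings.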

