Approximate Multi-Agent Fitted Q Iteration

04/19/2021
by   Antoine Lesage-Landry, et al.

We formulate an efficient approximation for multi-agent batch reinforcement learning, the approximate multi-agent fitted Q iteration (AMAFQI). We present a detailed derivation of our approach. We propose an iterative policy search and show that it yields a greedy policy with respect to multiple approximations of the centralized, standard Q-function. In each iteration and policy evaluation, AMAFQI requires a number of computations that scales linearly with the number of agents, whereas the analogous number of computations increases exponentially for the fitted Q iteration (FQI), one of the most commonly used approaches in batch reinforcement learning. This property of AMAFQI is fundamental for the design of a tractable multi-agent approach. We evaluate the performance of AMAFQI and compare it to FQI in numerical simulations. Numerical examples illustrate the significant reduction in computation time when using AMAFQI instead of FQI in multi-agent problems and corroborate the similar decision-making performance of both approaches.
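To give a rough sense of the linear-versus-exponential scaling described above, the following counting sketch may help. It is not the AMAFQI algorithm itself. It assumes, purely for illustration, that every agent has the same finite number of actions, that a centralized greedy step enumerates every joint action, and that a per-agent step enumerates only each agent's own actions. The function names and the choice of 4 actions per agent are hypothetical.

```python
def joint_action_count(num_agents: int, actions_per_agent: int) -> int:
    """Joint actions enumerated by a centralized maximization (assumed model)."""
    return actions_per_agent ** num_agents


def per_agent_action_count(num_agents: int, actions_per_agent: int) -> int:
    """Actions enumerated when each agent maximizes over its own actions only
    (assumed model, used here only to illustrate linear growth)."""
    return num_agents * actions_per_agent


if __name__ == "__main__":
    # Hypothetical setting: 4 actions per agent, growing number of agents.
    for m in (1, 2, 5, 10):
        print(m, joint_action_count(m, 4), per_agent_action_count(m, 4))
```

Under these assumptions, ten agents with four actions each give 4^10 = 1,048,576 joint actions to enumerate, against 40 per-agent evaluations, which is the kind of gap that makes a centralized maximization impractical as the number of agents grows.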

research
01/25/2019

Distributed Policy Iteration for Scalable Approximation of Cooperative Multi-Agent Policies

Decision making in multi-agent systems (MAS) is a great challenge due to...
research
05/11/2022

Efficient Distributed Framework for Collaborative Multi-Agent Reinforcement Learning

Multi-agent reinforcement learning for incomplete information environmen...
research
11/30/2022

Global Convergence of Localized Policy Iteration in Networked Multi-Agent Reinforcement Learning

We study a multi-agent reinforcement learning (MARL) problem where the a...
research
09/30/2019

Multiagent Rollout Algorithms and Reinforcement Learning

We consider finite and infinite horizon dynamic programming problems, wh...
research
02/23/2023

Concept Learning for Interpretable Multi-Agent Reinforcement Learning

Multi-agent robotic systems are increasingly operating in real-world env...
research
11/18/2019

Dynamic exploration of multi-agent systems with timed periodic tasks

We formalise and study multi-agent timed models MAPTs (Multi-Agent with ...
research
01/30/2013

Flexible and Approximate Computation through State-Space Reduction

In the real world, insufficient information, limited computation resourc...
