Multiagent Rollout Algorithms and Reinforcement Learning

09/30/2019
by Dimitri Bertsekas, et al.

We consider finite and infinite horizon dynamic programming problems, where the control at each stage consists of several distinct decisions, each one made by one of several agents. We introduce an algorithm in which, at every stage, each agent's decision is made by executing a local rollout algorithm that uses a base policy together with some coordinating information from the other agents. The amount of local computation required at every stage by each agent is independent of the number of agents, while the amount of global computation (over all agents) grows linearly with the number of agents. By contrast, with the standard rollout algorithm, the amount of global computation grows exponentially with the number of agents. Despite the drastic reduction in required computation, we show that our algorithm has the fundamental cost improvement property of rollout: improved performance relative to the base policy. We also explore related reinforcement learning and approximate policy iteration algorithms, and we discuss how this cost improvement property is affected when we attempt to further improve the method's computational efficiency by parallelizing the agents' computations.
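
The contrast between linear and exponential computation can be illustrated with a minimal sketch. It is not the paper's implementation: the toy dynamics, costs, do-nothing base policy, horizon, and all function names below are hypothetical, and it assumes that the "coordinating information" is simply the decisions already chosen by earlier agents, with later agents held at their base-policy decisions.

```python
from itertools import product

ACTIONS = (-1, 0, 1)   # hypothetical per-agent decision set
HORIZON = 3            # hypothetical number of base-policy stages to simulate


def stage_cost(x, u):
    # Toy cost: penalize distance from the origin and any nonzero decision.
    return 10 * x * x + sum(abs(ui) for ui in u)


def next_state(x, u):
    # Toy dynamics: every agent's decision shifts the shared scalar state.
    return x + sum(u)


def base_policy(x, m):
    # Toy base policy: every one of the m agents does nothing.
    return (0,) * m


def q_value(x, u):
    # Cost of applying joint control u now, then following the base policy.
    total, x = stage_cost(x, u), next_state(x, u)
    for _ in range(HORIZON):
        v = base_policy(x, len(u))
        total, x = total + stage_cost(x, v), next_state(x, v)
    return total


def multiagent_rollout(x, m):
    # One agent at a time: agent i minimizes over its own decision only,
    # with earlier agents fixed at their chosen decisions and later agents
    # fixed at their base-policy decisions.
    joint = list(base_policy(x, m))
    evaluations = 0
    for i in range(m):
        best_u, best_q = joint[i], None
        for ui in ACTIONS:
            candidate = tuple(joint[:i] + [ui] + joint[i + 1:])
            q = q_value(x, candidate)
            evaluations += 1
            if best_q is None or q < best_q:
                best_u, best_q = ui, q
        joint[i] = best_u
    return tuple(joint), evaluations


def standard_rollout(x, m):
    # All agents at once: minimize over every joint control.
    best, best_q, evaluations = None, None, 0
    for candidate in product(ACTIONS, repeat=m):
        q = q_value(x, candidate)
        evaluations += 1
        if best_q is None or q < best_q:
            best, best_q = candidate, q
    return best, evaluations


x0, m = 5, 4
u_multi, n_multi = multiagent_rollout(x0, m)
u_std, n_std = standard_rollout(x0, m)
print(u_multi, n_multi)  # 3 * 4 = 12 Q-value evaluations
print(u_std, n_std)      # 3 ** 4 = 81 Q-value evaluations
```

In this sketch each agent's base-policy decision is always among its candidates, so the Q-value cannot increase as the agents are processed in turn. This mirrors, for the toy problem only, the cost improvement relative to the base policy that the abstract describes.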

research
05/04/2020

Multiagent Value Iteration Algorithms in Dynamic Programming and Reinforcement Learning

We consider infinite horizon dynamic programming problems, where the con...
research
05/24/2023

Distributed Online Rollout for Multivehicle Routing in Unmapped Environments

In this work we consider a generalization of the well-known multivehicle...
research
02/18/2020

Constrained Multiagent Rollout and Multidimensional Assignment with the Auction Algorithm

We consider an extension of the rollout algorithm that applies to constr...
research
11/09/2020

Multiagent Rollout and Policy Iteration for POMDP with Application to Multi-Robot Repair Problems

In this paper we consider infinite horizon discounted dynamic programmin...
research
04/19/2021

Approximate Multi-Agent Fitted Q Iteration

We formulate an efficient approximation for multi-agent batch reinforcem...
research
10/13/2019

Global-Local Metamodel Assisted Two-Stage Optimization via Simulation

To integrate strategic, tactical and operational decisions, the two-stag...
