Decentralized MCTS via Learned Teammate Models

03/19/2020
by Aleksander Czechowski, et al.

A key difficulty of cooperative decentralized planning lies in making accurate predictions about the decisions of other agents. In this paper we present a policy improvement operator for learning to plan in iterated cooperative multi-agent scenarios. At each application of our method, a selected agent learns to approximate the policies of its teammates from data gathered in past simulations. Under the assumption of ideal function approximation, successive iterations of our algorithm are guaranteed to improve the policies and eventually converge to a Nash equilibrium in a coordinate-ascent manner. We combine the policy improvement operator with the decentralized Monte Carlo Tree Search (MCTS) planning method and demonstrate the algorithm on several scenarios in the spatial task allocation problem introduced in (Claes et al., 2015). We show that deep learning with convolutional neural networks can be employed efficiently to produce policy approximators that exploit the spatial features of the problem, and that the proposed algorithm improves on baseline planning performance in particularly challenging domain configurations.
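The coordinate-ascent behaviour described in the abstract can be illustrated with a toy sketch. This is not the paper's algorithm: it assumes a two-agent matrix game with a shared payoff, replaces the learned teammate models with exact knowledge of the teammate's current action (the "ideal function approximation" case), and uses a plain argmax in place of MCTS. All names (`PAYOFF`, `best_response`, `coordinate_ascent`, `is_nash`) are hypothetical.

```python
import numpy as np

# Shared payoff for a cooperative two-agent game (hypothetical values).
# Rows index agent 0's action, columns index agent 1's action.
PAYOFF = np.array([
    [1, 3, 0],
    [0, 4, 0],
    [0, 0, 5],
])


def best_response(payoff, agent, joint):
    """Action of `agent` that maximizes the shared payoff with the teammate fixed."""
    if agent == 0:
        return int(np.argmax(payoff[:, joint[1]]))
    return int(np.argmax(payoff[joint[0], :]))


def coordinate_ascent(payoff, joint, max_rounds=100):
    """Let agents take turns improving their own action against a fixed teammate.

    Assumption: each agent predicts its teammate perfectly, so every accepted
    update strictly increases the shared payoff.
    """
    joint = list(joint)
    values = [float(payoff[joint[0], joint[1]])]
    for _ in range(max_rounds):
        improved = False
        for agent in (0, 1):
            candidate = list(joint)
            candidate[agent] = best_response(payoff, agent, joint)
            # Accept only strict improvements so ties cannot cause cycling.
            if payoff[candidate[0], candidate[1]] > payoff[joint[0], joint[1]]:
                joint = candidate
                improved = True
            values.append(float(payoff[joint[0], joint[1]]))
        if not improved:
            break  # no agent can improve unilaterally
    return tuple(joint), values


def is_nash(payoff, joint):
    """True if neither agent can raise the shared payoff by deviating alone."""
    value = payoff[joint[0], joint[1]]
    return bool(value >= payoff[:, joint[1]].max()
                and value >= payoff[joint[0], :].max())


final_joint, payoff_trace = coordinate_ascent(PAYOFF, (0, 0))
print(final_joint, payoff_trace[-1], is_nash(PAYOFF, final_joint))
```

Starting from the joint action (0, 0), the shared payoff never decreases and the procedure stops at (1, 1) with payoff 4, which is a Nash equilibrium of this toy game but not its global optimum of 5 at (2, 2). That is consistent with the abstract's claim of convergence to a Nash equilibrium rather than to an optimal joint policy.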

research
01/25/2019

Distributed Policy Iteration for Scalable Approximation of Cooperative Multi-Agent Policies

Decision making in multi-agent systems (MAS) is a great challenge due to...
research
07/25/2018

Decentralized Cooperative Planning for Automated Vehicles with Hierarchical Monte Carlo Tree Search

Today's automated vehicles lack the ability to cooperate implicitly with...
research
02/02/2023

Best Possible Q-Learning

Fully decentralized learning, where the global information, i.e., the ac...
research
11/28/2019

Option-critic in cooperative multi-agent systems

In this paper, we investigate learning temporal abstractions in cooperat...
research
10/17/2018

Multi-Agent Fully Decentralized Off-Policy Learning with Linear Convergence Rates

In this paper we develop a fully decentralized algorithm for policy eval...
research
10/26/2019

Decentralized Cooperative Communication-less Multi-Agent Task Assignment with Monte-Carlo Tree Search

Cooperative task assignment is an important subject in multi-agent syste...
research
06/07/2023

Policy-Based Self-Competition for Planning Problems

AlphaZero-type algorithms may stop improving on single-player tasks in c...
