A Deep Policy Inference Q-Network for Multi-Agent Systems

12/21/2017
by Zhang-Wei Hong, et al.
We present DPIQN, a deep policy inference Q-network for multi-agent systems composed of controllable agents, collaborators, and opponents that interact with each other. We focus on one challenging issue in such systems, namely modeling agents with varying strategies, and propose "policy features": representations learned from raw observations (e.g., raw images) of collaborators and opponents by inferring their policies. DPIQN incorporates the learned policy features as a hidden vector into its own deep Q-network (DQN), which lets it predict better Q-values for the controllable agents than state-of-the-art deep reinforcement learning models. We further propose an enhanced version of DPIQN, the deep recurrent policy inference Q-network (DRPIQN), for handling partial observability. Both DPIQN and DRPIQN are trained with an adaptive training procedure that shifts the network's attention between learning the policy features and learning its own Q-values at different phases of training. We present a comprehensive analysis of DPIQN and DRPIQN and highlight their effectiveness and generalizability in various multi-agent settings. Our models are evaluated in a classic soccer game involving both competitive and collaborative scenarios. Experiments on 1 vs. 1 and 2 vs. 2 games show that DPIQN and DRPIQN outperform the baseline DQN and deep recurrent Q-network (DRQN) models. We also explore scenarios in which collaborators or opponents dynamically change their policies, and show that DPIQN and DRPIQN achieve better overall performance in terms of stability and mean scores.
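The adaptive training idea described above can be illustrated with a small sketch. The abstract does not give the loss functions or the weighting rule, so everything below is an assumption made for illustration: the cross-entropy policy-inference loss, the one-step TD loss, and in particular the hypothetical `adaptive_weight` schedule, which down-weights the Q loss while the policy-inference loss is still large. It should not be read as the authors' exact procedure.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def policy_inference_loss(logits, observed_action):
    """Cross-entropy between the inferred policy of another agent
    and the action that agent was actually observed to take (assumed form)."""
    probs = softmax(logits)
    return -math.log(probs[observed_action])


def q_loss(q_pred, reward, next_q_values, gamma=0.99):
    """Squared one-step TD error for the controllable agent (assumed form)."""
    target = reward + gamma * max(next_q_values)
    return (target - q_pred) ** 2


def adaptive_weight(pi_loss, eps=1e-8):
    """Hypothetical schedule: the Q loss gets a small weight while policy
    inference is poor (large pi_loss) and a larger weight as it improves."""
    return 1.0 / math.sqrt(pi_loss + eps)


def combined_loss(pi_loss, td_loss):
    """Weighted sum of the two objectives for one training sample."""
    return adaptive_weight(pi_loss) * td_loss + pi_loss
```

Under this assumed schedule, early training is dominated by the policy-inference term, and the Q-learning term gains influence as the inferred policies become more accurate. In a real implementation the weight would typically be treated as a constant when computing gradients.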

research
11/28/2019

Multi-Agent Deep Reinforcement Learning with Adaptive Policies

We propose a novel approach to address one aspect of the non-stationarit...
research
09/02/2021

MACRPO: Multi-Agent Cooperative Recurrent Policy Optimization

This work considers the problem of learning cooperative policies in mult...
research
12/03/2018

Multi-agent Deep Reinforcement Learning with Extremely Noisy Observations

Multi-agent reinforcement learning systems aim to provide interacting ag...
research
12/01/2016

Playing Doom with SLAM-Augmented Deep Reinforcement Learning

A number of recent approaches to policy learning in 2D game domains have...
research
08/30/2021

Learning Meta Representations for Agents in Multi-Agent Reinforcement Learning

In multi-agent reinforcement learning, the behaviors that agents learn i...
research
02/18/2014

Off-Policy General Value Functions to Represent Dynamic Role Assignments in RoboCup 3D Soccer Simulation

Collecting and maintaining accurate world knowledge in a dynamic, comple...
research
10/21/2020

Deep Q-Network-based Adaptive Alert Threshold Selection Policy for Payment Fraud Systems in Retail Banking

Machine learning models have widely been used in fraud detection systems...
