SIPOMDPLite-Net: Lightweight, Self-Interested Learning and Planning in POSGs with Sparse Interactions

02/22/2022
by   Gengyu Zhang, et al.

This work introduces sIPOMDPLite-net, a deep neural network (DNN) architecture for decentralized, self-interested agent control in partially observable stochastic games (POSGs) with sparse interactions between agents. The network learns to plan in contexts modeled by the interactive partially observable Markov decision process (I-POMDP) Lite framework and uses hierarchical value iteration networks to simulate the solution of nested MDPs, which I-POMDP Lite attributes to the other agent to model its behavior and predict its intention. We train sIPOMDPLite-net with expert demonstrations on small two-agent Tiger-grid tasks, for which it accurately learns the underlying I-POMDP Lite model and near-optimal policy, and the policy continues to perform well on larger grids and real-world maps. As such, sIPOMDPLite-net shows good transfer capabilities and offers a lighter learning and planning approach for individual, self-interested agents in multiagent settings.
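For readers unfamiliar with the planning step that a value iteration network is meant to imitate, the following is a minimal sketch of plain tabular value iteration on a small deterministic grid MDP. It is not the sIPOMDPLite-net architecture and does not come from the paper. The function name `value_iteration`, the five-action move set, the reward-on-entry convention, and the parameter defaults are all assumptions made for illustration. A value iteration network, roughly speaking, replaces the inner loops below with differentiable convolution and max operations so the model can be learned end to end.

```python
def value_iteration(rewards, gamma=0.9, iterations=50):
    """Tabular value iteration on a deterministic grid MDP (illustrative only).

    rewards[r][c] is the reward for entering cell (r, c). The assumed actions
    are stay/up/down/left/right; a move off the grid leaves the agent in place.
    """
    rows, cols = len(rewards), len(rewards[0])
    moves = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]
    values = [[0.0] * cols for _ in range(rows)]

    for _ in range(iterations):
        new_values = [[0.0] * cols for _ in range(rows)]
        for r in range(rows):
            for c in range(cols):
                q_values = []
                for dr, dc in moves:
                    nr, nc = r + dr, c + dc
                    # Bumping into the boundary keeps the agent where it is.
                    if not (0 <= nr < rows and 0 <= nc < cols):
                        nr, nc = r, c
                    # Q(s, a) = reward on entry + discounted value of next cell.
                    q_values.append(rewards[nr][nc] + gamma * values[nr][nc])
                # Bellman backup: V(s) = max over actions of Q(s, a).
                new_values[r][c] = max(q_values)
        values = new_values
    return values


if __name__ == "__main__":
    # A 1x3 corridor with a rewarding cell at the right end.
    for row in value_iteration([[0.0, 0.0, 1.0]], gamma=0.5, iterations=60):
        print([round(v, 3) for v in row])
```

In the abstract's terms, a backup of this kind would be applied to the nested MDP attributed to the other agent, and the resulting values used to predict that agent's actions. How the paper arranges this hierarchically is described in the full text, not here.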

research
08/26/2020

Reputation-driven Decision-making in Networks of Stochastic Agents

This paper studies multi-agent systems that involve networks of self-int...
research
03/24/2015

Individual Planning in Agent Populations: Exploiting Anonymity and Frame-Action Hypergraphs

Interactive partially observable Markov decision processes (I-POMDP) pro...
research
03/11/2019

Deep Recurrent Q-Learning vs Deep Q-Learning on a simple Partially Observable Markov Decision Process with Minecraft

Deep Q-Learning has been successfully applied to a wide variety of tasks...
research
04/18/2013

Interactive POMDP Lite: Towards Practical Planning to Predict and Exploit Intentions for Interacting with Self-Interested Agents

A key challenge in non-cooperative multi-agent systems is that of develo...
research
09/12/2016

DESPOT: Online POMDP Planning with Regularization

The partially observable Markov decision process (POMDP) provides a prin...
research
05/14/2018

Maximizing Expected Impact in an Agent Reputation Network -- Technical Report

Many multi-agent systems (MASs) are situated in stochastic environments....
research
08/13/2019

Inverse Rational Control with Partially Observable Continuous Nonlinear Dynamics

Continuous control and planning remains a major challenge in robotics an...
