Intent-aware Multi-agent Reinforcement Learning

03/06/2018
by Siyuan Qi, et al.

This paper proposes an intent-aware multi-agent planning framework together with a learning algorithm. Under this framework, an agent plans in the goal space to maximize its expected utility, and the planning process takes into account its belief about other agents' intents. Instead of formulating the learning problem as a partially observable Markov decision process (POMDP), we propose a simple but effective linear function approximation of the utility function. This is motivated by the observation that, for humans, other people's intents influence the utility we assign to a goal. The proposed framework has several major advantages: (i) it is computationally feasible and guaranteed to converge; (ii) it can easily integrate existing intent prediction and low-level planning algorithms; and (iii) it does not suffer from sparse feedback in the action space. We evaluate our algorithm on a real-world problem that is non-episodic and in which the number of agents and goals can vary over time. The algorithm is trained in a scene in which aerial robots and humans interact, and tested in a novel scene with a different environment. Experimental results show that our algorithm achieves the best performance and that human-like behaviors emerge during the dynamic process.
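As a rough illustration of planning in the goal space with a linear utility, the sketch below scores each goal as a weighted sum of two features and picks the highest-scoring one. It is not the paper's formulation: the feature choice (a base utility per goal plus the expected number of other agents pursuing that goal), the weight values, and the names `goal_utilities` and `select_goal` are all assumptions made for this example.

```python
def goal_utilities(base_utility, intent_belief, weights):
    """Score each goal with a linear function of two illustrative features.

    base_utility:  list of per-goal utilities ignoring other agents (assumed feature).
    intent_belief: one list per other agent; intent_belief[j][g] is the believed
                   probability that agent j intends goal g.
    weights:       (w_base, w_others), the linear coefficients (hypothetical values).
    """
    w_base, w_others = weights
    utilities = []
    for g, base in enumerate(base_utility):
        # Expected number of other agents heading for goal g under the belief.
        expected_others = sum(belief[g] for belief in intent_belief)
        utilities.append(w_base * base + w_others * expected_others)
    return utilities


def select_goal(base_utility, intent_belief, weights):
    """Return the index of the goal with the highest linear utility."""
    utilities = goal_utilities(base_utility, intent_belief, weights)
    return max(range(len(utilities)), key=utilities.__getitem__)


# Two goals, one other agent believed to be heading for goal 0.
base = [1.0, 0.8]
belief = [[0.9, 0.1]]

# A negative weight on the second feature penalizes contested goals.
chosen_aware = select_goal(base, belief, (1.0, -0.5))
# With a zero weight the agent ignores other agents' intents.
chosen_blind = select_goal(base, belief, (1.0, 0.0))
```

In this toy setting the intent-aware agent moves to the less contested goal, while the agent that ignores intents keeps the goal with the higher base utility. In a learning setting the weights would be fitted from data rather than fixed by hand.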


