A Q-values Sharing Framework for Multiagent Reinforcement Learning under Budget Constraint

11/29/2020
by Changxi Zhu, et al.

In a teacher-student framework, a more experienced agent (the teacher) helps accelerate the learning of another agent (the student) by suggesting actions to take in certain states. In cooperative multiagent reinforcement learning (MARL), where agents need to cooperate with one another, a student may fail to cooperate well with others even when following the teachers' suggested actions, because the policies of all agents keep changing before convergence. When the number of times that agents can communicate with one another is limited (i.e., there is a budget constraint), an advising strategy that uses actions as advice may not be good enough. We propose a partaker-sharer advising framework (PSAF) for cooperative MARL agents learning under a budget constraint. In PSAF, each Q-learner can decide when to ask for Q-values and when to share its Q-values. We perform experiments on three typical multiagent learning problems. Evaluation results show that PSAF outperforms existing advising methods under both unlimited and limited budgets, and we analyze the impact of advising actions and sharing Q-values on agents' learning.
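The abstract says only that each Q-learner decides when to ask for Q-values and when to share them. It does not say how those decisions are made. The following is a minimal illustrative sketch of Q-value sharing under a budget, not the paper's algorithm. The class `QLearner`, the visit-count thresholds `ask_below` and `share_above`, the separate `ask_budget` and `share_budget` counters, and the rule that a budget is spent only when Q-values are actually transferred are all assumptions made for illustration.

```python
from collections import defaultdict


class QLearner:
    """Tabular Q-learner with ask/share budgets (hypothetical design)."""

    def __init__(self, n_actions, ask_budget, share_budget,
                 alpha=0.1, gamma=0.9, ask_below=3, share_above=10):
        self.n_actions = n_actions
        self.q = defaultdict(lambda: [0.0] * n_actions)  # state -> Q-values
        self.visits = defaultdict(int)                   # state -> visit count
        self.ask_budget = ask_budget
        self.share_budget = share_budget
        self.alpha = alpha
        self.gamma = gamma
        # Assumed decision rules: ask in rarely visited states,
        # share in frequently visited ones.
        self.ask_below = ask_below
        self.share_above = share_above

    def wants_to_ask(self, state):
        return self.ask_budget > 0 and self.visits[state] < self.ask_below

    def maybe_share(self, state):
        """Return a copy of this agent's Q-values for `state`, or None."""
        if self.share_budget > 0 and self.visits[state] >= self.share_above:
            self.share_budget -= 1
            return list(self.q[state])
        return None

    def act(self, state, peers):
        """Optionally adopt a peer's Q-values, then act greedily."""
        if self.wants_to_ask(state):
            for peer in peers:
                shared = peer.maybe_share(state)
                if shared is not None:
                    # Assumption: the ask budget is spent only on success.
                    self.ask_budget -= 1
                    self.q[state] = shared
                    break
        self.visits[state] += 1
        values = self.q[state]
        # Greedy choice for brevity; exploration is omitted in this sketch.
        return max(range(self.n_actions), key=values.__getitem__)

    def update(self, state, action, reward, next_state):
        """Standard one-step Q-learning update."""
        target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (target - self.q[state][action])


# Usage: an experienced agent shares Q-values for "s0" with a newcomer.
sharer = QLearner(n_actions=2, ask_budget=0, share_budget=1)
sharer.q["s0"] = [0.2, 0.8]
sharer.visits["s0"] = 10
partaker = QLearner(n_actions=2, ask_budget=1, share_budget=0)
action = partaker.act("s0", [sharer])
```

In this sketch the partaker copies the shared Q-values rather than a single suggested action, so it can keep refining them with its own updates after both budgets are exhausted.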

