Non-Stationary Policy Learning for Multi-Timescale Multi-Agent Reinforcement Learning

07/17/2023
by Patrick Emami, et al.

In multi-timescale multi-agent reinforcement learning (MARL), agents interact across different timescales. In general, policies for time-dependent behaviors, such as those induced by multiple timescales, are non-stationary. Learning non-stationary policies is challenging and typically requires sophisticated or inefficient algorithms. Motivated by the prevalence of this control problem in real-world complex systems, we introduce a simple framework for learning non-stationary policies for multi-timescale MARL. Our approach uses available information about agent timescales to define a periodic time encoding. Specifically, we show theoretically that the effects of non-stationarity introduced by multiple timescales can be learned by a periodic multi-agent policy. To learn such policies, we propose a policy gradient algorithm that parameterizes the actor and critic with phase-functioned neural networks, which provide an inductive bias for periodicity. We validate the framework's ability to learn multi-timescale policies on a gridworld environment and a building energy management environment.
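As a rough illustration of the idea of a periodic time encoding and phase-dependent network weights, the following Python sketch may help. It is not the authors' implementation: the abstract does not give the exact encoding or phase function, so everything here is an assumption made for illustration. In particular, it assumes that agent i acts every `timescales[i]` base steps, that the joint period is the least common multiple of the timescales, and that weights are blended linearly between control points. All function names are hypothetical.

```python
import math
from functools import reduce
from typing import List, Sequence, Tuple


def common_period(timescales: Sequence[int]) -> int:
    """Least common multiple of the agents' timescales (assumed joint period)."""
    return reduce(lambda a, b: a * b // math.gcd(a, b), timescales, 1)


def phase(t: int, period: int) -> float:
    """Map a global step t to a phase in [0, 1)."""
    return (t % period) / period


def periodic_encoding(t: int, period: int) -> Tuple[float, float]:
    """Encode the phase as (sin, cos) so the last step of a period is close to the first."""
    angle = 2.0 * math.pi * phase(t, period)
    return (math.sin(angle), math.cos(angle))


def acting_agents(t: int, timescales: Sequence[int]) -> List[int]:
    """Agents that pick a new action at step t (assumption: agent i acts when t % timescales[i] == 0)."""
    return [i for i, k in enumerate(timescales) if t % k == 0]


def blend_weights(control_weights: Sequence[Sequence[float]], p: float) -> List[float]:
    """Cyclic linear interpolation between control weight vectors, indexed by phase p in [0, 1).

    This stands in for a phase function: the weights used by a layer depend on the phase.
    """
    n = len(control_weights)
    x = p * n
    k0 = int(x) % n          # control point at or before the phase
    k1 = (k0 + 1) % n        # next control point, wrapping around
    a = x - int(x)           # interpolation coefficient in [0, 1)
    return [(1.0 - a) * w0 + a * w1
            for w0, w1 in zip(control_weights[k0], control_weights[k1])]


if __name__ == "__main__":
    timescales = [1, 2, 3]               # hypothetical: three agents
    period = common_period(timescales)   # joint schedule repeats after this many steps
    for t in range(period + 1):
        print(t, acting_agents(t, timescales), periodic_encoding(t, period))
```

With these assumed timescales, the set of acting agents and the encoding at step `t` match those at step `t + period`, which is the sense in which a policy conditioned on the phase can be periodic. Feeding the phase into `blend_weights` gives layer weights that vary smoothly over one period and wrap around at its end.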
