Risk-Aware Distributed Multi-Agent Reinforcement Learning

04/04/2023
by Abdullah Al Maruf, et al.

Autonomous cyber and cyber-physical systems need to perform decision-making, learning, and control in unknown environments. Such decision-making can be sensitive to multiple factors, including modeling errors, changes in costs, and the impact of events in the tails of probability distributions. Although multi-agent reinforcement learning (MARL) provides a framework for learning behaviors through repeated interactions with the environment, minimizing an average cost alone is not adequate to overcome these challenges. In this paper, we develop a distributed MARL approach to solve decision-making problems in unknown environments by learning risk-aware actions. We use the conditional value-at-risk (CVaR) to characterize the cost function being minimized, and define a Bellman operator to characterize the value function associated with a given state-action pair. We prove that this operator satisfies a contraction property, and that it converges to the optimal value function. We then propose a distributed MARL algorithm called the CVaR QD-Learning algorithm, and establish that the value functions of individual agents reach consensus. We identify several challenges that arise in the implementation of the CVaR QD-Learning algorithm, and present solutions to overcome them. We evaluate the CVaR QD-Learning algorithm through simulations, and demonstrate the effect of a risk parameter on value functions at consensus.
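To make the risk measure concrete, the following is a minimal sketch of how CVaR could be estimated from a finite set of sampled costs. It is not the paper's algorithm. It assumes the Rockafellar-Uryasev form CVaR_y(Z) = min_s { s + E[(Z - s)^+] / y }, where the risk parameter `y` in (0, 1] is the fraction of worst-case outcomes considered; the abstract does not state which convention the paper uses. The function name `empirical_cvar` and its arguments are hypothetical.

```python
def empirical_cvar(costs, y):
    """Estimate CVaR of sampled costs at tail fraction y (assumed convention).

    Uses the minimization form s + E[(Z - s)^+] / y. For an empirical
    distribution the objective is piecewise linear and convex in s, with
    breakpoints at the samples, so it suffices to search over the samples.
    """
    if not 0.0 < y <= 1.0:
        raise ValueError("y must lie in (0, 1]")
    z = [float(c) for c in costs]
    if not z:
        raise ValueError("costs must be non-empty")
    n = len(z)

    def objective(s):
        # s plus the scaled mean excess of the costs above s
        return s + sum(max(c - s, 0.0) for c in z) / (n * y)

    return min(objective(s) for s in z)


if __name__ == "__main__":
    sampled_costs = [1.0, 2.0, 3.0, 4.0]
    for y in (1.0, 0.5, 0.25):
        print(y, empirical_cvar(sampled_costs, y))
```

Under this convention, `y = 1` recovers the ordinary average cost (the risk-neutral objective of standard MARL), while smaller values of `y` weight the estimate toward the highest-cost outcomes, approaching the worst sampled cost as `y` shrinks.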


research
04/30/2012

QD-Learning: A Collaborative Distributed Strategy for Multi-Agent Reinforcement Learning Through Consensus + Innovations

The paper considers a class of multi-agent Markov decision processes (MD...
research
05/11/2020

Delay-Aware Multi-Agent Reinforcement Learning for Cooperative and Competitive Environments

Action and observation delays exist prevalently in the real-world cyber-...
research
01/15/2020

Model-based Multi-Agent Reinforcement Learning with Cooperative Prioritized Sweeping

We present a new model-based reinforcement learning algorithm, Cooperati...
research
11/05/2019

Robo-advising: Learning Investor's Risk Preferences via Portfolio Choices

We introduce a reinforcement learning framework for retail robo-advising...
research
03/29/2021

Reinforcement Learning Beyond Expectation

The inputs and preferences of human users are important considerations i...
research
12/11/2021

Federated Reinforcement Learning at the Edge

Modern cyber-physical architectures use data collected from systems at d...
research
01/20/2022

Statistical Learning for Individualized Asset Allocation

We establish a high-dimensional statistical learning framework for indiv...
