Robust Restless Bandits: Tackling Interval Uncertainty with Deep Reinforcement Learning

07/04/2021
by   Jackson A. Killian, et al.
20

We introduce Robust Restless Bandits, a challenging generalization of restless multi-arm bandits (RMAB). RMABs have been widely studied for intervention planning with limited resources. However, most works make the unrealistic assumption that the transition dynamics are known perfectly, restricting the applicability of existing methods to real-world scenarios. To make RMABs more useful in settings with uncertain dynamics: (i) We introduce the Robust RMAB problem and develop solutions for a minimax regret objective when transitions are given by interval uncertainties; (ii) We develop a double oracle algorithm for solving Robust RMABs and demonstrate its effectiveness on three experimental domains; (iii) To enable our double oracle approach, we introduce RMABPPO, a novel deep reinforcement learning algorithm for solving RMABs. RMABPPO hinges on learning an auxiliary "λ-network" that allows each arm's learning to decouple, greatly reducing sample complexity required for training; (iv) Under minimax regret, the adversary in the double oracle approach is notoriously difficult to implement due to non-stationarity. To address this, we formulate the adversary oracle as a multi-agent reinforcement learning problem and solve it with a multi-agent extension of RMABPPO, which may be of independent interest as the first known algorithm for this setting. Code is available at https://github.com/killian-34/RobustRMAB.

READ FULL TEXT
research
10/07/2020

Instance-Dependent Complexity of Contextual Bandits and Reinforcement Learning: A Disagreement-Based Perspective

In the classical multi-armed bandit problem, instance-dependent algorith...
research
05/30/2022

Optimistic Whittle Index Policy: Online Learning for Restless Bandits

Restless multi-armed bandits (RMABs) extend multi-armed bandits to allow...
research
08/11/2019

A Review of Cooperative Multi-Agent Deep Reinforcement Learning

Deep Reinforcement Learning has made significant progress in multi-agent...
research
07/30/2023

Robust Multi-Agent Reinforcement Learning with State Uncertainty

In real-world multi-agent reinforcement learning (MARL) applications, ag...
research
07/05/2021

Ensemble and Auxiliary Tasks for Data-Efficient Deep Reinforcement Learning

Ensemble and auxiliary tasks are both well known to improve the performa...
research
01/11/2021

Solving Common-Payoff Games with Approximate Policy Iteration

For artificially intelligent learning systems to have widespread applica...
research
12/01/2016

Robust Optimization for Tree-Structured Stochastic Network Design

Stochastic network design is a general framework for optimizing network ...

Please sign up or login with your details

Forgot password? Click here to reset