Efficient Algorithms for Finite Horizon and Streaming Restless Multi-Armed Bandit Problems

03/08/2021
by   Aditya Mate, et al.

Restless Multi-Armed Bandits (RMABs) are widely used to model limited-resource allocation problems, and have recently been employed for health monitoring and intervention planning. However, existing approaches fail to account for the arrival of new patients and the departure of enrolled patients from a treatment program. To address this challenge, we formulate the streaming bandit (S-RMAB) framework, a generalization of RMABs in which heterogeneous arms arrive and depart under possibly random streams. We propose a new and scalable approach to computing index-based solutions. We begin by proving that index values decrease for short residual lifetimes, a phenomenon we call index decay. We then provide algorithms designed to capture index decay without solving the costly finite-horizon problem, thereby lowering the computational complexity compared to existing methods. We evaluate our approach via simulations on real-world data from a tuberculosis intervention planning task, as well as on multiple synthetic domains. Our algorithms achieve a more than 150x speed-up over existing methods without loss in performance, and these findings are robust across domains.
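To make the notion of index decay concrete, the following is a minimal sketch (not the paper's algorithm) of computing a finite-horizon Whittle index for a single two-state arm by binary search on the passive subsidy. The transition probabilities, the reward-equals-state convention, and the function names are illustrative assumptions; the point is only that, for a suitable arm, the index shrinks as the residual lifetime gets shorter.

```python
import numpy as np

# Hypothetical two-state arm: state 1 = "adhering" (reward 1), state 0 = "not
# adhering" (reward 0). Rows are current state; columns are next state.
# These transition matrices are illustrative, not taken from the paper.
P = {
    0: np.array([[0.9, 0.1],    # passive action (no intervention)
                 [0.4, 0.6]]),
    1: np.array([[0.5, 0.5],    # active action (intervention)
                 [0.1, 0.9]]),
}
REWARD = np.arange(2)           # reward(s) = s

def q_values(m, horizon):
    """Finite-horizon Q-values when the passive action earns a subsidy m."""
    V = np.zeros(2)
    Q_passive = Q_active = np.zeros(2)
    for _ in range(horizon):
        Q_passive = m + REWARD + P[0] @ V   # subsidy + reward + future value
        Q_active = REWARD + P[1] @ V
        V = np.maximum(Q_passive, Q_active)
    return Q_passive, Q_active

def whittle_index(state, horizon, lo=-2.0, hi=2.0, tol=1e-6):
    """Smallest subsidy m that makes the passive action optimal in `state`."""
    while hi - lo > tol:
        m = (lo + hi) / 2
        Q_passive, Q_active = q_values(m, horizon)
        if Q_passive[state] >= Q_active[state]:
            hi = m
        else:
            lo = m
    return (lo + hi) / 2
```

With these illustrative transitions, the index at state 0 grows with the residual horizon and collapses to 0 at a residual lifetime of one step (with one step left, the intervention's improved transitions can never pay off), which is exactly the decay phenomenon that makes short-lived arms low priority.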



Related research:

- 04/30/2023: Indexability of Finite State Restless Multi-Armed Bandit and Rollout Policy. "We consider finite state restless multi-armed bandit problem. The decisi..."
- 11/18/2015: Regret Analysis of the Finite-Horizon Gittins Index Strategy for Multi-Armed Bandits. "I analyse the frequentist regret of the famous Gittins index strategy fo..."
- 07/23/2021: Finite-time Analysis of Globally Nonstationary Multi-Armed Bandits. "We consider nonstationary multi-armed bandit problems where the model pa..."
- 03/01/2023: Fairness for Workers Who Pull the Arms: An Index Based Policy for Allocation of Restless Bandit Tasks. "Motivated by applications such as machine repair, project monitoring, an..."
- 05/22/2023: Limited Resource Allocation in a Non-Markovian World: The Case of Maternal and Child Healthcare. "The success of many healthcare programs depends on participants' adheren..."
- 10/01/2022: Speed Up the Cold-Start Learning in Two-Sided Bandits with Many Arms. "Multi-armed bandit (MAB) algorithms are efficient approaches to reduce t..."
- 05/17/2021: Learn to Intervene: An Adaptive Learning Policy for Restless Bandits in Application to Preventive Healthcare. "In many public health settings, it is important for patients to adhere t..."
