NeurWIN: Neural Whittle Index Network For Restless Bandits Via Deep RL

10/05/2021
by Khaled Nakhleh, et al.

The Whittle index policy is a powerful tool for obtaining asymptotically optimal solutions to the notoriously intractable restless bandit problem. However, finding the Whittle indices remains difficult for many practical restless bandits with convoluted transition kernels. This paper proposes NeurWIN, a neural Whittle index network that seeks to learn the Whittle indices for any restless bandit by leveraging mathematical properties of the Whittle indices. We show that a neural network that produces the Whittle index is also one that produces the optimal control for a set of Markov decision problems. This property motivates using deep reinforcement learning to train NeurWIN. We demonstrate the utility of NeurWIN by evaluating its performance on three recently studied restless bandit problems. Our experimental results show that NeurWIN performs significantly better than other RL algorithms.
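
For readers unfamiliar with how a Whittle index policy turns per-arm indices into a decision, the following is a minimal sketch, not the paper's implementation. It assumes each arm exposes an index function mapping its current state to a scalar (in NeurWIN this role would be played by a trained network; here a toy function stands in), and that a fixed number of arms can be activated per step. The names `whittle_index_policy`, `index_fns`, and `budget` are hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Sequence


def whittle_index_policy(
    states: Sequence[float],
    index_fns: Sequence[Callable[[float], float]],
    budget: int,
) -> List[int]:
    """Activate the `budget` arms whose current states have the largest indices.

    states    -- current state of each arm (scalar states assumed for simplicity)
    index_fns -- one index function per arm (hypothetical stand-ins for
                 learned index networks)
    budget    -- number of arms that may be activated in this time step
    Returns a 0/1 action per arm (1 = activate, 0 = stay passive).
    """
    # Evaluate each arm's index at its current state.
    indices = [f(s) for f, s in zip(index_fns, states)]
    # Rank arms by index, largest first.
    ranked = sorted(range(len(indices)), key=lambda i: indices[i], reverse=True)
    active = set(ranked[:budget])
    return [1 if i in active else 0 for i in range(len(indices))]


# Toy usage: three arms whose (assumed) index equals the state itself.
toy_index = lambda s: s
actions = whittle_index_policy([0.2, 0.9, 0.5], [toy_index] * 3, budget=2)
```

The ranking step is the whole policy: each arm's index depends only on that arm's own state, so the cost of a decision grows with the number of arms rather than with the size of the joint state space.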

research
06/10/2019

On some new neighbourhood degree based indices

In this paper, four novel topological indices named as neighbourhood ver...
research
07/25/2020

Simulation Based Algorithms for Markov Decision Processes and Multi-Action Restless Bandits

We consider multi-dimensional Markov decision processes and formulate a ...
research
04/07/2023

Full Gradient Deep Reinforcement Learning for Average-Reward Criterion

We extend the provably convergent Full Gradient DQN algorithm for discou...
research
01/31/2019

Tsallis Reinforcement Learning: A Unified Framework for Maximum Entropy Reinforcement Learning

In this paper, we present a new class of Markov decision processes (MDPs...
research
03/10/2022

Computing Whittle (and Gittins) Index in Subcubic Time

Whittle index is a generalization of Gittins index that provides very ef...
research
04/29/2020

Whittle index based Q-learning for restless bandits with average reward

A novel reinforcement learning algorithm is introduced for multiarmed re...
research
06/01/2022

The statistical nature of h-index of a network node

Evaluating the importance of a network node is a crucial task in network...
