Exponential Convergence Rate for the Asymptotic Optimality of Whittle Index Policy

12/16/2020
by Nicolas Gast, et al.

We evaluate the performance of the Whittle index policy for restless Markovian bandits as the number of bandits grows. It is proven in [30] that this performance is asymptotically optimal if the bandits are indexable and the associated deterministic system has a global attractor fixed point. In this paper we show that, under the same conditions, the convergence rate is exponential in the number of bandits, unless the fixed point is singular (to be defined later). Our proof is based on the nature of the deterministic equation governing the stochastic system: we show that it is a piecewise affine continuous dynamical system inside the simplex of the empirical measure of the bandits. Using simulations and numerical solvers, we also investigate cases in which the conditions of the exponential rate theorem are violated, notably when attracting limit cycles appear or when the fixed point is singular. We illustrate our theorem on a Markovian fading channel model that has been well studied in the literature. Finally, we extend our results from the synchronous model to the asynchronous model.
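To make the phrase "piecewise affine continuous dynamical system inside the simplex" concrete, here is a minimal Python sketch of one step of a deterministic (mean-field) map under a priority policy. It is an illustration under stated assumptions, not the paper's construction: the function name `priority_step`, the transition matrices `P0` (passive) and `P1` (active), the activation fraction `alpha`, and the state ordering `order` are all hypothetical, and `order` is simply assumed to list states by decreasing index rather than being computed from Whittle indices.

```python
import numpy as np

def priority_step(m, P0, P1, alpha, order):
    """One step of an assumed mean-field map under a priority policy.

    m     : point in the simplex (fraction of bandits in each state)
    P0/P1 : hypothetical passive/active transition matrices (rows sum to 1)
    alpha : fraction of bandits activated at each step
    order : states sorted by decreasing priority (assumed index ordering)
    """
    m = np.asarray(m, dtype=float)
    active = np.zeros_like(m)
    budget = alpha
    # Fill the activation budget greedily, highest-priority state first.
    for s in order:
        active[s] = min(m[s], budget)
        budget -= active[s]
    # Activated mass moves with P1, the remaining mass with P0.
    return active @ P1 + (m - active) @ P0

# Hypothetical 3-state model, chosen only for illustration.
P0 = np.array([[0.7, 0.2, 0.1],
               [0.3, 0.5, 0.2],
               [0.1, 0.3, 0.6]])
P1 = np.array([[0.2, 0.5, 0.3],
               [0.1, 0.3, 0.6],
               [0.4, 0.4, 0.2]])
order = [2, 1, 0]
alpha = 0.4

# Iterate the map from the uniform distribution. Whether the iterates
# settle on a fixed point or a cycle depends on the model.
m = np.full(3, 1.0 / 3.0)
for _ in range(200):
    m = priority_step(m, P0, P1, alpha, order)
```

Inside each region where the same state absorbs the last of the budget, `active` is an affine function of `m`, so the map is affine there. The `min` makes it continuous across region boundaries.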
