Detecting an Odd Restless Markov Arm with a Trembling Hand

05/13/2020
by PN Karthik, et al.

In this paper, we consider a multi-armed bandit in which each arm is a Markov process evolving on a finite state space. The state space is common across the arms, and the arms are independent of each other. The transition probability matrix of one of the arms (the odd arm) is different from the common transition probability matrix of all the other arms. A decision maker, who knows these transition probability matrices, wishes to identify the odd arm as quickly as possible while keeping the probability of decision error small. To do so, the decision maker collects observations by pulling the arms sequentially, one arm at each discrete time instant. However, the decision maker has a trembling hand: with a small probability, the arm that is actually pulled at any given time differs from the arm the decision maker intended to pull. The observation at any given time consists of the arm that is actually pulled and its current state. The Markov processes of the unobserved arms continue to evolve, which makes the arms restless. For this setting, we derive the first known asymptotic lower bound on the expected stopping time, where the asymptotics are in the regime of vanishing error probability. The continued evolution of each arm adds a new dimension to the problem, leading to a family of Markov decision problems (MDPs) on a countable state space. We then stitch together certain parameterised solutions to these MDPs and obtain a sequence of strategies whose expected stopping times come arbitrarily close to the lower bound in the regime of vanishing error probability. Prior works dealt with arms that are independent and identically distributed across time and with rested Markov arms, whereas our work deals with restless Markov arms.
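The observation model described in the abstract can be illustrated with a minimal simulation sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the function names `step` and `simulate`, the round-robin rule used as the intended pull, the uniformly random initial states, and the specific trembling-hand rule (with probability `eta`, the pulled arm is drawn uniformly from the arms other than the intended one). The sketch only generates observations; it does not implement the paper's stopping rule or its MDP-based strategies.

```python
import random


def step(state, P, rng):
    """Sample the next state of a Markov chain with transition matrix P."""
    return rng.choices(range(len(P[state])), weights=P[state])[0]


def simulate(P_odd, P_common, num_arms, odd_arm, eta, horizon, seed=0):
    """Generate (pulled_arm, observed_state) pairs for restless arms.

    Assumptions (hypothetical, for illustration only):
    - the intended arm follows a round-robin rule;
    - with probability eta the hand trembles and a different arm,
      chosen uniformly among the others, is pulled instead;
    - initial states are drawn uniformly at random.
    """
    rng = random.Random(seed)
    num_states = len(P_common)
    states = [rng.randrange(num_states) for _ in range(num_arms)]
    observations = []
    for t in range(horizon):
        intended = t % num_arms  # placeholder policy, not the paper's
        if rng.random() < eta:
            # Trembling hand: the pulled arm differs from the intended one.
            others = [a for a in range(num_arms) if a != intended]
            pulled = rng.choice(others)
        else:
            pulled = intended
        # Restless arms: every arm evolves, whether or not it is observed.
        states = [
            step(s, P_odd if a == odd_arm else P_common, rng)
            for a, s in enumerate(states)
        ]
        # Only the pulled arm's index and its current state are observed.
        observations.append((pulled, states[pulled]))
    return observations
```

Calling, for example, `simulate(P_odd, P_common, num_arms=3, odd_arm=1, eta=0.1, horizon=100)` with two row-stochastic matrices given as lists of lists returns 100 observation pairs. Setting `eta=0.0` recovers the case without a trembling hand.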


