Effective Dimension in Bandit Problems under Censorship

02/14/2023
by Gauthier Guinet, et al.

In this paper, we study both multi-armed and contextual bandit problems in censored environments. Our goal is to estimate the performance loss due to censorship in the context of classical algorithms designed for uncensored environments. Our main contributions include the introduction of a broad class of censorship models and their analysis in terms of the effective dimension of the problem – a natural measure of its underlying statistical complexity and main driver of the regret bound. In particular, the effective dimension allows us to maintain the structure of the original problem at first order, while embedding it in a bigger space, and thus naturally leads to results analogous to uncensored settings. Our analysis involves a continuous generalization of the Elliptical Potential Inequality, which we believe is of independent interest. We also discover an interesting property of decision-making under censorship: a transient phase during which initial misspecification of censorship is self-corrected at an extra cost, followed by a stationary phase that reflects the inherent slowdown of learning governed by the effective dimension. Our results are useful for applications of sequential decision-making models where the feedback received depends on strategic uncertainty (e.g., agents' willingness to follow a recommendation) and/or random uncertainty (e.g., loss or delay in arrival of information).
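As a rough illustration of the setting described above, the following minimal sketch runs a classical UCB1 index policy on a multi-armed bandit whose feedback is censored. It is not the paper's censorship model or analysis. It assumes the simplest possible censorship: the reward of each pull is revealed independently with a fixed per-arm probability. The function `censored_ucb` and its parameters `means`, `observe_prob`, and `horizon` are hypothetical names introduced only for this example.

```python
import math
import random


def censored_ucb(means, observe_prob, horizon, seed=0):
    """UCB1 where a pull's reward is revealed only with probability observe_prob[arm].

    Assumption (not from the paper): censorship is an independent Bernoulli
    event per pull, and every entry of observe_prob is strictly positive.
    """
    rng = random.Random(seed)
    k = len(means)
    pulls = [0] * k          # number of times each arm was played
    observed = [0] * k       # number of pulls whose reward was revealed
    reward_sum = [0.0] * k   # sum of revealed rewards per arm
    regret = 0.0
    best = max(means)

    for t in range(1, horizon + 1):
        # Arms with no revealed feedback yet are played first.
        unseen = [a for a in range(k) if observed[a] == 0]
        if unseen:
            arm = unseen[(t - 1) % len(unseen)]
        else:
            # The index uses only uncensored observations, so the confidence
            # width shrinks with observed[a], not with pulls[a].
            arm = max(
                range(k),
                key=lambda a: reward_sum[a] / observed[a]
                + math.sqrt(2.0 * math.log(t) / observed[a]),
            )
        pulls[arm] += 1
        regret += best - means[arm]
        reward = 1.0 if rng.random() < means[arm] else 0.0
        if rng.random() < observe_prob[arm]:  # feedback survives censorship
            observed[arm] += 1
            reward_sum[arm] += reward
    return pulls, observed, regret


if __name__ == "__main__":
    pulls, observed, regret = censored_ucb([0.3, 0.7], [0.5, 0.5], horizon=2000)
    print(pulls, observed, round(regret, 1))
```

Comparing a run with `observe_prob` set to all ones against a run with smaller values gives an informal sense of how censorship slows learning in this toy model. The paper's effective-dimension analysis covers a broader class of censorship models than this sketch.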


