High dimensional stochastic linear contextual bandit with missing covariates

07/22/2022
by Byoungwook Jang, et al.

Recent work on bandit problems has adopted lasso convergence theory in the sequential decision-making setting. Even with fully observed contexts, there are technical challenges that hinder the application of existing lasso convergence theory: 1) proving the restricted eigenvalue condition under conditionally sub-Gaussian noise and 2) accounting for the dependence between the context variables and the chosen actions. This paper studies the effect of missing covariates on regret for stochastic linear bandit algorithms. Our work provides a high-probability upper bound on the regret incurred by the proposed algorithm in terms of covariate sampling probabilities, showing that the regret degrades due to missingness by at most ζ_min^2, where ζ_min is the minimum probability of observing covariates in the context vector. We illustrate our algorithm on a practical application: experimental design for collecting gene expression data by sequential selection of class-discriminating DNA probes.
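To make the setting concrete, the sketch below is a minimal, generic lasso contextual bandit in which missing covariates are handled by inverse-probability weighting with per-coordinate observation probabilities, the quantity ζ plays in the abstract's regret bound. The greedy arm selection, the penalty schedule, and all names (d, K, T, zeta, beta_hat, etc.) are illustrative assumptions, not the authors' exact algorithm.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)

d, K, T = 50, 5, 300                       # ambient dimension, number of arms, horizon
s = 5                                      # sparsity of the true parameter
zeta = rng.uniform(0.6, 1.0, size=d)       # per-coordinate observation probabilities (ζ)

# sparse ground-truth parameter for simulation only
beta = np.zeros(d)
beta[rng.choice(d, s, replace=False)] = rng.normal(size=s)

X_hist, y_hist = [], []
beta_hat = np.zeros(d)

for t in range(T):
    contexts = rng.normal(size=(K, d))     # full contexts (never fully observed)
    mask = rng.random((K, d)) < zeta       # which covariates are observed this round

    # inverse-probability-weighted surrogate: E[mask * x / zeta] = x,
    # so the observed, rescaled context is unbiased for the full context
    contexts_ipw = np.where(mask, contexts / zeta, 0.0)

    # greedy arm choice under the current lasso estimate
    arm = int(np.argmax(contexts_ipw @ beta_hat))
    reward = contexts[arm] @ beta + 0.1 * rng.normal()

    X_hist.append(contexts_ipw[arm])
    y_hist.append(reward)

    # periodic lasso refit with a slowly shrinking penalty
    if t >= 10 and t % 10 == 0:
        lam = 0.1 * np.sqrt(np.log(d) / len(y_hist))
        model = Lasso(alpha=lam, fit_intercept=False, max_iter=5000)
        model.fit(np.vstack(X_hist), np.array(y_hist))
        beta_hat = model.coef_
```

Smaller observation probabilities inflate the variance of the IPW surrogates, which is the mechanism by which regret degrades as ζ_min shrinks.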


