High-dimensional Contextual Bandit Problem without Sparsity

06/19/2023
by Junpei Komiyama, et al.

In this research, we investigate the high-dimensional linear contextual bandit problem in which the number of features p is greater than the budget T, or may even be infinite. Unlike most previous work in this field, we do not impose sparsity on the regression coefficients. Instead, we rely on recent findings on overparameterized models, which enable us to analyze the performance of the minimum-norm interpolating estimator when data distributions have small effective ranks. We propose an explore-then-commit (EtC) algorithm to address this problem and examine its performance. Through our analysis, we derive the optimal rate of the EtC algorithm in terms of T and show that this rate can be achieved by balancing exploration and exploitation. Moreover, we introduce an adaptive explore-then-commit (AEtC) algorithm that adaptively finds the optimal balance. We assess the performance of the proposed algorithms through a series of simulations.
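As a rough illustration of the idea summarized above, the following Python sketch runs an explore-then-commit loop with a minimum-norm interpolating estimator per arm. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`min_norm_interpolator`, `run_etc`), the round-robin exploration schedule, the Gaussian contexts with a 1/j^2 covariance spectrum (chosen only to mimic a small effective rank), the per-arm parameters, and the noise level are all hypothetical choices made for this example.

```python
import numpy as np


def min_norm_interpolator(X, y):
    """Least-norm theta solving X @ theta = y (exact fit when X has full row rank)."""
    # The Moore-Penrose pseudoinverse selects the minimum-norm solution.
    return np.linalg.pinv(X) @ y


def run_etc(T, K, p, n_explore, rng):
    """Explore-then-commit on a simulated linear contextual bandit.

    Assumed (hypothetical) environment: K arms, each with its own parameter
    vector, and independent Gaussian contexts whose covariance eigenvalues
    decay as 1/j^2.
    """
    eigvals = 1.0 / np.arange(1, p + 1) ** 2   # assumed decaying spectrum
    theta = rng.normal(size=(K, p))            # assumed true parameters
    X_hist = [[] for _ in range(K)]
    y_hist = [[] for _ in range(K)]
    theta_hat = None
    regret = 0.0

    for t in range(T):
        # One context vector per arm in every round.
        x = rng.normal(size=(K, p)) * np.sqrt(eigvals)
        means = np.einsum("kp,kp->k", x, theta)
        exploring = t < K * n_explore

        if exploring:
            arm = t % K                        # round-robin exploration
        else:
            if theta_hat is None:
                # Fit once at the end of exploration, then commit.
                theta_hat = np.stack([
                    min_norm_interpolator(np.array(X_hist[k]), np.array(y_hist[k]))
                    for k in range(K)
                ])
            arm = int(np.argmax(np.einsum("kp,kp->k", x, theta_hat)))

        reward = means[arm] + 0.1 * rng.normal()
        if exploring:
            X_hist[arm].append(x[arm])
            y_hist[arm].append(reward)
        regret += means.max() - means[arm]

    return regret, theta_hat, X_hist, y_hist
```

For example, `run_etc(T=100, K=2, p=200, n_explore=20, rng=np.random.default_rng(0))` uses p > T, so each arm's estimator fits its exploration data exactly. Varying `n_explore` gives a hands-on feel for the exploration and exploitation trade-off that the abstract refers to; the sketch makes no claim about the rate-optimal choice.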

research
11/11/2022

Thompson Sampling for High-Dimensional Sparse Linear Contextual Bandits

We consider the stochastic linear contextual bandit problem with high-di...
research
04/28/2017

Exploiting the Natural Exploration In Contextual Bandits

The contextual bandit literature has traditionally focused on algorithms...
research
03/20/2019

Contextual Bandits with Random Projection

Contextual bandits with linear payoffs, which are also known as linear b...
research
02/03/2019

A New Algorithm for Non-stationary Contextual Bandits: Efficient, Optimal, and Parameter-free

We propose the first contextual bandit algorithm that is parameter-free,...
research
09/04/2020

Nearly Dimension-Independent Sparse Linear Bandit over Small Action Spaces via Best Subset Selection

We consider the stochastic contextual bandit problem under the high dime...
research
01/10/2018

A Smoothed Analysis of the Greedy Algorithm for the Linear Contextual Bandit Problem

Bandit learning is characterized by the tension between long-term explor...
research
09/17/2022

Advertising Media and Target Audience Optimization via High-dimensional Bandits

We present a data-driven algorithm that advertisers can use to automate ...
