Infinite Action Contextual Bandits with Reusable Data Exhaust

02/16/2023
by Mark Rucker, et al.

For infinite-action contextual bandits, combining smoothed regret with reduction to regression yields state-of-the-art online statistical performance at a computational cost independent of the action set. Unfortunately, the resulting data exhaust does not have well-defined importance weights, which frustrates downstream data-science processes such as offline model selection. In this paper we describe an online algorithm with an equivalent smoothed-regret guarantee that generates well-defined importance weights. In exchange, the online computational cost increases, but only to order smoothness, i.e., it remains independent of the action set. This removes a key obstacle to the adoption of smoothed regret in production scenarios.
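To make the stakes concrete, below is a minimal sketch of the downstream task that well-defined importance weights enable: off-policy evaluation and model selection via the standard inverse propensity scoring (IPS) estimator. The tuple layout of the data exhaust and the names ips_estimate, select_model, and target_density are illustrative assumptions, not the paper's interface, and the paper's own estimator may differ.

```python
import numpy as np

def ips_estimate(logs, target_density):
    """Inverse-propensity-scored estimate of a candidate policy's value.

    `logs` is the data exhaust: tuples (context, action, reward, propensity),
    where `propensity` is the logging policy's density at the played action;
    this is exactly the quantity that must be well-defined for offline
    evaluation to work. `target_density(context, action)` returns the
    candidate policy's density at `action` given `context`.
    """
    weights = np.array([target_density(x, a) / p for x, a, _, p in logs])
    rewards = np.array([r for _, _, r, _ in logs])
    return float(np.mean(weights * rewards))

def select_model(logs, candidates):
    """Offline model selection: keep the candidate with the highest
    estimated value on the logged exhaust."""
    return max(candidates, key=lambda pi: ips_estimate(logs, pi))
```

When the logging policy's propensities are not well-defined, the denominator in the importance weight above does not exist; that is exactly the obstacle the paper removes.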


