Graph Feedback via Reduction to Regression

02/17/2023
by Paul Mineiro, et al.

When feedback is partial, leveraging all available information is critical to minimizing data requirements. Graph feedback, which interpolates between the supervised and bandit regimes, has been extensively studied, but the mature theory is grounded in impractical algorithms. We present and analyze an approach to contextual bandits with graph feedback based on reduction to regression. The resulting algorithms are practical and achieve known minimax rates.
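
To illustrate how graph feedback interpolates between the bandit and supervised regimes, here is a minimal Python sketch. It is not taken from the paper and does not implement its algorithm. It assumes the standard formulation in which playing an action reveals the loss of every action it points to in a feedback graph. The function name `observed_losses`, the three example graphs, and the loss values are all hypothetical.

```python
def observed_losses(graph, action, losses):
    """Return {a: loss} for each action a revealed by playing `action`.

    graph[i][j] == 1 means that playing action i reveals the loss of action j.
    (Hypothetical helper for illustration only.)
    """
    return {a: losses[a] for a, edge in enumerate(graph[action]) if edge}


# Illustrative losses for three actions (made-up numbers).
losses = [0.2, 0.7, 0.5]

# Bandit regime: each action reveals only its own loss (self-loops only).
bandit_graph = [[1, 0, 0],
                [0, 1, 0],
                [0, 0, 1]]

# Supervised regime: every action reveals every loss (complete graph).
supervised_graph = [[1, 1, 1],
                    [1, 1, 1],
                    [1, 1, 1]]

# Intermediate regime: each action also reveals its right-hand neighbour.
chain_graph = [[1, 1, 0],
               [0, 1, 1],
               [0, 0, 1]]

bandit_view = observed_losses(bandit_graph, 1, losses)
supervised_view = observed_losses(supervised_graph, 1, losses)
chain_view = observed_losses(chain_graph, 0, losses)
```

With the identity graph the learner sees one loss per round, with the complete graph it sees all of them, and any graph in between reveals a subset determined by the played action.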
