Selectively Contextual Bandits

05/09/2022
by Claudia Roberts, et al.

Contextual bandits are widely used in industrial personalization systems. These online learning frameworks learn a treatment assignment policy in the presence of treatment effects that vary with the observed contextual features of the users. While personalization creates a rich user experience that reflects individual interests, a shared experience across a community also has benefits, enabling participation in the zeitgeist. Such benefits emerge through network effects and are not captured by the regret metrics typically used to evaluate bandits. To balance these needs, we propose a new online learning algorithm that preserves the benefits of personalization while increasing the commonality in treatments across users. Our approach selectively interpolates between a contextual bandit algorithm and a context-free multi-armed bandit, leveraging the contextual information for a treatment decision only if it promises significant gains. Apart from helping users of personalization systems balance their experience between the individualized and the shared, simplifying the treatment assignment policy by making it selectively reliant on the context can improve the rate of learning in some cases. We evaluate our approach in a classification setting using public datasets and show the benefits of the hybrid policy.
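The abstract's core idea — fall back to a shared, context-free arm unless the contextual model predicts a sufficiently large gain — can be sketched as follows. This is an illustrative reconstruction, not the paper's exact algorithm: the per-arm ridge-regression estimates, the running-mean context-free component, and the `gain_threshold` decision rule are all assumptions made for the sketch.

```python
import numpy as np

class SelectiveBandit:
    """Hybrid policy that interpolates between a context-free multi-armed
    bandit and a contextual bandit, using the context only when the
    predicted gain over the shared arm exceeds a threshold.
    (Illustrative sketch; the paper's exact estimators and rule may differ.)"""

    def __init__(self, n_arms, n_features, gain_threshold=0.1, ridge=1.0):
        self.n_arms = n_arms
        self.gain_threshold = gain_threshold
        # Context-free component: running mean reward per arm.
        self.counts = np.zeros(n_arms)
        self.means = np.zeros(n_arms)
        # Contextual component: one ridge-regression model per arm
        # (point estimates theta_a, as in LinUCB without the bonus term).
        self.A = [ridge * np.eye(n_features) for _ in range(n_arms)]
        self.b = [np.zeros(n_features) for _ in range(n_arms)]

    def select(self, context):
        # Shared choice: the arm with the highest average reward so far.
        mab_arm = int(np.argmax(self.means))
        # Contextual predictions theta_a^T x for each arm.
        preds = np.array([np.linalg.solve(self.A[a], self.b[a]) @ context
                          for a in range(self.n_arms)])
        ctx_arm = int(np.argmax(preds))
        # Use the context only if it promises a significant gain;
        # otherwise serve the shared, context-free arm.
        if preds[ctx_arm] - preds[mab_arm] > self.gain_threshold:
            return ctx_arm
        return mab_arm

    def update(self, arm, context, reward):
        # Update both components with the observed reward.
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]
        self.A[arm] += np.outer(context, context)
        self.b[arm] += reward * context
```

With a large `gain_threshold` the policy collapses to a context-free bandit (maximal commonality across users); with a threshold of zero it behaves like a purely contextual policy. The threshold is the knob that trades individualized treatments against a shared experience.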


Related research

Survey Bandits with Regret Guarantees (02/23/2020)
We consider a variant of the contextual bandit problem. In standard cont...

Contextual Bandits with Budgeted Information Reveal (05/29/2023)
Contextual bandit algorithms are commonly used in digital health to reco...

Explicit Feature Interaction-aware Uplift Network for Online Marketing (06/01/2023)
As a key component in online marketing, uplift modeling aims to accurate...

A Biologically Plausible Benchmark for Contextual Bandit Algorithms in Precision Oncology Using in vitro Data (11/11/2019)
Precision oncology, the genetic sequencing of tumors to identify druggab...

Greedy Algorithm almost Dominates in Smoothed Contextual Bandits (05/19/2020)
Online learning algorithms, widely used to power search and content opti...

Contextual Bandits in a Survey Experiment on Charitable Giving: Within-Experiment Outcomes versus Policy Learning (11/22/2022)
We design and implement an adaptive experiment (a “contextual bandit”) t...

A Contextual-bandit-based Approach for Informed Decision-making in Clinical Trials (09/01/2018)
Clinical trials involving multiple treatments utilize randomization of t...
