Linear Bandits with Feature Feedback

03/09/2019
by Urvashi Oswal, et al.

This paper explores a new form of the linear bandit problem in which the algorithm receives the usual stochastic rewards as well as stochastic feedback about which features are relevant to the rewards; this feature feedback is the novel aspect. The focus of the paper is the development of new theory and algorithms for linear bandits with feature feedback. We show that linear bandits with feature feedback can achieve regret over time horizon T that scales like k√T, without prior knowledge of which features are relevant or of the number k of relevant features. In comparison, the regret of traditional linear bandits scales like d√T, where d is the total number of (relevant and irrelevant) features, so the improvement can be dramatic if k ≪ d. The computational complexity of the new algorithm is proportional to k rather than d, making it much more suitable for real-world applications than traditional linear bandits. We demonstrate the performance of the new algorithm on synthetic and real human-labeled data.
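
To give a feel for the gap between the two rates, the following minimal sketch compares the leading-order scalings k√T and d√T. The values of d, k, and T are hypothetical and chosen only for illustration; constants and logarithmic factors are ignored, so the numbers indicate relative scale, not actual regret.

```python
import math


def regret_scale(num_features: int, horizon: int) -> float:
    """Leading-order scaling n * sqrt(T); constants and log factors are ignored."""
    return num_features * math.sqrt(horizon)


# Hypothetical values for illustration only (not taken from the paper).
d = 1000      # total number of features
k = 10        # number of relevant features
T = 10_000    # time horizon

full = regret_scale(d, T)     # scaling for a traditional linear bandit
sparse = regret_scale(k, T)   # scaling with feature feedback
ratio = full / sparse         # equals d / k, independent of T

print(f"d*sqrt(T) = {full:.0f}, k*sqrt(T) = {sparse:.0f}, ratio = {ratio:.0f}")
```

Because both rates share the √T factor, the ratio reduces to d/k regardless of the horizon, which is why the improvement grows as the fraction of relevant features shrinks.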

research
10/02/2019

Stochastic Bandits with Delayed Composite Anonymous Feedback

We explore a novel setting of the Multi-Armed Bandit (MAB) problem inspi...
research
03/23/2023

Stochastic Submodular Bandits with Delayed Composite Anonymous Bandit Feedback

This paper investigates the problem of combinatorial multiarmed bandits ...
research
07/29/2019

Bandits with Feedback Graphs and Switching Costs

We study the adversarial multi-armed bandit problem where partial observ...
research
06/18/2019

Simple Algorithms for Dueling Bandits

In this paper, we present simple algorithms for Dueling Bandits. We prov...
research
06/21/2019

Randomized Exploration in Generalized Linear Bandits

We study two randomized algorithms for generalized linear bandits, GLM-T...
research
02/16/2023

Linear Bandits with Memory: from Rotting to Rising

Nonstationary phenomena, such as satiation effects in recommendation, ar...
research
02/25/2019

Improved Algorithm on Online Clustering of Bandits

We generalize the setting of online clustering of bandits by allowing no...
