Hierarchical Exploration for Accelerating Contextual Bandits

06/27/2012
by Yisong Yue, et al.

Contextual bandit learning is an increasingly popular approach to optimizing recommender systems via user feedback, but can be slow to converge in practice due to the need for exploring a large feature space. In this paper, we propose a coarse-to-fine hierarchical approach for encoding prior knowledge that drastically reduces the amount of exploration required. Intuitively, user preferences can be reasonably embedded in a coarse low-dimensional feature space that can be explored efficiently, requiring exploration in the high-dimensional space only as necessary. We introduce a bandit algorithm that explores within this coarse-to-fine spectrum, and prove performance guarantees that depend on how well the coarse space captures the user's preferences. We demonstrate substantial improvement over conventional bandit algorithms through extensive simulation as well as a live user study in the setting of personalized news recommendation.
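
To make the coarse-to-fine idea concrete, the sketch below shows one way a LinUCB-style learner could combine a coarse k-dimensional projection with the full d-dimensional feature space. The projection matrix U, the class name CoarseToFineLinUCB, and the two exploration weights alpha_coarse and alpha_full are illustrative assumptions rather than the paper's exact algorithm: the point is only that the full-space estimate is shrunk toward the lifted coarse estimate, and that most of the exploration bonus is paid in the cheap coarse subspace.

```python
import numpy as np


class CoarseToFineLinUCB:
    """Illustrative sketch of coarse-to-fine linear UCB (not the paper's exact method)."""

    def __init__(self, U, alpha_coarse=1.0, alpha_full=0.1, reg=1.0):
        self.U = U                    # d x k projection onto the coarse subspace (assumed given)
        d, k = U.shape
        self.reg = reg
        self.A_c = reg * np.eye(k)    # coarse-space design matrix
        self.b_c = np.zeros(k)
        self.A_f = reg * np.eye(d)    # full-space design matrix
        self.b_f = np.zeros(d)
        self.alpha_c = alpha_coarse   # large: explore freely in the coarse subspace
        self.alpha_f = alpha_full     # small: pay only a little for full-space exploration

    def select(self, X):
        """Return the index of the arm (row of X, shape n_arms x d) with the largest UCB score."""
        Ac_inv = np.linalg.inv(self.A_c)
        Af_inv = np.linalg.inv(self.A_f)
        w_c = Ac_inv @ self.b_c                                  # coarse preference estimate
        w_f = Af_inv @ (self.b_f + self.reg * (self.U @ w_c))    # full estimate shrunk toward lifted coarse one
        Z = X @ self.U                                           # coarse projections of the candidate arms
        bonus_c = np.sqrt(np.einsum('ij,jk,ik->i', Z, Ac_inv, Z))
        bonus_f = np.sqrt(np.einsum('ij,jk,ik->i', X, Af_inv, X))
        return int(np.argmax(X @ w_f + self.alpha_c * bonus_c + self.alpha_f * bonus_f))

    def update(self, x, reward):
        """Fold the observed reward for the chosen arm's feature vector x into both models."""
        z = self.U.T @ x
        self.A_c += np.outer(z, z)
        self.b_c += reward * z
        self.A_f += np.outer(x, x)
        self.b_f += reward * x
```

Keeping alpha_full well below alpha_coarse captures the intended behavior: exploration is concentrated in the low-dimensional coarse subspace, with only a small bonus reserved for directions the coarse space cannot represent.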


06/04/2019

Toward Building Conversational Recommender Systems: A Contextual Bandit Approach

Contextual bandit algorithms have gained increasing popularity in recomm...

06/26/2022

Two-Stage Neural Contextual Bandits for Personalised News Recommendation

We consider the problem of personalised news recommendation where each u...

10/19/2021

Show Me the Whole World: Towards Entire Item Space Exploration for Interactive Personalized Recommendations

User interest exploration is an important and challenging topic in recom...

06/05/2021

Robust Stochastic Linear Contextual Bandits Under Adversarial Attacks

Stochastic linear contextual bandit algorithms have substantial applicat...

09/04/2020

Nearly Dimension-Independent Sparse Linear Bandit over Small Action Spaces via Best Subset Selection

We consider the stochastic contextual bandit problem under the high dime...

03/31/2010

Unbiased Offline Evaluation of Contextual-bandit-based News Article Recommendation Algorithms

Contextual bandit algorithms have become popular for online recommendati...