Hierarchical Exploration for Accelerating Contextual Bandits

by Yisong Yue et al.

Contextual bandit learning is an increasingly popular approach to optimizing recommender systems via user feedback, but can be slow to converge in practice due to the need for exploring a large feature space. In this paper, we propose a coarse-to-fine hierarchical approach for encoding prior knowledge that drastically reduces the amount of exploration required. Intuitively, user preferences can be reasonably embedded in a coarse low-dimensional feature space that can be explored efficiently, requiring exploration in the high-dimensional space only as necessary. We introduce a bandit algorithm that explores within this coarse-to-fine spectrum, and prove performance guarantees that depend on how well the coarse space captures the user's preferences. We demonstrate substantial improvement over conventional bandit algorithms through extensive simulation as well as a live user study in the setting of personalized news recommendation.
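The core idea in the abstract, exploring in a coarse low-dimensional subspace and escalating to the full feature space only as needed, can be illustrated with a minimal sketch. The class below is a hypothetical simplification, not the paper's actual algorithm: it runs a LinUCB-style rule in a coarse subspace given by a fixed projection matrix `U` (an assumed input; the paper learns such subspaces from prior user data), which is where the exploration savings come from.

```python
import numpy as np

class CoarseLinUCB:
    """Sketch of coarse-space exploration for a linear contextual bandit.

    Instead of exploring all `dim` feature directions, the learner acts in
    the k-dimensional subspace spanned by the columns of U (k << dim), so
    the confidence ellipsoid it must shrink is k-dimensional.
    """

    def __init__(self, proj, alpha=0.5):
        self.U = proj                  # (dim, k) coarse subspace basis
        self.alpha = alpha             # exploration scale (assumed constant)
        k = proj.shape[1]
        self.A = np.eye(k)             # ridge-regularized covariance
        self.b = np.zeros(k)           # reward-weighted feature sum

    def choose(self, arms):
        """Pick the arm (row of `arms`) with the highest coarse-space UCB."""
        Z = arms @ self.U              # project candidates to coarse space
        Ainv = np.linalg.inv(self.A)
        theta = Ainv @ self.b          # ridge estimate of coarse preferences
        # Quadratic form z_i^T A^{-1} z_i per arm gives the confidence width.
        width = np.sqrt(np.einsum('ij,jk,ik->i', Z, Ainv, Z))
        return int(np.argmax(Z @ theta + self.alpha * width))

    def update(self, arm, reward):
        """Incorporate the observed reward for the chosen arm's features."""
        z = arm @ self.U
        self.A += np.outer(z, z)
        self.b += reward * z
```

In the paper's coarse-to-fine spectrum, this coarse learner would be paired with a residual model over the full space; here only the coarse half is shown, since that is where the reduced exploration is easiest to see.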



Toward Building Conversational Recommender Systems: A Contextual Bandit Approach

Contextual bandit algorithms have gained increasing popularity in recomm...

Two-Stage Neural Contextual Bandits for Personalised News Recommendation

We consider the problem of personalised news recommendation where each u...

Show Me the Whole World: Towards Entire Item Space Exploration for Interactive Personalized Recommendations

User interest exploration is an important and challenging topic in recom...

Robust Stochastic Linear Contextual Bandits Under Adversarial Attacks

Stochastic linear contextual bandit algorithms have substantial applicat...

Nearly Dimension-Independent Sparse Linear Bandit over Small Action Spaces via Best Subset Selection

We consider the stochastic contextual bandit problem under the high dime...

Unbiased Offline Evaluation of Contextual-bandit-based News Article Recommendation Algorithms

Contextual bandit algorithms have become popular for online recommendati...