Scalable Neural Contextual Bandit for Recommender Systems

06/26/2023
by Zheqing Zhu, et al.

High-quality recommender systems ought to deliver both innovative and relevant content through effective and exploratory interactions with users. Yet, supervised learning-based neural networks, which form the backbone of many existing recommender systems, only leverage recognized user interests, falling short when it comes to efficiently uncovering unknown user preferences. While there has been some progress with neural contextual bandit algorithms towards enabling online exploration through neural networks, their onerous computational demands hinder widespread adoption in real-world recommender systems. In this work, we propose a scalable, sample-efficient neural contextual bandit algorithm for recommender systems. To do this, we design an epistemic neural network architecture, Epistemic Neural Recommendation (ENR), that enables Thompson sampling at a large scale. In two distinct large-scale experiments with real-world tasks, ENR significantly boosts click-through rates and user ratings by at least 9% compared to state-of-the-art neural contextual bandit algorithms. Furthermore, it achieves equivalent performance with at least 29% fewer user interactions than the best-performing baseline algorithm. Remarkably, while accomplishing these improvements, ENR demands orders of magnitude fewer computational resources than neural contextual bandit baseline algorithms.
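To make the idea of Thompson sampling with an epistemic neural network concrete, below is a minimal sketch in PyTorch. It assumes an epinet-style design (a base reward network plus a small head conditioned on a sampled epistemic index); the class name EpistemicRecommender, the index dimension, and all layer sizes are illustrative assumptions, not the paper's actual ENR architecture.

# Minimal sketch: Thompson sampling with an epistemic neural network.
# The epinet-style structure and all names/sizes here are assumptions,
# not the ENR architecture described in the paper.
import torch
import torch.nn as nn


class EpistemicRecommender(nn.Module):
    def __init__(self, feature_dim: int, index_dim: int = 8, hidden: int = 64):
        super().__init__()
        self.index_dim = index_dim
        # Base network: point estimate of expected reward (e.g., click probability logit).
        self.base = nn.Sequential(
            nn.Linear(feature_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )
        # Epinet: a small head conditioned on a sampled epistemic index z.
        # Its output varies with z, expressing uncertainty about the reward.
        self.epinet = nn.Sequential(
            nn.Linear(feature_dim + index_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )

    def forward(self, features: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
        # features: (num_candidates, feature_dim); z: (index_dim,)
        z_tiled = z.expand(features.shape[0], -1)
        return self.base(features) + self.epinet(torch.cat([features, z_tiled], dim=-1))


def thompson_sampling_step(model: EpistemicRecommender,
                           candidate_features: torch.Tensor) -> int:
    """Sample one epistemic index, score all candidates under it, recommend the best."""
    z = torch.randn(model.index_dim)           # one posterior-like sample
    with torch.no_grad():
        scores = model(candidate_features, z)  # (num_candidates, 1)
    return int(scores.argmax())

Under these assumptions, each interaction requires only one sampled index and one forward pass over the candidate set, which is why an epistemic-network approach can be far cheaper than maintaining and evaluating a full ensemble of networks.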


research
04/05/2023

Optimism Based Exploration in Large-Scale Recommender Systems

Bandit learning algorithms have been an increasingly popular design choi...
research
06/04/2019

Toward Building Conversational Recommender Systems: A Contextual Bandit Approach

Contextual bandit algorithms have gained increasing popularity in recomm...
research
04/13/2023

PIE: Personalized Interest Exploration for Large-Scale Recommender Systems

Recommender systems are increasingly successful in recommending personal...
research
06/27/2012

Hierarchical Exploration for Accelerating Contextual Bandits

Contextual bandit learning is an increasingly popular approach to optimi...
research
09/10/2019

Distributed Equivalent Substitution Training for Large-Scale Recommender Systems

We present Distributed Equivalent Substitution (DES) training, a novel d...
research
09/28/2020

Position-Based Multiple-Play Bandits with Thompson Sampling

Multiple-play bandits aim at displaying relevant items at relevant posit...
research
07/16/2020

Fast Distributed Bandits for Online Recommendation Systems

Contextual bandit algorithms are commonly used in recommender systems, w...
