Toward Building Conversational Recommender Systems: A Contextual Bandit Approach

06/04/2019
by Xiaoying Zhang, et al.

Contextual bandit algorithms have gained increasing popularity in recommender systems because they learn to adapt recommendations by trading off exploration and exploitation. Recommender systems equipped with traditional contextual bandit algorithms are usually trained with behavioral feedback (e.g., clicks) from users on items. Learning can be slow because behavioral feedback by nature does not carry sufficient information, so extensive exploration has to be performed. To address the problem, we propose conversational recommendation, in which the system occasionally asks the user questions about her interests. We first generalize the contextual bandit to leverage not only behavioral feedback (arm-level feedback) but also verbal feedback (users' interest in categories, topics, etc.). We then propose a new UCB-based algorithm and theoretically prove that it can indeed reduce the amount of exploration in learning. We also design several strategies for asking questions to further optimize the speed of learning. Experiments on synthetic data, Yelp data, and news recommendation data from Toutiao demonstrate the efficacy of the proposed algorithm.
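To make the idea of combining arm-level and category-level feedback concrete, here is a minimal sketch. It is not the algorithm proposed in the paper. It is a plain LinUCB learner that, as an illustrative simplification, treats an answer about a category as one extra noisy observation on the average feature vector of that category's items. The class and function names, the `ask_every` schedule, the "ask about the most uncertain category" rule, and the synthetic user vector are all assumptions made for this example.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


class LinUCB:
    """Ridge-regression estimate plus an upper-confidence bonus (shared parameters)."""

    def __init__(self, dim, alpha=1.0, lam=1.0):
        self.dim = dim
        self.alpha = alpha  # exploration weight (hypothetical default)
        # Inverse of the regularised design matrix; starts as (lam * I)^-1.
        self.A_inv = [[(1.0 / lam if i == j else 0.0) for j in range(dim)]
                      for i in range(dim)]
        self.b = [0.0] * dim

    def _apply_inv(self, x):
        return [dot(row, x) for row in self.A_inv]

    def theta(self):
        return self._apply_inv(self.b)

    def width(self, x):
        # Confidence width sqrt(x^T A^-1 x); shrinks as similar contexts are observed.
        return math.sqrt(max(dot(x, self._apply_inv(x)), 0.0))

    def ucb(self, x):
        return dot(self.theta(), x) + self.alpha * self.width(x)

    def update(self, x, reward):
        # Sherman-Morrison rank-one update of A^-1 after adding x x^T to A.
        ax = self._apply_inv(x)
        denom = 1.0 + dot(x, ax)
        for i in range(self.dim):
            for j in range(self.dim):
                self.A_inv[i][j] -= ax[i] * ax[j] / denom
        for i in range(self.dim):
            self.b[i] += reward * x[i]


def category_context(item_contexts):
    """Assumed category representation: the mean of its items' feature vectors."""
    n = len(item_contexts)
    return [sum(col) / n for col in zip(*item_contexts)]


def simulate(rounds=200, ask_every=5, seed=0):
    rng = random.Random(seed)
    true_theta = [0.8, -0.3, 0.5]  # hidden synthetic user preference (made up)
    categories = {
        "a": [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]],
        "b": [[0.0, 1.0, 0.0], [0.1, 0.9, 0.0]],
        "c": [[0.0, 0.0, 1.0], [0.0, 0.1, 0.9]],
    }
    arms = [x for items in categories.values() for x in items]
    model = LinUCB(dim=3)
    total = 0.0
    for t in range(1, rounds + 1):
        if t % ask_every == 0:
            # Assumed question strategy: ask about the most uncertain category.
            name = max(categories,
                       key=lambda c: model.width(category_context(categories[c])))
            cx = category_context(categories[name])
            model.update(cx, dot(true_theta, cx) + rng.gauss(0.0, 0.1))
        # Arm-level step: recommend the item with the highest upper confidence bound.
        x = max(arms, key=model.ucb)
        reward = dot(true_theta, x) + rng.gauss(0.0, 0.1)
        model.update(x, reward)
        total += reward
    return total, model.theta()
```

Calling `simulate()` returns the cumulative reward and the learned parameter estimate. The point of the sketch is only that each answered question shrinks the confidence width along a whole category direction at once, which is the intuition behind needing less arm-level exploration. How the paper actually models and weights verbal feedback is described in its full text.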


