Comparison-based Conversational Recommender System with Relative Bandit Feedback

by Zhihui Xie, et al.
Shandong University
Shanghai Jiao Tong University

With recent advances in conversational recommendation, a recommender system can actively and dynamically elicit user preferences through conversational interactions. To achieve this, the system periodically queries users' preferences on attributes and collects their feedback. However, most existing conversational recommender systems only allow the user to provide absolute feedback on attributes. In practice, absolute feedback is usually limited, as users tend to give biased feedback when expressing their preferences. Users are often more inclined to express comparative preferences, since user preferences are inherently relative. To enable users to provide comparative preferences during conversational interactions, we propose a novel comparison-based conversational recommender system. Relative feedback, though more practical, is not easy to incorporate, since its feedback scale is always mismatched with users' absolute preferences. To effectively collect and understand relative feedback in an interactive manner, we further propose a new bandit algorithm, which we call RelativeConUCB. Experiments on both synthetic and real-world datasets validate the advantage of our proposed method over existing bandit algorithms for conversational recommender systems.
