Model-free Reinforcement Learning with Stochastic Reward Stabilization for Recommender Systems

08/25/2023
by Tianchi Cai, et al.

Model-free RL-based recommender systems have recently received increasing research attention due to their ability to handle partial feedback and long-term rewards. However, most existing research has ignored a critical feature of recommender systems: a user's feedback on the same item at different times is random. This stochastic-reward property differs fundamentally from classic RL scenarios with deterministic rewards, and it makes RL-based recommender systems much more challenging. In this paper, we first demonstrate in a simulator environment that using direct stochastic feedback results in a significant drop in performance. Then, to handle the stochastic feedback more efficiently, we design two stochastic reward stabilization frameworks that replace the direct stochastic feedback with feedback learned by a supervised model. Both frameworks are model-agnostic, i.e., they can effectively utilize various supervised models. We demonstrate the superiority of the proposed frameworks over different RL-based recommendation baselines with extensive experiments on a recommendation simulator as well as an industrial-level recommender system.
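The core idea of swapping raw stochastic feedback for a supervised estimate can be illustrated with a small sketch. This is not the paper's implementation: the abstract does not specify the two frameworks, so the `MeanRewardModel` class, the `td_target` helper, the Bernoulli click model, and all parameter values below are hypothetical choices made only for illustration. The sketch compares the variance of one-step TD targets built from sampled clicks against targets built from a fitted reward model's prediction.

```python
import random
import statistics
from collections import defaultdict


class MeanRewardModel:
    """Hypothetical supervised reward model: running mean of feedback per (state, item)."""

    def __init__(self):
        self.total = defaultdict(float)
        self.count = defaultdict(int)

    def fit_one(self, state, item, feedback):
        # Accumulate one logged feedback sample.
        self.total[(state, item)] += feedback
        self.count[(state, item)] += 1

    def predict(self, state, item):
        # Estimated expected feedback; 0.0 for unseen pairs.
        n = self.count[(state, item)]
        return self.total[(state, item)] / n if n else 0.0


def td_target(reward, next_q_values, gamma=0.9):
    """One-step Q-learning target: r + gamma * max_a' Q(s', a')."""
    return reward + gamma * max(next_q_values)


def compare_targets(click_prob=0.3, n_logs=2000, seed=0):
    """Return (variance of raw targets, variance of stabilized targets, predicted reward)."""
    rng = random.Random(seed)
    state, item = "user_1", "item_1"  # assumed single (state, item) pair

    # Simulated stochastic feedback: the same user clicks the same item at random.
    feedback = [1.0 if rng.random() < click_prob else 0.0 for _ in range(n_logs)]

    model = MeanRewardModel()
    for f in feedback:
        model.fit_one(state, item, f)

    next_q = [0.0, 0.0]  # placeholder next-state Q-values, held fixed
    raw = [td_target(f, next_q) for f in feedback]
    predicted = model.predict(state, item)
    stabilized = [td_target(predicted, next_q) for _ in feedback]

    return statistics.pvariance(raw), statistics.pvariance(stabilized), predicted


if __name__ == "__main__":
    raw_var, stab_var, predicted = compare_targets()
    print(f"raw target variance:        {raw_var:.4f}")
    print(f"stabilized target variance: {stab_var:.4f}")
    print(f"predicted expected reward:  {predicted:.4f}")
```

In this toy setting the stabilized targets are constant for a fixed (state, item) pair, so all of the variance contributed by the random click is removed from the learning signal. A real system would replace the running mean with a trained supervised model, which is the sense in which such an approach can be model-agnostic.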
