Optimizing Audio Recommendations for the Long-Term: A Reinforcement Learning Perspective

02/07/2023
by Lucas Maystre, et al.

We study the problem of optimizing a recommender system for outcomes that occur over several weeks or months. We begin by drawing on reinforcement learning to formulate a comprehensive model of users' recurring relationships with a recommender system. Measurement, attribution, and coordination challenges complicate algorithm design. We describe careful modeling – including a new representation of user state and key conditional independence assumptions – which overcomes these challenges and leads to simple, testable recommender system prototypes. We apply our approach to a podcast recommender system that makes personalized recommendations to hundreds of millions of listeners. A/B tests demonstrate that purposefully optimizing for long-term outcomes leads to large performance gains over conventional approaches that optimize for short-term proxies.

research
05/23/2023

Optimizing Long-term Value for Auction-Based Recommender Systems via On-Policy Reinforcement Learning

Auction-based recommender systems are prevalent in online advertising pl...
research
05/29/2019

Reinforcement Learning for Slate-based Recommender Systems: A Tractable Decomposition and Practical Methodology

Most practical recommender systems focus on estimating immediate user en...
research
09/15/2020

Reinforcement Learning for Strategic Recommendations

Strategic recommendations (SR) refer to the problem where an intelligent...
research
07/19/2023

Impatient Bandits: Optimizing Recommendations for the Long-Term Without Delay

Recommender systems are a ubiquitous feature of online platforms. Increa...
research
05/26/2022

Constrained Reinforcement Learning for Short Video Recommendation

The wide popularity of short videos on social media poses new opportunit...
research
10/28/2022

Continuous Attribution of Episodical Outcomes for More Efficient and Targeted Online Measurement

Online experimentation platforms collect user feedback at low cost and l...
