Contextual User Browsing Bandits for Large-Scale Online Mobile Recommendation

08/21/2020
by Xu He, et al.

Online recommendation services recommend multiple commodities to users, and nowadays a considerable proportion of users visit e-commerce platforms via mobile devices. Due to the limited screen size of mobile devices, the positions of items strongly influence clicks: 1) higher positions lead to more clicks for the same commodity; 2) the 'pseudo-exposure' issue: only the first few recommended items are visible at first glance, and users must scroll to browse the rest, so lower-ranked items may never be viewed and should not be treated as negative samples. While many works model online recommendation as a contextual bandit problem, they rarely take the influence of positions into account, so their estimates of the reward function may be biased. In this paper, we address these two issues to improve the performance of online mobile recommendation. Our contributions are four-fold. First, since we care about the reward of a whole set of recommended items, we model online recommendation as a contextual combinatorial bandit problem and define the reward of a recommended set. Second, we propose a novel contextual combinatorial bandit method, UBM-LinUCB, that addresses the two position-related issues by adopting the User Browsing Model (UBM), a click model for web search. Third, we provide a formal regret analysis and prove that our algorithm achieves sublinear regret independent of the number of items. Finally, we evaluate our algorithm on two real-world datasets with a novel unbiased estimator, and also conduct an online experiment on Taobao, one of the most popular e-commerce platforms in the world. Results on two CTR metrics show that our algorithm outperforms competing contextual bandit algorithms.
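To make the position-weighting idea concrete, below is a minimal, hypothetical sketch in Python of a UBM-weighted LinUCB update. It is not the authors' implementation: the class name UBMLinUCB, the greedy ranking step, and the assumption that the UBM examination probabilities gamma[k, k'] (the probability that a user examines position k given the last click occurred at position k') are known in advance are all simplifications for illustration.

    import numpy as np

    class UBMLinUCB:
        """Hypothetical sketch of a UBM-weighted LinUCB; not the paper's exact algorithm.

        gamma[k, k_prime] is an assumed, pre-estimated User Browsing Model
        examination probability of position k+1 given the last click was at
        position k_prime (0 = no click yet). Shape: (n_positions, n_positions + 1).
        """

        def __init__(self, d, gamma, alpha=1.0, lam=1.0):
            self.A = lam * np.eye(d)   # regularized Gram matrix of ridge regression
            self.b = np.zeros(d)       # accumulated examination-weighted clicks
            self.gamma = gamma
            self.alpha = alpha         # width of the upper-confidence bonus

        def rank(self, X, n_positions):
            """Greedily fill the top positions with the highest-UCB items.

            X: (n_items, d) context matrix of candidate items. Simplification:
            the paper allocates positions more carefully, since examination
            probabilities differ across slots.
            """
            theta = np.linalg.solve(self.A, self.b)   # ridge estimate of weights
            A_inv = np.linalg.inv(self.A)
            bonus = np.sqrt(np.einsum('ij,jk,ik->i', X, A_inv, X))  # x^T A^-1 x per item
            ucb = X @ theta + self.alpha * bonus
            return np.argsort(-ucb)[:n_positions]

        def update(self, X_shown, clicks):
            """Scale each shown item's features by its slot's examination
            probability, so barely examined (pseudo-exposed) items contribute
            little as negative samples."""
            last_click = 0                            # 0 = no click so far
            for k, (x, c) in enumerate(zip(X_shown, clicks)):
                w = self.gamma[k, last_click]         # examination weight
                wx = w * x
                self.A += np.outer(wx, wx)            # weighted least-squares update
                self.b += wx * float(c)
                if c:
                    last_click = k + 1

The key point is the weighting in update: an item displayed far down the list, with a low examination probability, contributes almost nothing to the regression, which is how a pseudo-exposure bias in the reward estimate would be mitigated under these assumptions.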


Related research

09/04/2022
Exposure-Aware Recommendation using Contextual Bandits
Exposure bias is a well-known issue in recommender systems where items a...

01/23/2019
Thompson Sampling for a Fatigue-aware Online Recommendation System
In this paper we consider an online recommendation setting, where a plat...

01/22/2020
Incentivising Exploration and Recommendations for Contextual Bandits with Payments
We propose a contextual bandit based model to capture the learning and s...

07/16/2020
Fast Distributed Bandits for Online Recommendation Systems
Contextual bandit algorithms are commonly used in recommender systems, w...

08/07/2023
Mobile Supply: The Last Piece of Jigsaw of Recommender System
Recommendation system is a fundamental functionality of online platforms...

09/14/2020
Carousel Personalization in Music Streaming Apps with Contextual Bandits
Media services providers, such as music streaming platforms, frequently ...

04/28/2020
A Linear Bandit for Seasonal Environments
Contextual bandit algorithms are extremely popular and widely used in re...
