Beyond the Click-Through Rate: Web Link Selection with Multi-level Feedback

05/04/2018
by   Kun Chen, et al.

The web link selection problem is to select a small subset of links from a large pool and place them on a web page that can accommodate only a limited number of links, e.g., advertisements, recommendations, or news feeds. Although the long-studied click-through rate reflects the attractiveness of a link itself, revenue is obtained only from user actions after clicks, e.g., purchases made after a recommendation link directs the user to a product page. Web links therefore have an intrinsic multi-level feedback structure. With this observation, we consider the context-free web link selection problem, where the objective is to maximize revenue while ensuring that the attractiveness is no less than a preset threshold. The key challenge is that each link's multi-level feedback is stochastic and unobservable unless the link is selected. We model the problem as a constrained stochastic multi-armed bandit, design an efficient link selection algorithm called the Constrained Upper Confidence Bound algorithm (Con-UCB), and prove O(√(T ln T)) bounds on both the regret and the violation of the attractiveness constraint. Extensive experiments on three real-world datasets show that Con-UCB outperforms state-of-the-art context-free bandit algorithms with respect to the multi-level feedback structure.
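To make the constrained-bandit framing concrete, here is a minimal sketch of a constrained UCB-style link selector. It is not the Con-UCB algorithm as specified in the paper; the class name, the per-link bookkeeping, and the rule of maximizing optimistic revenue among links whose optimistic attractiveness clears the threshold are illustrative assumptions that only mirror the idea described in the abstract.

```python
import math
import random

class ConstrainedUCBSketch:
    """Illustrative constrained UCB-style link selector (not the paper's Con-UCB).

    Keeps optimistic estimates of both the click-through rate (attractiveness,
    level-1 feedback) and the post-click revenue (level-2 feedback), and prefers
    links whose optimistic attractiveness clears a preset threshold while
    maximizing optimistic revenue.
    """

    def __init__(self, n_links, attractiveness_threshold):
        self.n = n_links
        self.threshold = attractiveness_threshold
        self.pulls = [0] * n_links          # times each link was shown
        self.click_sum = [0.0] * n_links    # observed clicks
        self.revenue_sum = [0.0] * n_links  # observed post-click revenue
        self.t = 0                          # total rounds so far

    def _ucb(self, total, count):
        # Optimistic (upper-confidence) estimate; unplayed links get +inf.
        if count == 0:
            return float("inf")
        bonus = math.sqrt(2.0 * math.log(self.t + 1) / count)
        return total / count + bonus

    def select(self):
        """Pick one link index to display this round."""
        self.t += 1
        click_ucb = [self._ucb(self.click_sum[i], self.pulls[i]) for i in range(self.n)]
        rev_ucb = [self._ucb(self.revenue_sum[i], self.pulls[i]) for i in range(self.n)]
        # Links whose optimistic attractiveness clears the threshold.
        feasible = [i for i in range(self.n) if click_ucb[i] >= self.threshold]
        if feasible:
            # Among optimistically feasible links, maximize optimistic revenue.
            return max(feasible, key=lambda i: rev_ucb[i])
        # If no link looks feasible, fall back to the most attractive one.
        return max(range(self.n), key=lambda i: click_ucb[i])

    def update(self, link, clicked, revenue):
        """Record the multi-level feedback observed for the displayed link."""
        self.pulls[link] += 1
        self.click_sum[link] += 1.0 if clicked else 0.0
        self.revenue_sum[link] += revenue


# Toy usage with synthetic feedback (values are made up for illustration).
selector = ConstrainedUCBSketch(n_links=5, attractiveness_threshold=0.3)
for _ in range(1000):
    link = selector.select()
    clicked = random.random() < 0.2 + 0.1 * link      # synthetic click-through rate
    revenue = random.random() if clicked else 0.0     # synthetic post-click revenue
    selector.update(link, clicked, revenue)
```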


