Taking the Counterfactual Online: Efficient and Unbiased Online Evaluation for Ranking

07/24/2020
by Harrie Oosterhuis, et al.

Counterfactual evaluation can estimate Click-Through-Rate (CTR) differences between ranking systems based on historical interaction data, while mitigating the effects of position bias and item-selection bias. We introduce the novel Logging-Policy Optimization Algorithm (LogOpt), which optimizes the policy for logging data so that the counterfactual estimate has minimal variance. As minimizing variance leads to faster convergence, LogOpt increases the data-efficiency of counterfactual estimation. LogOpt turns the counterfactual approach, which is indifferent to the logging policy, into an online approach, where the algorithm decides what rankings to display. We prove that, as an online evaluation method, LogOpt is unbiased w.r.t. position and item-selection bias, unlike existing interleaving methods. Furthermore, we perform large-scale experiments by simulating comparisons between thousands of rankers. Our results show that while interleaving methods make systematic errors, LogOpt is as efficient as interleaving without being biased.
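To illustrate why the choice of logging policy matters for the variance of a counterfactual estimate, here is a minimal sketch using a plain inverse-propensity-scoring (IPS) estimator. It is not LogOpt or the estimator from the paper: it assumes a hypothetical toy setting in which a single item is displayed per interaction, so position bias is left out. All names (`true_ctr`, `simulate_logs`, `ips_estimate`, `ips_variance`) and the example probabilities are made up for illustration.

```python
import random

# Hypothetical toy setting (an assumption, not from the paper): one item is
# displayed per interaction and clicked with probability relevance[item].
relevance = {"a": 0.8, "b": 0.4, "c": 0.1}   # click probability per item
target = {"a": 0.7, "b": 0.2, "c": 0.1}      # policy whose CTR we want to estimate
uniform = {"a": 1 / 3, "b": 1 / 3, "c": 1 / 3}  # one possible logging policy


def true_ctr(policy, relevance):
    """Expected CTR of a policy: sum over items of P(shown) * P(click | shown)."""
    return sum(policy[d] * relevance[d] for d in policy)


def simulate_logs(logging, relevance, n, rng):
    """Sample n (shown_item, clicked) pairs under the logging policy."""
    items = list(logging)
    weights = [logging[d] for d in items]
    logs = []
    for _ in range(n):
        shown = rng.choices(items, weights=weights)[0]
        clicked = rng.random() < relevance[shown]
        logs.append((shown, clicked))
    return logs


def ips_estimate(logs, logging, target):
    """IPS estimate of the target policy's CTR from logged clicks.

    Each click is reweighted by target[d] / logging[d]; this is unbiased only
    if logging[d] > 0 for every item the target policy can show.
    """
    total = sum(target[d] / logging[d] for d, clicked in logs if clicked)
    return total / len(logs)


def ips_variance(logging, target, relevance):
    """Exact per-interaction variance of the IPS estimate in this toy setting."""
    second_moment = sum(
        relevance[d] * target[d] ** 2 / logging[d] for d in target
    )
    return second_moment - true_ctr(target, relevance) ** 2


rng = random.Random(0)
logs = simulate_logs(uniform, relevance, 20000, rng)
estimate = ips_estimate(logs, uniform, target)
```

In this sketch the estimate has the same expectation under any logging policy with full support, but `ips_variance` differs between logging policies (for example, between `uniform` and logging with `target` itself). That gap is the kind of quantity a variance-minimizing logging policy would aim to reduce.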

