Doubly-Robust Estimation for Unbiased Learning-to-Rank from Position-Biased Click Feedback

03/31/2022
by Harrie Oosterhuis, et al.

Clicks on rankings suffer from position bias: items at lower ranks are generally less likely to be examined, and thus clicked, by users, regardless of the users' actual preferences between items. The prevalent approach to unbiased click-based Learning-to-Rank (LTR) is counterfactual Inverse-Propensity-Scoring (IPS) estimation. A complication unique to LTR is that standard Doubly-Robust (DR) estimation, which combines IPS with regression predictions, is inapplicable because the treatment variable, which indicates whether a user examined an item, cannot be observed in the data. In this paper, we introduce a novel DR estimator that uses the expectation of treatment per rank instead. Our DR estimator has more robust unbiasedness conditions than the existing IPS approach and, in addition, provides enormous decreases in variance: our experimental results indicate that it requires several orders of magnitude fewer datapoints to converge at optimal performance. For the unbiased LTR field, our DR estimator contributes both an increase in state-of-the-art performance and the most robust theoretical guarantees of all known LTR estimators.
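To make the idea in the abstract concrete, here is a minimal sketch of an IPS estimate next to a DR-style estimate that uses the expected examination probability per rank in place of the unobserved examination event. It is an illustration under stated assumptions, not the estimator as defined in the paper: it assumes a simple position-based click model without trust bias, a single item, and known examination probabilities. All names (`alpha`, `rho`, `r_reg`, `simulate`) are hypothetical.

```python
import random


def ips_estimate(clicks, rho):
    """IPS-style estimate of an item's relevance.

    clicks: list of 0/1 click indicators, one per impression.
    rho: expected examination probability of the item under the
         logging policy (assumed known here).
    """
    return sum(clicks) / (len(clicks) * rho)


def dr_estimate(clicks, ranks, alpha, rho, r_reg):
    """DR-style estimate: regression prediction plus an IPS-weighted
    correction. Examination is never observed, so the correction uses
    alpha[rank], the *expected* examination probability at that rank.
    """
    residuals = (c - alpha[k] * r_reg for c, k in zip(clicks, ranks))
    return r_reg + sum(residuals) / (len(clicks) * rho)


def simulate(n, relevance, alpha, rank_probs, seed=0):
    """Simulate clicks for one item under the assumed click model:
    click = examined (prob. alpha[rank]) AND relevant (prob. relevance).
    """
    rng = random.Random(seed)
    clicks, ranks = [], []
    for _ in range(n):
        k = rng.choices(range(len(alpha)), weights=rank_probs)[0]
        examined = rng.random() < alpha[k]
        clicked = examined and rng.random() < relevance
        ranks.append(k)
        clicks.append(1 if clicked else 0)
    return clicks, ranks


if __name__ == "__main__":
    alpha = [1.0, 0.5, 0.25]        # assumed examination prob. per rank
    rank_probs = [0.2, 0.3, 0.5]    # assumed logging-policy rank distribution
    rho = sum(p * a for p, a in zip(rank_probs, alpha))
    clicks, ranks = simulate(20000, 0.6, alpha, rank_probs)
    print("IPS:", ips_estimate(clicks, rho))
    print("DR: ", dr_estimate(clicks, ranks, alpha, rho, r_reg=0.5))
```

In this sketch, the expected value of each residual `c - alpha[k] * r_reg` is `alpha[k] * (relevance - r_reg)`, so dividing the average residual by `rho` recovers `relevance - r_reg` in expectation. The DR-style estimate is therefore unbiased under the assumed click model even when `r_reg` is wrong, and it reduces to the IPS estimate when `r_reg = 0`.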

research
08/24/2020

When Inverse Propensity Scoring does not Work: Affine Corrections for Unbiased Learning to Rank

Besides position bias, which has been well-studied, trust bias is anothe...
research
05/18/2020

Policy-Aware Unbiased Learning to Rank for Top-k Rankings

Counterfactual Learning to Rank (LTR) methods optimize ranking systems u...
research
08/02/2021

The Bias-Variance Tradeoff of Doubly Robust Estimator with Targeted L_1 regularized Neural Networks Predictions

The Doubly Robust (DR) estimation of ATE can be carried out in 2 steps, ...
research
12/18/2021

Off-Policy Evaluation Using Information Borrowing and Context-Based Switching

We consider the off-policy evaluation (OPE) problem in contextual bandit...
research
06/24/2022

Reaching the End of Unbiasedness: Uncovering Implicit Limitations of Click-Based Learning to Rank

Click-based learning to rank (LTR) tackles the mismatch between click fr...
research
05/28/2021

Enhanced Doubly Robust Learning for Debiasing Post-click Conversion Rate Estimation

Post-click conversion, as a strong signal indicating the user preference...
research
04/22/2022

Learning-to-Rank at the Speed of Sampling: Plackett-Luce Gradient Estimation With Minimal Computational Complexity

Plackett-Luce gradient estimation enables the optimization of stochastic...