Unconfounded Propensity Estimation for Unbiased Ranking

05/17/2023
by Dan Luo, et al.

The goal of unbiased learning to rank (ULTR) is to leverage implicit user feedback for optimizing learning-to-rank systems. Among existing solutions, automatic ULTR algorithms that jointly learn user bias models (i.e., propensity models) with unbiased rankers have received considerable attention due to their superior performance and low deployment cost in practice. Despite their theoretical soundness, their effectiveness is usually justified under a weak logging policy, where the ranking model can barely rank documents according to their relevance to the query. However, when the logging policy is strong, e.g., an industry-deployed ranking policy, the reported effectiveness cannot be reproduced. In this paper, we first investigate ULTR from a causal perspective and uncover a negative result: existing ULTR algorithms fail to address the issue of propensity overestimation caused by the query-document relevance confounder. Then, we propose a new learning objective based on backdoor adjustment and highlight its differences from conventional propensity models, which reveal the prevalence of propensity overestimation. On top of that, we introduce a novel propensity model called the Logging-Policy-aware Propensity (LPP) model, together with a distinctive two-step optimization strategy that allows for the joint learning of LPP and ranking models within the automatic ULTR framework and actualizes unconfounded propensity estimation for ULTR. Extensive experiments on two benchmarks demonstrate the effectiveness and generalizability of the proposed method.
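
To make the confounding issue concrete, here is a minimal sketch of how a relevance-aware logging policy can distort a naive, click-based propensity estimate. It is not the LPP model or the paper's learning objective. It assumes a simple position-based click model (click probability = examination probability at the rank × relevance of the document), and all numbers and names (`EXAMINATION`, `RELEVANCE`, `expected_ctr`, `naive_propensity`) are hypothetical, chosen only for illustration.

```python
from itertools import permutations

# Assumed position-based click model: P(click) = examination[rank] * relevance[doc].
EXAMINATION = [1.0, 0.8, 0.6, 0.4, 0.2]  # true propensity per rank (illustrative)
RELEVANCE = [0.9, 0.7, 0.5, 0.3, 0.1]    # relevance of five documents (illustrative)


def expected_ctr(rankings):
    """Expected click rate per rank, averaged over equally likely rankings."""
    totals = [0.0] * len(EXAMINATION)
    for ranking in rankings:
        for rank, doc in enumerate(ranking):
            totals[rank] += EXAMINATION[rank] * RELEVANCE[doc]
    return [t / len(rankings) for t in totals]


def naive_propensity(ctr):
    """Click-rate ratio relative to rank 1; ignores relevance entirely."""
    return [c / ctr[0] for c in ctr]


docs = range(len(RELEVANCE))

# Weak logging policy: every ordering is equally likely, so rank and relevance
# are independent and the click-rate ratio recovers the true propensities.
weak_rankings = list(permutations(docs))

# Strong logging policy: always ranks by relevance, so relevance confounds
# the relationship between rank and clicks.
strong_rankings = [tuple(sorted(docs, key=lambda d: -RELEVANCE[d]))]

weak_estimate = naive_propensity(expected_ctr(weak_rankings))
strong_estimate = naive_propensity(expected_ctr(strong_rankings))

for rank, (true_p, weak_p, strong_p) in enumerate(
    zip(EXAMINATION, weak_estimate, strong_estimate), start=1
):
    print(f"rank {rank}: true={true_p:.3f} weak={weak_p:.3f} strong={strong_p:.3f}")
```

Under the random policy the naive ratio matches the assumed examination probabilities. Under the relevance-sorted policy it does not: the top rank appears far more examined relative to lower ranks than it is in the assumed model, because the documents placed there are also the most relevant.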
