Mixture-Based Correction for Position and Trust Bias in Counterfactual Learning to Rank
In counterfactual learning to rank (CLTR), user interactions are used as a source of supervision. Since user interactions are biased, an important focus of research in this field is developing methods that correct for this bias. Inverse propensity scoring (IPS) is a popular method for correcting position bias. Affine correction (AC) is a generalization of IPS that corrects for both position bias and trust bias. IPS and AC provably remove bias, provided that the bias parameters are estimated accurately. Estimating the bias parameters, in turn, requires an accurate estimation of the relevance probabilities. This cyclic dependency introduces practical limitations in terms of sensitivity, convergence and efficiency. We propose a new correction method for position and trust bias in CLTR in which, unlike existing methods, the correction does not rely on relevance estimation. Our proposed method, mixture-based correction (MBC), is based on the assumption that the distribution of the CTRs over the items being ranked is a mixture of two distributions: the distribution of CTRs for relevant items and the distribution of CTRs for non-relevant items. We prove that our method is unbiased. The validity of our proof is not conditioned on accurate bias parameter estimation. Our experiments show that MBC, when used in different bias settings and combined with different LTR algorithms, outperforms AC, the state-of-the-art method for correcting position and trust bias, in some settings and performs on par with it in the others. Furthermore, MBC is orders of magnitude more efficient than AC in training time.
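For readers unfamiliar with the two baselines named above, the following is a minimal sketch of IPS and AC, not the authors' implementation and not MBC itself. It assumes the standard affine click model in which the click probability at rank k for an item of relevance r is alpha[k] * r + beta[k]; the function names and all parameter values (theta, eps_plus, eps_minus) are hypothetical and chosen only for illustration.

```python
import numpy as np

def ips_correct(clicks, ranks, theta):
    """IPS: divide each click signal by the examination propensity of its rank."""
    return clicks / theta[ranks]

def affine_correct(clicks, ranks, alpha, beta):
    """AC: invert the assumed affine click model
    P(click | rank k, relevance r) = alpha[k] * r + beta[k]."""
    return (clicks - beta[ranks]) / alpha[ranks]

# Hypothetical bias parameters for three ranks (illustrative values only).
theta = np.array([1.0, 0.6, 0.3])        # P(examined | rank): position bias
eps_plus = np.array([0.95, 0.9, 0.85])   # P(click | examined, relevant)
eps_minus = np.array([0.2, 0.1, 0.05])   # P(click | examined, non-relevant): trust bias

# Affine parameters implied by the position and trust bias model above.
alpha = theta * (eps_plus - eps_minus)
beta = theta * eps_minus

ranks = np.array([0, 1, 2])
relevance = np.array([1.0, 0.0, 1.0])

# Use expected click rates instead of sampled clicks so the check is deterministic.
expected_clicks = alpha[ranks] * relevance + beta[ranks]

ac_estimate = affine_correct(expected_clicks, ranks, alpha, beta)
ips_estimate = ips_correct(expected_clicks, ranks, theta)

print("AC estimate: ", ac_estimate)
print("IPS estimate:", ips_estimate)
```

With the true bias parameters, AC recovers the relevance values in expectation, while IPS alone leaves the trust-bias distortion in place. Both corrections depend on knowing theta, eps_plus and eps_minus, which is the dependency on parameter estimation that the abstract identifies as a practical limitation.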