Calibrating Explore-Exploit Trade-off for Fair Online Learning to Rank

11/01/2021
by Yiling Jia, et al.

Online learning to rank (OL2R) has attracted great research interest in recent years, thanks to its advantage of avoiding the expensive relevance labeling required for offline supervised ranking model learning. Such a solution explores the unknowns (e.g., intentionally presenting selected results at top positions) to improve its relevance estimation. This, however, raises concerns about ranking fairness: different groups of items might receive differential treatment during the course of OL2R. Existing fair ranking solutions usually require knowledge of result relevance or a well-performing ranker beforehand, which contradicts the setting of OL2R and thus cannot be directly applied to guarantee fairness. In this work, we propose a general framework to achieve fairness, defined by group exposure, in OL2R. The key idea is to calibrate exploration and exploitation for fairness control, relevance learning, and online ranking quality. In particular, when the model explores a set of results for relevance feedback, we confine the exploration to a subset of random permutations in which fairness across groups is maintained while the feedback remains unbiased. Theoretically, we prove that such a strategy introduces minimal distortion in OL2R's regret to obtain fairness. Extensive empirical analysis on two public learning to rank benchmark datasets demonstrates the effectiveness of the proposed solution compared to existing fair OL2R solutions.
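The core idea of confining exploration to a fairness-preserving subset of permutations can be sketched as follows. This is a minimal illustration, not the paper's algorithm: it assumes a DCG-style position-bias weight for exposure, a per-group target exposure share, and a hypothetical tolerance parameter `tol`; the paper's actual fairness criterion and sampling scheme may differ.

```python
import itertools
import math
import random

def exposure(pos):
    """Position-bias weight for a 0-indexed rank (DCG-style assumption)."""
    return 1.0 / math.log2(pos + 2)

def group_exposures(perm, groups):
    """Total exposure each group receives under ranking `perm`."""
    exp = {}
    for pos, item in enumerate(perm):
        g = groups[item]
        exp[g] = exp.get(g, 0.0) + exposure(pos)
    return exp

def fair_permutations(items, groups, targets, tol=0.05):
    """Enumerate rankings whose per-group exposure share stays within
    `tol` of each group's target share (hypothetical criterion)."""
    total = sum(exposure(p) for p in range(len(items)))
    fair = []
    for perm in itertools.permutations(items):
        exp = group_exposures(perm, groups)
        if all(abs(exp.get(g, 0.0) / total - targets[g]) <= tol
               for g in targets):
            fair.append(perm)
    return fair

# Toy example: 4 items, two groups, equal target exposure shares.
items = [0, 1, 2, 3]
groups = {0: 'A', 1: 'A', 2: 'B', 3: 'B'}
targets = {'A': 0.5, 'B': 0.5}
candidates = fair_permutations(items, groups, targets, tol=0.1)
shown = random.choice(candidates)  # explore within the fair subset only
```

Sampling uniformly from `candidates` keeps the click feedback unbiased over that subset while ruling out rankings that concentrate a group's items in the top positions; brute-force enumeration is only feasible for short result lists.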


Related research:

- 01/17/2022 · Learning Neural Ranking Models Online from Implicit User Feedback
  Existing online learning to rank (OL2R) solutions are limited to linear ...
- 02/11/2021 · Fairness Through Regularization for Learning to Rank
  Given the abundance of applications of ranking in recent years, addressi...
- 02/28/2021 · PairRank: Online Pairwise Learning to Rank by Divide-and-Conquer
  Online Learning to Rank (OL2R) eliminates the need of explicit relevance...
- 06/13/2022 · Scalable Exploration for Neural Online Learning to Rank with Perturbed Feedback
  Deep neural networks (DNNs) demonstrate significant advantages in improv...
- 03/19/2021 · Individually Fair Ranking
  We develop an algorithm to train individually fair learning-to-rank (LTR...
- 06/11/2020 · Group-Fair Online Allocation in Continuous Time
  The theory of discrete-time online learning has been successfully applie...
- 05/03/2021 · Computationally Efficient Optimization of Plackett-Luce Ranking Models for Relevance and Fairness
  Recent work has proposed stochastic Plackett-Luce (PL) ranking models as...
