Efficient Online Learning of Optimal Rankings: Dimensionality Reduction via Gradient Descent
We consider a natural model of online preference aggregation, where sets of preferred items R_1, R_2, …, R_t along with a demand for k_t items in each R_t, appear online. Without prior knowledge of (R_t, k_t), the learner maintains a ranking π_t aiming that at least k_t items from R_t appear high in π_t. This is a fundamental problem in preference aggregation with applications to, e.g., ordering product or news items in web pages based on user scrolling and click patterns. The widely studied Generalized Min-Sum-Set-Cover (GMSSC) problem serves as a formal model for the setting above. GMSSC is NP-hard and the standard application of no-regret online learning algorithms is computationally inefficient, because they operate in the space of rankings. In this work, we show how to achieve low regret for GMSSC in polynomial-time. We employ dimensionality reduction from rankings to the space of doubly stochastic matrices, where we apply Online Gradient Descent. A key step is to show how subgradients can be computed efficiently, by solving the dual of a configuration LP. Using oblivious deterministic and randomized rounding schemes, we map doubly stochastic matrices back to rankings with a small loss in the GMSSC objective.
READ FULL TEXT