Towards the D-Optimal Online Experiment Design for Recommender Selection

10/23/2021
by   Da Xu, et al.
0

Selecting the optimal recommender via online exploration-exploitation is catching increasing attention where the traditional A/B testing can be slow and costly, and offline evaluations are prone to the bias of history data. Finding the optimal online experiment is nontrivial since both the users and displayed recommendations carry contextual features that are informative to the reward. While the problem can be formalized via the lens of multi-armed bandits, the existing solutions are found less satisfactorily because the general methodologies do not account for the case-specific structures, particularly for the e-commerce recommendation we study. To fill in the gap, we leverage the D-optimal design from the classical statistics literature to achieve the maximum information gain during exploration, and reveal how it fits seamlessly with the modern infrastructure of online inference. To demonstrate the effectiveness of the optimal designs, we provide semi-synthetic simulation studies with published code and data for reproducibility purposes. We then use our deployment example on Walmart.com to fully illustrate the practical insights and effectiveness of the proposed methods.

READ FULL TEXT
research
05/10/2013

Exponentiated Gradient LINUCB for Contextual Multi-Armed Bandits

We present Exponentiated Gradient LINUCB, an algorithm for con-textual m...
research
08/16/2019

Accelerated learning from recommender systems using multi-armed bandit

Recommendation systems are a vital component of many online marketplaces...
research
08/03/2020

Deep Bayesian Bandits: Exploring in Online Personalized Recommendations

Recommender systems trained in a continuous learning fashion are plagued...
research
04/16/2023

A Field Test of Bandit Algorithms for Recommendations: Understanding the Validity of Assumptions on Human Preferences in Multi-armed Bandits

Personalized recommender systems suffuse modern life, shaping what media...
research
08/31/2021

Max-Utility Based Arm Selection Strategy For Sequential Query Recommendations

We consider the query recommendation problem in closed loop interactive ...

Please sign up or login with your details

Forgot password? Click here to reset