Accelerated learning from recommender systems using multi-armed bandit

08/16/2019
by Meisam Hejazinia, et al.

Recommendation systems are a vital component of many online marketplaces, where there are often millions of items to potentially present to users who have a wide variety of wants or needs. Evaluating recommender system algorithms is a hard task, given the inherent biases in the data, and successful companies must be able to iterate rapidly on their solutions to maintain their competitive advantage. The gold standard for evaluating recommendation algorithms has been the A/B test, since it is an unbiased way to estimate how well one or more algorithms compare in the real world. However, A/B testing has a number of issues that make it impractical as the sole method of testing, including long lead times and a high cost of exploration. We argue for multi-armed bandit (MAB) testing as a solution to these issues. We showcase how we implemented a MAB solution as an extra step between offline testing and online A/B testing in a production system. We present the results of our experiment and compare the offline, MAB, and online A/B test metrics for our use case.
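To make the MAB idea concrete, here is a minimal sketch of how a bandit can shift traffic toward better-performing candidate recommenders during a test. It assumes Beta-Bernoulli Thompson sampling with a binary reward (for example, a click); the abstract does not say which bandit algorithm or reward the authors used, so the function name `thompson_sampling`, its parameters, and the simulated click rates are all hypothetical.

```python
import random


def thompson_sampling(true_rates, rounds, seed=0):
    """Simulate Beta-Bernoulli Thompson sampling over candidate recommenders.

    true_rates: hypothetical per-arm success probabilities (unknown to the
                bandit, used only to simulate user feedback).
    Returns the number of times each arm was served.
    """
    rng = random.Random(seed)
    n_arms = len(true_rates)
    successes = [0] * n_arms
    failures = [0] * n_arms
    pulls = [0] * n_arms

    for _ in range(rounds):
        # Draw one plausible success rate per arm from its Beta posterior
        # (uniform Beta(1, 1) prior), then serve the arm with the highest draw.
        samples = [
            rng.betavariate(successes[i] + 1, failures[i] + 1)
            for i in range(n_arms)
        ]
        arm = max(range(n_arms), key=samples.__getitem__)

        # Simulated binary feedback, e.g. whether the user clicked.
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
        pulls[arm] += 1

    return pulls


if __name__ == "__main__":
    # Three hypothetical recommender variants with made-up click rates.
    print(thompson_sampling([0.05, 0.10, 0.30], rounds=5000, seed=1))
```

Unlike a fixed-split A/B test, the allocation here adapts as evidence accumulates, so weaker variants receive progressively less traffic. This is the reduction in exploration cost that the abstract refers to.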

research
07/01/2021

The Use of Bandit Algorithms in Intelligent Interactive Recommender Systems

In today's business marketplace, many high-tech Internet enterprises con...
research
06/21/2021

BanditMF: Multi-Armed Bandit Based Matrix Factorization Recommender System

Multi-armed bandits (MAB) provide a principled online learning approach ...
research
10/23/2021

Towards the D-Optimal Online Experiment Design for Recommender Selection

Selecting the optimal recommender via online exploration-exploitation is...
research
09/21/2020

Bandits Under The Influence (Extended Version)

Recommender systems should adapt to user interests as the latter evolve....
research
09/06/2022

A Scalable Recommendation Engine for New Users and Items

In many digital contexts such as online news and e-tailing with many new...
research
09/11/2021

Existence conditions for hidden feedback loops in online recommender systems

We explore a hidden feedback loops effect in online recommender systems....
research
08/17/2019

A Batched Multi-Armed Bandit Approach to News Headline Testing

Optimizing news headlines is important for publishers and media sites. A...
