Widespread Flaws in Offline Evaluation of Recommender Systems

07/27/2023
by Balázs Hidasi, et al.

Even though offline evaluation is just an imperfect proxy for online performance, due to the interactive nature of recommenders, it will probably remain the primary means of evaluation in recommender systems research for the foreseeable future, since the proprietary nature of production recommenders prevents independent validation of A/B test setups and verification of online results. Therefore, it is imperative that offline evaluation setups are as realistic and as flawless as they can be. Unfortunately, evaluation flaws are quite common in recommender systems research nowadays, because later works copy flawed evaluation setups from their predecessors without questioning their validity. In the hope of improving the quality of offline evaluation of recommender systems, we discuss four of these widespread flaws and why researchers should avoid them.

research
08/22/2023

On the Opportunities and Challenges of Offline Reinforcement Learning for Recommender Systems

Reinforcement learning serves as a potent tool for modeling dynamic user...
research
09/19/2020

Modeling Online Behavior in Recommender Systems: The Importance of Temporal Context

Simulating online recommender system performance is notoriously difficul...
research
11/02/2022

Where Do We Go From Here? Guidelines For Offline Recommender Evaluation

Various studies in recent years have pointed out large issues in the off...
research
07/15/2021

Online Learning for Recommendations at Grubhub

We propose a method to easily modify existing offline Recommender System...
research
10/11/2018

A Distributed and Accountable Approach to Offline Recommender Systems Evaluation

Different software tools have been developed with the purpose of perform...
research
10/21/2020

On Offline Evaluation of Recommender Systems

In academic research, recommender models are often evaluated offline on ...
research
09/10/2018

The LKPY Package for Recommender Systems Experiments: Next-Generation Tools and Lessons Learned from the LensKit Project

Since 2010, we have built and maintained LensKit, an open-source toolkit...
