On the Difficulty of Evaluating Baselines: A Study on Recommender Systems

05/04/2019
by   Steffen Rendle, et al.

Numerical evaluations with comparisons to baselines play a central role when judging research in recommender systems. In this paper, we show that running baselines properly is difficult. We demonstrate this issue on two extensively studied datasets. First, we show that results for baselines that have been used in numerous publications over the past five years for the MovieLens 10M benchmark are suboptimal. With a careful setup of a vanilla matrix factorization baseline, we not only improve upon the reported results for this baseline but also outperform the reported results of every newly proposed method. Second, we recap the tremendous effort that the community needed to obtain high-quality results for simple methods on the Netflix Prize. Our results indicate that empirical findings in research papers are questionable unless they were obtained on standardized benchmarks where baselines have been tuned extensively by the research community.
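To make the term "vanilla matrix factorization baseline" concrete, the following is a minimal sketch of one common formulation: a global mean plus a dot product of user and item factors, trained by stochastic gradient descent with L2 regularization and scored by RMSE. It is not the authors' setup. The toy ratings, the function names `train_mf` and `rmse`, and all hyperparameter values (`dim`, `lr`, `reg`, `epochs`) are assumptions chosen for illustration, and the score is computed on the training data only, whereas a real benchmark would use held-out ratings.

```python
import math
import random


def rmse(ratings, predict):
    """Root mean squared error of predict(u, i) over (user, item, rating) triples."""
    squared_error = sum((r - predict(u, i)) ** 2 for u, i, r in ratings)
    return math.sqrt(squared_error / len(ratings))


def train_mf(train, n_users, n_items, dim=4, lr=0.02, reg=0.02, epochs=200, seed=0):
    """Plain matrix factorization with a global mean, trained by SGD.

    dim, lr, reg and epochs are illustrative defaults, not tuned values.
    """
    rng = random.Random(seed)
    mean = sum(r for _, _, r in train) / len(train)
    # Small random initial factors for every user and item.
    P = [[rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(n_users)]
    Q = [[rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(n_items)]

    def predict(u, i):
        return mean + sum(pu * qi for pu, qi in zip(P[u], Q[i]))

    data = list(train)
    for _ in range(epochs):
        rng.shuffle(data)
        for u, i, r in data:
            err = r - predict(u, i)
            for k in range(dim):
                pu, qi = P[u][k], Q[i][k]
                # Gradient step on squared error with L2 regularization.
                P[u][k] += lr * (err * qi - reg * pu)
                Q[i][k] += lr * (err * pu - reg * qi)
    return predict, mean


# Hypothetical toy data: (user, item, rating) triples for 4 users and 4 items.
train = [
    (0, 0, 5), (0, 1, 4), (0, 2, 1),
    (1, 0, 4), (1, 1, 5), (1, 3, 1),
    (2, 2, 5), (2, 3, 4), (2, 0, 1),
    (3, 2, 4), (3, 3, 5), (3, 1, 2),
]

predict, mean = train_mf(train, n_users=4, n_items=4)
mf_rmse = rmse(train, predict)
base_rmse = rmse(train, lambda u, i: mean)  # always predict the global mean
print(f"global-mean RMSE: {base_rmse:.3f}, MF RMSE: {mf_rmse:.3f}")
```

Even in this small sketch, the reported RMSE depends on the learning rate, regularization strength, factor dimension, and number of epochs. That sensitivity is the kind of setup detail the abstract refers to when it says that running baselines properly is difficult.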


research
08/25/2022

Lib-SibGMU – A University Library Circulation Dataset for Recommender Systems Development

We opensource under CC BY 4.0 license Lib-SibGMU - a university library ...
research
03/11/2023

PowerMat: context-aware recommender system without user item rating values that solves the cold-start problem

Recommender systems serves as an important technical asset in many moder...
research
11/18/2019

A Troubling Analysis of Reproducibility and Progress in Recommender Systems Research

The design of algorithms that generate personalized ranked item lists is...
research
02/15/2021

UserReg: A Simple but Strong Model for Rating Prediction

Collaborative filtering (CF) has achieved great success in the field of ...
research
03/02/2022

Top-N Recommendation Algorithms: A Quest for the State-of-the-Art

Research on recommender systems algorithms, like other areas of applied ...
research
09/12/2019

How robust is MovieLens? A dataset analysis for recommender systems

Research publication requires public datasets. In recommender systems, s...
research
11/02/2022

Where Do We Go From Here? Guidelines For Offline Recommender Evaluation

Various studies in recent years have pointed out large issues in the off...
