
Exploring Data Splitting Strategies for the Evaluation of Recommendation Models

by Zaiqiao Meng, et al.

Effective methodologies for evaluating recommender systems are critical, so that such systems can be compared in a sound manner. A commonly overlooked aspect of recommender system evaluation is the selection of the data splitting strategy. In this paper, we show both that there is no standard splitting strategy and that the selection of splitting strategy can have a strong impact on the ranking of recommender systems. In particular, we perform experiments comparing three common splitting strategies, examining their impact on seven state-of-the-art recommendation models across two datasets. Our results demonstrate that the splitting strategy employed is an important confounding variable that can markedly alter the ranking of state-of-the-art systems, making much of the currently published literature non-comparable, even when the same dataset and metrics are used.
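
To make the notion of a splitting strategy concrete, the following is a minimal Python sketch of three splits that are often seen in recommender evaluation: a random hold-out, a global temporal hold-out, and a per-user leave-one-last split. The abstract does not name the three strategies the paper compares, so these are assumptions chosen for illustration; the toy interaction log and all function names are hypothetical as well.

```python
import random
from collections import defaultdict

# Hypothetical toy interaction log: (user, item, timestamp).
INTERACTIONS = [
    ("u1", "a", 1), ("u1", "b", 5), ("u1", "c", 9),
    ("u2", "a", 2), ("u2", "d", 3), ("u2", "e", 8),
    ("u3", "b", 4), ("u3", "c", 6), ("u3", "f", 7),
]


def random_split(rows, test_ratio=0.2, seed=0):
    """Hold out a random fraction of interactions, ignoring time."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]


def temporal_split(rows, test_ratio=0.2):
    """Hold out the globally most recent fraction of interactions."""
    ordered = sorted(rows, key=lambda r: r[2])
    n_test = max(1, round(len(ordered) * test_ratio))
    return ordered[:-n_test], ordered[-n_test:]


def leave_one_last_split(rows):
    """Hold out each user's single most recent interaction."""
    by_user = defaultdict(list)
    for row in rows:
        by_user[row[0]].append(row)
    train, test = [], []
    for user_rows in by_user.values():
        user_rows.sort(key=lambda r: r[2])
        train.extend(user_rows[:-1])
        test.append(user_rows[-1])
    return train, test


if __name__ == "__main__":
    for name, split in [
        ("random", random_split),
        ("temporal", temporal_split),
        ("leave-one-last", leave_one_last_split),
    ]:
        train, test = split(INTERACTIONS)
        print(f"{name}: {len(train)} train, test = {sorted(test)}")
```

Even on this tiny log the three functions produce different test sets: the temporal split may leave some users with no test interactions, while leave-one-last guarantees exactly one per user. A model evaluated on one of these test sets is therefore not being measured on the same task as a model evaluated on another, which is the kind of confounding the abstract describes.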




Elliot: a Comprehensive and Rigorous Framework for Reproducible Recommender Systems Evaluation

Recommender Systems have shown to be an effective way to alleviate the o...

Distributed Equivalent Substitution Training for Large-Scale Recommender Systems

We present Distributed Equivalent Substitution (DES) training, a novel d...

From Intrinsic to Counterfactual: On the Explainability of Contextualized Recommender Systems

With the prevalence of deep learning based embedding approaches, recomme...

Improving Recommendation System Serendipity Through Lexicase Selection

Recommender systems influence almost every aspect of our digital lives. ...

A Case Study on Sampling Strategies for Evaluating Neural Sequential Item Recommendation Models

At the present time, sequential item recommendation models are compared ...

SVP-CF: Selection via Proxy for Collaborative Filtering Data

We study the practical consequences of dataset sampling strategies on th...

On Sampling Collaborative Filtering Datasets

We study the practical consequences of dataset sampling strategies on th...