FuxiCTR: An Open Benchmark for Click-Through Rate Prediction
In many applications, such as recommender systems, online advertising, and product search, click-through rate (CTR) prediction is a critical task, because its accuracy directly affects both platform revenue and user experience. In recent years, with the prevalence of deep learning, CTR prediction has been widely studied in both academia and industry, resulting in an abundance of deep CTR models. Unfortunately, there is still a lack of a standardized benchmark and uniform evaluation protocols for CTR prediction, which leads to non-reproducible and even inconsistent experimental results across these studies. In this paper, we present an open benchmark (namely FuxiCTR) for reproducible research and provide a rigorous comparison of different models for CTR prediction. Specifically, we ran over 4,600 experiments, totaling more than 12,000 GPU hours, in a uniform framework to re-evaluate 24 existing models on two widely used datasets, Criteo and Avazu. Surprisingly, our experiments show that many models differ less than expected and are sometimes even inconsistent with what is reported in the literature. We believe that our benchmark will not only allow researchers to gauge the effectiveness of new models conveniently, but also share good practices for fair comparison with the state of the art. We will release all the code and benchmark settings.
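To make the idea of a "uniform evaluation protocol" concrete, the sketch below shows the two ingredients such a benchmark typically fixes across all models: pinned random seeds and a shared metric computation (AUC and logloss, the standard CTR metrics). This is an illustrative example only, not FuxiCTR's actual API; the function names and the model/data-loader interface are assumptions.

```python
# Illustrative sketch (not FuxiCTR's actual API): the kind of fixed,
# uniform evaluation protocol a CTR benchmark standardizes.
import random

import numpy as np
import torch
from sklearn.metrics import log_loss, roc_auc_score


def set_seed(seed: int) -> None:
    """Pin all RNGs so every model is trained and evaluated under identical conditions."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)


def evaluate(model, data_loader) -> dict:
    """Score a trained CTR model on a held-out split with the two standard metrics."""
    model.eval()
    labels, preds = [], []
    with torch.no_grad():
        for features, y in data_loader:
            p = torch.sigmoid(model(features))  # predicted click probability
            labels.append(y.numpy())
            preds.append(p.numpy())
    y_true = np.concatenate(labels)
    y_pred = np.concatenate(preds)
    return {
        "AUC": roc_auc_score(y_true, y_pred),
        "logloss": log_loss(y_true, y_pred),
    }
```

Holding the seed, data split, and metric code constant while only the model varies is what makes results across the 24 re-evaluated models directly comparable.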