Replication Markets: Results, Lessons, Challenges and Opportunities in AI Replication

by Yang Liu, et al.

The last decade saw the emergence of systematic large-scale replication projects in the social and behavioral sciences (Camerer et al., 2016, 2018; Ebersole et al., 2016; Klein et al., 2014, 2018; Open Science Collaboration, 2015). These projects were driven by theoretical and conceptual concerns about a high fraction of "false positives" in scientific publications (Ioannidis, 2005) and a high prevalence of "questionable research practices" (Simmons, Nelson, and Simonsohn, 2011). Concerns about the credibility of research findings are not unique to the behavioral and social sciences; within Computer Science, Artificial Intelligence (AI) and Machine Learning (ML) are areas of particular concern (Lucic et al., 2018; Freire, Bonnet, and Shasha, 2012; Gundersen and Kjensmo, 2018; Henderson et al., 2018). Given the pioneering role of the behavioral and social sciences in promoting novel methodologies to improve the credibility of research, a promising approach is to analyze the lessons learned in this field and adapt its strategies for Computer Science, AI, and ML. In this paper, we review approaches used in the behavioral and social sciences and in the DARPA SCORE project. We focus in particular on the role of human forecasting of replication outcomes, and on how forecasting can leverage the information gained from relatively labor- and resource-intensive replications. We discuss opportunities and challenges of using these approaches to monitor and improve the credibility of research areas in Computer Science, AI, and ML.



Serious Games and AI: Challenges and Opportunities for Computational Social Science

The video game industry plays an essential role in the entertainment sph...

Mathematical Foundations for Social Computing

Social computing encompasses the mechanisms through which people interac...

The Perils of Advocacy

Statisticians and data scientists find insights that help lead to better...

Predicting replicability – analysis of survey and prediction market data from large-scale forecasting projects

The reproducibility of published research has become an important topic ...

A Synthetic Prediction Market for Estimating Confidence in Published Work

Explainably estimating confidence in published scholarly work offers opp...

The worst of both worlds: A comparative analysis of errors in learning from data in psychology and machine learning

Recent concerns that machine learning (ML) may be facing a reproducibili...

Why "Redefining Statistical Significance" Will Not Improve Reproducibility and Could Make the Replication Crisis Worse

A recent proposal to "redefine statistical significance" (Benjamin, et a...
