NIPS - Not Even Wrong? A Systematic Review of Empirically Complete Demonstrations of Algorithmic Effectiveness in the Machine Learning and Artificial Intelligence Literature

12/18/2018
by   Franz J Király, et al.

Objective: To determine the completeness of the argumentative steps necessary to conclude effectiveness of an algorithm, in a sample of current ML/AI supervised learning literature.

Data Sources: Papers published in the Neural Information Processing Systems (NeurIPS, née NIPS) proceedings with an official 2017 publication year.

Eligibility Criteria: Studies reporting a (semi-)supervised model, or pre-processing fused with (semi-)supervised models, for tabular data.

Study Appraisal: Three reviewers applied the assessment criteria to determine argumentative completeness. The criteria were split into three groups: experiments (e.g. real and/or synthetic data), baselines (e.g. uninformed and/or state-of-the-art), and quantitative comparison (e.g. performance quantifiers with confidence intervals and formal comparison of the algorithm against baselines).

Results: Of the 121 eligible manuscripts (from a sample of 679 abstracts), 99% used real-world data and 29% used synthetic data. 91% of manuscripts did not report an uninformed baseline, and 55% reported a state-of-the-art baseline. 32% reported confidence intervals for performance, but none provided references or exposition for how these were calculated. 3% reported formal comparisons.

Limitations: The use of one venue as the primary information source may not be representative of all ML/AI literature. However, the NeurIPS conference is recognised to be amongst the top tier concerning ML/AI studies, so it is reasonable to consider its corpus representative of high-quality research.

Conclusion: Using the 2017 sample of the NeurIPS supervised learning corpus as an indicator of the quality and trustworthiness of current ML/AI research, complete argumentative chains in demonstrations of algorithmic effectiveness appear to be rare.
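The quantitative-comparison criteria above (an uninformed baseline, confidence intervals for performance, and a formal comparison against the baseline) can be illustrated with a short sketch. The per-example correctness data below are synthetic, and the specific choices of a percentile bootstrap and an exact sign test are illustrative assumptions, not methods prescribed by the review.

```python
import random
from math import comb

random.seed(0)

# Hypothetical 0/1 correctness indicators for a model and an uninformed
# (e.g. majority-class) baseline on the same 200 test points (synthetic data).
model_correct = [1 if random.random() < 0.80 else 0 for _ in range(200)]
baseline_correct = [1 if random.random() < 0.55 else 0 for _ in range(200)]

def bootstrap_ci(scores, n_boot=2000, alpha=0.05):
    """Percentile bootstrap confidence interval for mean accuracy."""
    means = []
    for _ in range(n_boot):
        resample = [random.choice(scores) for _ in scores]
        means.append(sum(resample) / len(resample))
    means.sort()
    lo = means[int(alpha / 2 * n_boot)]
    hi = means[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

ci = bootstrap_ci(model_correct)

# Formal comparison: exact paired sign test on examples where the two disagree.
wins = sum(m > b for m, b in zip(model_correct, baseline_correct))
losses = sum(m < b for m, b in zip(model_correct, baseline_correct))
n = wins + losses
# One-sided p-value under H0: wins ~ Binomial(n, 0.5).
p_value = sum(comb(n, k) for k in range(wins, n + 1)) / 2 ** n

print(f"model accuracy: {sum(model_correct) / len(model_correct):.3f}")
print(f"95% bootstrap CI: ({ci[0]:.3f}, {ci[1]:.3f})")
print(f"sign test p-value vs uninformed baseline: {p_value:.4g}")
```

Reporting both the interval and the test is what makes the comparison "complete" in the review's sense: a point estimate alone says nothing about whether the gap over the baseline exceeds sampling noise.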


Related research

- The Integration of Machine Learning into Automated Test Generation: A Systematic Literature Review (06/21/2022). Context: Machine learning (ML) may enable effective automated test gener...
- Systematic Literature Review of Validation Methods for AI Systems (07/26/2021). Context: Artificial intelligence (AI) has made its way into everyday act...
- Artificial Intelligence in Ovarian Cancer Histopathology: A Systematic Review (03/31/2023). Purpose: To characterise and assess the quality of published research e...
- Confidence Intervals for Recursive Journal Impact Factors (05/31/2022). We compute confidence intervals for recursive impact factors, that take ...
- The Effects of Data Quality on Machine Learning Performance (07/29/2022). Modern artificial intelligence (AI) applications require large quantitie...
- Sources of Irreproducibility in Machine Learning: A Review (04/15/2022). Lately, several benchmark studies have shown that the state of the art i...
- Survey of ETA prediction methods in public transport networks (04/10/2019). The majority of public transport vehicles are fitted with Automatic Vehi...
