On the Use of Default Parameter Settings in the Empirical Evaluation of Classification Algorithms

03/20/2017
by Anthony Bagnall, et al.

We demonstrate that, for a range of state-of-the-art machine learning algorithms, the differences in generalisation performance obtained using default parameter settings and using parameters tuned via cross-validation can be similar in magnitude to the differences in performance observed between state-of-the-art and uncompetitive learning systems. This means that fair and rigorous evaluation of new learning algorithms requires performance comparison against benchmark methods with best-practice model selection procedures, rather than using default parameter settings. We investigate the sensitivity of three key machine learning algorithms (support vector machine, random forest and rotation forest) to their default parameter settings, and provide guidance on determining sensible default parameter values for implementations of these algorithms. We also conduct an experimental comparison of these three algorithms on 121 classification problems and find that, perhaps surprisingly, rotation forest is significantly more accurate on average than both random forest and a support vector machine.
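The comparison the abstract describes, default parameter settings against parameters tuned via cross-validation, can be sketched in a few lines. The following is only an illustration under stated assumptions: it uses a hand-written k-nearest-neighbour classifier as a stand-in (not the support vector machine, random forest or rotation forest studied in the paper), synthetic two-class data, and a hypothetical default of k = 1 with a hypothetical candidate grid. None of these choices or the resulting numbers come from the paper.

```python
import random
from collections import Counter


def make_data(n, rng, noise=1.0):
    """Synthetic two-class data: two noisy 2-D Gaussian blobs (illustrative assumption)."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        centre = 1.0 if label else -1.0
        x = (rng.gauss(centre, noise), rng.gauss(centre, noise))
        data.append((x, label))
    return data


def knn_predict(train, x, k):
    """Predict the majority label among the k training points closest to x."""
    nearest = sorted(
        train,
        key=lambda p: (p[0][0] - x[0]) ** 2 + (p[0][1] - x[1]) ** 2,
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, test, k):
    """Fraction of test points classified correctly by k-NN fitted on train."""
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)


def cv_accuracy(data, k, folds=5):
    """Mean validation accuracy over `folds` interleaved cross-validation folds."""
    scores = []
    for f in range(folds):
        val = data[f::folds]
        trn = [p for i, p in enumerate(data) if i % folds != f]
        scores.append(accuracy(trn, val, k))
    return sum(scores) / folds


def tune_k(train, grid, folds=5):
    """Pick the k in `grid` with the best cross-validated accuracy on the training set only."""
    return max(grid, key=lambda k: cv_accuracy(train, k, folds))


rng = random.Random(0)          # fixed seed so the sketch is repeatable
train = make_data(200, rng)
test = make_data(500, rng)      # held out; never used for tuning

DEFAULT_K = 1                   # hypothetical "default" setting
grid = [1, 3, 5, 9, 15, 25]     # hypothetical candidate values (odd, to avoid ties)

best_k = tune_k(train, grid)
default_acc = accuracy(train, test, DEFAULT_K)
tuned_acc = accuracy(train, test, best_k)

print(f"default k={DEFAULT_K}: test accuracy {default_acc:.3f}")
print(f"tuned   k={best_k}: test accuracy {tuned_acc:.3f}")
```

The point of the design is that tuning touches only the training data, so the held-out test accuracy of the tuned model remains a fair estimate. In a benchmark comparison, the same procedure would be applied to every competing method before their test accuracies are compared.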


