Does the evaluation stand up to evaluation? A first-principle approach to the evaluation of classifiers

02/21/2023
by K. Dyrland, et al.

How can one meaningfully make a measurement if the meter does not conform to any standard and its scale expands or shrinks depending on what is measured? In the present work it is argued that current evaluation practices for machine-learning classifiers are affected by this kind of problem, leading to negative consequences when classifiers are put to real use; consequences that could have been avoided. It is proposed that evaluation be grounded in Decision Theory, and the implications of such a foundation are explored. The main result is that every evaluation metric must be a linear combination of confusion-matrix elements, with coefficients ("utilities") that depend on the specific classification problem. For binary classification, the space of such possible metrics is effectively two-dimensional. It is shown that popular metrics such as precision, balanced accuracy, Matthews Correlation Coefficient, Fowlkes-Mallows index, F1-measure, and Area Under the Curve are never optimal: they always give rise to an in-principle avoidable fraction of incorrect evaluations. This fraction is even larger than would be caused by the use of a decision-theoretic metric with moderately wrong coefficients.
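To make the "linear combination of confusion-matrix elements" concrete, here is a minimal Python sketch. It is not the authors' code. The function name `expected_utility`, the row/column convention (rows = true class, columns = predicted class), and all numbers are illustrative assumptions. It shows that plain accuracy is the special case of an identity utility matrix, and that two classifiers can swap rank once problem-specific utilities are used.

```python
import numpy as np


def expected_utility(confusion, utilities):
    """Average utility per classified item: sum_ij U[i, j] * C[i, j] / N.

    Assumed convention (not taken from the paper): rows index the true
    class, columns index the predicted class, for both matrices.
    """
    confusion = np.asarray(confusion, dtype=float)
    utilities = np.asarray(utilities, dtype=float)
    if confusion.shape != utilities.shape:
        raise ValueError("confusion and utilities must have the same shape")
    total = confusion.sum()
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return float((utilities * confusion).sum() / total)


# Hypothetical confusion matrices for two classifiers on the same 100 items.
confusion_a = [[85, 5],   # true negatives, false positives
               [8, 2]]    # false negatives, true positives
confusion_b = [[70, 20],
               [1, 9]]

# Accuracy is the linear metric whose utility matrix is the identity.
accuracy_utilities = np.eye(2)

# Made-up utilities for a problem where a missed positive is very costly.
costly_miss_utilities = [[1, -1],
                         [-20, 5]]

acc_a = expected_utility(confusion_a, accuracy_utilities)
acc_b = expected_utility(confusion_b, accuracy_utilities)
util_a = expected_utility(confusion_a, costly_miss_utilities)
util_b = expected_utility(confusion_b, costly_miss_utilities)

print(f"accuracy: A={acc_a:.2f}, B={acc_b:.2f}")
print(f"utility:  A={util_a:.2f}, B={util_b:.2f}")
```

With these invented numbers, classifier A scores higher on accuracy while classifier B scores higher on the utility-weighted metric, which illustrates why the coefficients have to come from the specific classification problem.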
