The Optimal Input-Independent Baseline for Binary Classification: The Dutch Draw

01/09/2023
by   Joris Pries, et al.

Before any binary classification model is put into practice, its performance must be validated on a proper test set. Without the frame of reference that a baseline method provides, it is impossible to determine whether a score is 'good' or 'bad'. The goal of this paper is to examine all baseline methods that are independent of feature values and to determine which one is 'best' and why. Identifying the optimal baseline simplifies a crucial selection decision in the evaluation process. We prove that the recently proposed Dutch Draw baseline is the best input-independent classifier (independent of feature values) for all positional-invariant measures (independent of sequence order), assuming that the samples are randomly shuffled. The Dutch Draw baseline is therefore the optimal baseline under these intuitive requirements and should be used in practice.
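The abstract does not spell out how an input-independent baseline is computed, so the following is only a rough sketch under stated assumptions. It assumes, following the earlier Dutch Draw paper listed below, that such a classifier labels a fixed fraction theta of the shuffled samples as positive without looking at any features, and that the baseline is the best expected score over theta. The function names (`dutch_draw_predictions`, `estimate_baseline`), the Monte Carlo averaging, and the use of accuracy as the measure are illustrative choices, not the authors' implementation.

```python
import random


def dutch_draw_predictions(n_samples, theta, rng=None):
    """Label round(theta * n_samples) randomly chosen samples as 1, the rest as 0.

    No feature values are used: this is the assumed input-independent draw.
    """
    rng = rng or random.Random()
    n_positive = round(theta * n_samples)
    chosen = set(rng.sample(range(n_samples), n_positive))
    return [1 if i in chosen else 0 for i in range(n_samples)]


def accuracy(y_true, y_pred):
    """Fraction of matching labels; depends only on counts, not on sample order."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def estimate_baseline(y_true, measure, n_draws=200, seed=0):
    """Monte Carlo estimate of the best mean score over theta in {0, 1/M, ..., 1}.

    Hypothetical helper: averages `measure` over repeated random draws for each
    theta and returns (best_theta, best_mean_score). Ties keep the smaller theta.
    """
    rng = random.Random(seed)
    m = len(y_true)
    best_theta, best_score = 0.0, float("-inf")
    for k in range(m + 1):
        theta = k / m
        total = 0.0
        for _ in range(n_draws):
            total += measure(y_true, dutch_draw_predictions(m, theta, rng))
        mean_score = total / n_draws
        if mean_score > best_score:
            best_theta, best_score = theta, mean_score
    return best_theta, best_score


# Example: 3 positives and 7 negatives, scored with accuracy.
labels = [1] * 3 + [0] * 7
theta_star, baseline_score = estimate_baseline(labels, accuracy)
```

In this example the negative class is the majority, so predicting everything negative (theta = 0) gives the highest expected accuracy. A trained model would then need to beat that score to count as better than the baseline.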

research
03/24/2022

The Dutch Draw: Constructing a Universal Baseline for Binary Prediction Models

Novel prediction methods should always be compared to a baseline to know...
research
07/24/2021

A Model-Agnostic Algorithm for Bayes Error Determination in Binary Classification

This paper presents the intrinsic limit determination algorithm (ILD Alg...
research
12/22/2021

Classifier Data Quality: A Geometric Complexity Based Method for Automated Baseline And Insights Generation

Testing Machine Learning (ML) models and AI-Infused Applications (AIIAs)...
research
06/02/2018

Binary Classification with Karmic, Threshold-Quasi-Concave Metrics

Complex performance measures, beyond the popular measure of accuracy, ar...
research
05/22/2021

Learning Baseline Values for Shapley Values

This paper aims to formulate the problem of estimating the optimal basel...
research
06/08/2020

A Baseline for Shapley Values in MLPs: from Missingness to Neutrality

Being able to explain a prediction as well as having a model that perfor...
research
08/19/2021

Optimally Efficient Sequential Calibration of Binary Classifiers to Minimize Classification Error

In this work, we aim to calibrate the score outputs of an estimator for ...
