Master your Metrics with Calibration

by Wissam Siblini, et al.
Jean Monnet University

Machine learning models deployed in real-world applications are often evaluated with precision-based metrics such as the F1-score or AUC-PR (Area Under the Precision-Recall Curve). Heavily dependent on the class prior, such metrics may sometimes lead to wrong conclusions about the performance. For example, when dealing with non-stationary data streams, they do not allow the user to discern the reasons why a model's performance varies across different periods. In this paper, we propose a way to calibrate the metrics so that they are no longer tied to the class prior. The calibration corresponds to a readjustment, based on probabilities, to the value that the metric would have if the class prior were equal to a reference prior (a user parameter). We conduct a large number of experiments on balanced and imbalanced data to assess the behavior of calibrated metrics and show that they improve interpretability and provide better control over what is really measured. We describe specific real-world use-cases where calibration is beneficial such as, for instance, model monitoring in production, reporting, or fairness evaluation.


1 Introduction

In real-world machine learning systems, the predictive performance of a model is often evaluated on several test datasets rather than one, and comparisons are made. These datasets can correspond to sub-populations in the data, or to different periods in time [18, 15]. Choosing the best-suited metrics for these comparisons is not a trivial task. Some metrics may prevent a proper interpretation of the performance differences between the sets [8, 14], especially because the latter generally have not only a different class prior P(Y) but also a different likelihood P(X|Y). For instance, a metric dependent on the prior (e.g. precision, which is naturally higher if positive examples are globally more frequent) will be affected by both differences indiscernibly [3], but a practitioner could be interested in isolating the variation of performance due to the likelihood, which reflects the model's intrinsic performance. Take the example of comparing the performance of a model across time periods: at time t, we receive data drawn from

P_t(X, Y) = P_t(Y) · P_t(X|Y),

where X are the features and Y the label. Hence the optimal scoring function (i.e. model) for this dataset is the likelihood ratio [11]:

s_t(x) = P_t(X = x | Y = 1) / P_t(X = x | Y = 0).   (1)

In particular, if P_t(X|Y) does not vary with time, neither will s_t. In this case, even if the prior P_t(Y) varies, it is desirable to have a performance metric that takes the same value at every period, so that the model maintains the same metric value over time.

In binary classification, researchers often rely on the AUC-ROC (Area Under the Receiver Operating Characteristic curve) to measure a classifier's performance [9, 6]. While the AUC-ROC has the advantage of being invariant to the class prior, many real-world applications, especially when data are imbalanced, favor precision-based metrics. The reason is that, by measuring false positives against the negative examples, which are in large majority, ROC gives them too little importance [5], even though false positives deteriorate user experience and waste human effort. Thus, ROC can give over-optimistic scores for a classifier that performs poorly in terms of precision, and this is particularly inconvenient when the class of interest is in the minority [10]. Metrics directly based on recall and precision, which give more weight to false positives, such as AUC-PR or the F1-score, have been shown to be much more relevant [12] and have gained in popularity in recent years [13]. That being said, these metrics are strongly tied to the class prior [2, 3].

A new definition of precision and recall as precision gain and recall gain has recently been proposed [7] to correct several drawbacks of AUC-PR. However, while the resulting AUC-PR Gain has some of the advantages of AUC-ROC, such as the validity of linear interpolation between points, it remains dependent on the class prior.

This study aims at providing (i) a precision-based metric to cope with problems where the class of interest is a slim minority and (ii) a metric independent of the prior, so as to monitor performance over time. The motivation is illustrated by a fraud detection use-case [4]. Figure 1 shows a situation where, over a given period of time, both the detection system's performance and the fraud ratio (i.e. the empirical prior π) decrease. As the metric used, AUC-PR, is dependent on the prior, it cannot tell whether the performance variation is only due to the fraud ratio or whether there are other factors (e.g. a drift in P(X|Y)).

Figure 1: Evolution of the fraudulent transaction ratio (π) and of the AUC-PR of the model over time.

To cope with the limitations of the widely used precision-based metrics, we introduce a calibrated version of them which excludes the impact of the class ratio. More precisely, our contributions are:

  1. A formulation, in section 3.1, of calibration for precision-based metrics. It estimates the value that the metrics would have if the class ratio in the test set were equal to a reference class ratio π₀ fixed by the user. We give a theoretical argument to show that it provides invariance to the class prior. We also provide a calibrated version of precision gain and recall gain.

  2. An empirical analysis on both synthetic and real-world data in section 3.2 to confirm our claims.

  3. A further analysis, in section 3.3, showing that calibrated metrics are able to assess the model’s performance and are easier to interpret.

  4. Large-scale experiments on 614 datasets from OpenML [16], in section 4, to (a) give more insight into the correlation between popular metrics by analyzing how they rank models, and (b) explore the links between the calibrated metrics and the regular ones and explain how they are impacted by the choice of π₀.

We emphasize that calibration not only solves the issue of dependence on the prior but also allows, through the parameter π₀, projecting the user into a different configuration (a different class ratio) and controlling what the metric will precisely reflect. These new properties have several practical interests (e.g. for development, reporting, analysis) and we discuss them in realistic use-cases in section 5.

2 Popular metrics for binary classification: advantages and limits

We consider a usual binary classification setting where a model has been trained and its performance is evaluated on a test dataset of N instances. y_i denotes the ground-truth label of the i-th instance and is equal to 1 (resp. 0) if the instance belongs to the positive (resp. negative) class. The model provides s_i, a score for the i-th instance to belong to the positive class. For a given threshold τ, the predicted label ŷ_i is 1 if s_i > τ and 0 otherwise.

The numbers of true positives (TP), true negatives (TN), false positives (FP) and false negatives (FN) are defined as follows:

TP = Σ_i 1(ŷ_i = 1 ∧ y_i = 1),  FP = Σ_i 1(ŷ_i = 1 ∧ y_i = 0),
TN = Σ_i 1(ŷ_i = 0 ∧ y_i = 0),  FN = Σ_i 1(ŷ_i = 0 ∧ y_i = 1),   (2)

where 1(·) yields 1 when its argument is true and 0 otherwise. Based on these statistics, one can compute relevant ratios such as the True Positive Rate (TPR), also referred to as the Recall (Rec), the False Positive Rate (FPR), also referred to as the Fall-out, and the Precision (Prec):

Rec = TPR = TP / (TP + FN),  FPR = FP / (FP + TN),  Prec = TP / (TP + FP).
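These definitions translate directly into code. The following sketch (helper names are ours, not from the paper) computes the counts and ratios for a given threshold:

```python
import numpy as np

def confusion_counts(scores, labels, tau):
    """Confusion counts at threshold tau: predict positive when score > tau."""
    pred = scores > tau
    pos = labels == 1
    tp = int(np.sum(pred & pos))
    fp = int(np.sum(pred & ~pos))
    tn = int(np.sum(~pred & ~pos))
    fn = int(np.sum(~pred & pos))
    return tp, fp, tn, fn

def tpr_fpr_precision(tp, fp, tn, fn):
    recall = tp / (tp + fn) if tp + fn else 0.0     # TPR (Rec)
    fpr = fp / (fp + tn) if fp + tn else 0.0        # Fall-out
    precision = tp / (tp + fp) if tp + fp else 0.0  # Prec
    return recall, fpr, precision
```

Sweeping tau over all observed scores yields the curves used by the AUC metrics discussed next.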

As these ratios are biased towards a specific type of error and can easily be manipulated with the threshold, more complex metrics have been proposed. In this paper, we discuss the most popular ones which have been widely adopted in binary classification: F1-Score, AUC-ROC, AUC-PR and AUC-PR Gain.

The F1-Score is the harmonic mean of Prec and Rec:

F1 = 2 · Prec · Rec / (Prec + Rec).   (3)
The three other metrics consider every threshold from the highest to the lowest. For each one, they compute TP, FP, TN and FN, then plot one ratio against another and compute the Area Under the Curve (Figure 2). AUC-ROC considers the Receiver Operating Characteristic curve where TPR is plotted against FPR. AUC-PR considers the Precision vs Recall curve. Finally, in AUC-PR Gain, the precision gain (PrecGain) is plotted against the recall gain (RecGain). They are defined in [7] as follows:

PrecGain = (Prec − π) / ((1 − π) · Prec)   (4)
RecGain = (Rec − π) / ((1 − π) · Rec)   (5)

where π is the proportion of positive examples, or positive class ratio (in this paper, we always consider that the positive class is the minority class). This allows PR Gain to enjoy many interesting properties of ROC analysis that the original PR analysis does not, such as the validity of linear interpolations, the existence of universal baselines and the interpretability of the area under its curve [7]. However, AUC-PR Gain can become limited in an extremely imbalanced setting. In particular, we can derive from (4) and (5) that both PrecGain and RecGain will often be close to 1 if π is close to 0. This is illustrated in Figure 2 with π in the order of 10⁻³.
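A quick computation with the gain formulas of [7] illustrates this saturation effect (function names are ours): for a small π, even a modest precision or recall already maps to a gain close to 1.

```python
def precision_gain(prec, pi):
    # PrecGain = (Prec - pi) / ((1 - pi) * Prec), as defined in [7]
    return (prec - pi) / ((1.0 - pi) * prec)

def recall_gain(rec, pi):
    # RecGain = (Rec - pi) / ((1 - pi) * Rec), as defined in [7]
    return (rec - pi) / ((1.0 - pi) * rec)

# Balanced case: a precision of 0.5 gives a gain of 0.
print(precision_gain(0.5, 0.5))
# Extremely imbalanced case: the same precision of 0.5 gives a gain near 1.
print(precision_gain(0.5, 0.001))
```

This is why, with π in the order of 10⁻³, most of the PR Gain curve is squeezed into the top-right corner.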

Figure 2: ROC, PR and PR Gain curves for the same model evaluated on an extremely imbalanced test set from a fraud detection application (π in the order of 10⁻³, top row) and on a balanced sample (π = 0.5, bottom row).

Each metric has its own raison d'être. For instance, we visually observe that only AUC-ROC is invariant to the positive class ratio. Indeed, FPR and Rec are both unrelated to the class ratio because they each focus on a single class, but this is not the case for Prec: its dependency on the positive class ratio is illustrated in Figure 3. Instances, represented as circles (resp. squares) if they belong to the positive (resp. negative) class, are ordered from left to right according to the score given by the model. A threshold is illustrated as a vertical line between the instances: those on the left (resp. right) are classified as positive (resp. negative). When comparing a case (i) with a given ratio π and a case (ii) where a randomly selected half of the positive examples has been removed, one can visually understand that both the recall and the false positive rate are the same but the precision is lower in the second case.

Figure 3: Illustration of the impact of π on the precision, the recall, and the false positive rate.

These different properties can become strengths or limitations depending on the context. As stated in the introduction, we will consider a motivating scenario where (i) data are imbalanced and the minority (positive) class is the one of interest and (ii) we monitor performance across time. In that case, we need a metric that considers precision rather than FPR and that is invariant to the prior. By definition, AUC-ROC does not involve precision whereas the other metrics do. But it is the only one invariant to the positive class ratio.

3 Calibrated Metrics

To obtain a metric that satisfies both properties from the last section, we modify the precision-based ones (AUC-PR, F1-Score and AUC-PR Gain) to make them independent of the positive class ratio π.

3.1 Calibration

The idea is to fix a reference ratio π₀ and to weight the counts of TP and FP in order to calibrate them to the values that they would have if π were equal to π₀. π₀ can be chosen arbitrarily (e.g. 0.5 for a balanced reference) but it is preferable to fix it according to the task at hand. We analyze the impact of π₀ in section 4 and describe simple guidelines for fixing it in section 5.

If the positive class ratio is π₀ instead of π, the ratio between negative and positive examples is multiplied by ((1 − π₀)/π₀) · (π/(1 − π)). In this case, we expect the ratio between false positives and true positives to be multiplied by the same factor. Therefore, we define the calibrated precision as follows:

Prec_c = TP / (TP + ((1 − π₀) π)/(π₀ (1 − π)) · FP).   (6)

Since (1 − π)/π = N⁻/N⁺ is the imbalance ratio, where N⁺ (resp. N⁻) is the number of positive (resp. negative) examples, we have Prec_c = TPR / (TPR + ((1 − π₀)/π₀) · FPR), which is independent of π.

Based on the calibrated precision, we can also define the calibrated F1-score, the calibrated PrecGain and the calibrated RecGain by replacing Prec by Prec_c and π by π₀ in equations (3), (4) and (5). Note that calibration does not change precision gain: the calibrated precision gain can be rewritten as 1 − FPR/TPR, which is equal to the regular precision gain. Also, the interesting properties of the recall gain were proved independently of the ratio π in [7], which means calibration preserves them.
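As a minimal sketch (function names are ours), the calibrated precision of equation (6) and the corresponding calibrated F1-score can be computed as follows; setting π₀ = π recovers the regular precision:

```python
def calibrated_precision(tp, fp, pi, pi0):
    """Prec_c: precision readjusted to the reference positive class ratio pi0.

    The FP count is reweighted by the factor by which the negative/positive
    ratio would change if the class ratio were pi0 instead of pi.
    """
    w = ((1.0 - pi0) / pi0) * (pi / (1.0 - pi))
    return tp / (tp + w * fp)

def calibrated_f1(tp, fp, fn, pi, pi0):
    """Calibrated F1: harmonic mean of Prec_c and the (unchanged) recall."""
    prec_c = calibrated_precision(tp, fp, pi, pi0)
    rec = tp / (tp + fn)
    return 2.0 * prec_c * rec / (prec_c + rec)
```

For example, with 100 positives and 900 negatives (π = 0.1), TP = 50 and FP = 90, calibrating to π₀ = 0.5 gives Prec_c = 0.5/(0.5 + 0.1) = 5/6; removing half the positives (TP = 25, π = 50/950) leaves Prec_c unchanged, illustrating the invariance argument above.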

3.2 Robustness to variations

In order to evaluate the robustness of the new metrics to variations in π, we create a synthetic dataset where the label y is drawn from a Bernoulli distribution with parameter π and the feature x is drawn from class-conditional Normal distributions:

x | (y = 0) ~ N(μ₀, σ²),  x | (y = 1) ~ N(μ₁, σ²).   (7)

We empirically study the behavior of the F1-score, AUC-PR, AUC-PR Gain and their calibrated versions on the optimal model defined in (1) as π decreases from 0.5 (balanced) to much smaller values. Figure 4 presents the results averaged over 30 runs with their confidence intervals. We observe that the impact of the class prior on the regular metrics is important. This can be a serious issue for applications where π sometimes varies by one order of magnitude from one day to another (see [4] for a real-world example), as it leads to a significant variation of the measured performance (see the gap between the AUC-PR values at the two ends of the π range) even if the model remains the same. On the contrary, the calibrated versions remain very robust to changes in the class prior, even for extreme values. Note that we experiment here with synthetic data in order to have strong control over the distribution/prior. In appendix A, we run a similar experiment on real-world data where we artificially change π in the test set with undersampling. The conclusions remain the same.

Figure 4: Evolution of AUC-PR, AUC-PR Gain, F1-score and their calibrated versions as π decreases. For every value of π, data points are generated from (7) — so that the observed class ratio is approximately equal to the Bernoulli parameter π — and the metrics are evaluated on the optimal model defined in (1). We arbitrarily set π₀ = 0.5 for the calibrated metrics. The curves are obtained by averaging results over 30 runs.
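The robustness experiment can be sketched as follows. The Gaussian parameters (μ₀ = 0, μ₁ = 2, σ = 1), the sample size and the helper names are illustrative assumptions, not the paper's exact settings; since the likelihood ratio (1) is monotone in x for these Gaussians, x itself serves as the optimal score.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample(pi, n, mu0=0.0, mu1=2.0, sigma=1.0):
    """Labels ~ Bernoulli(pi); features from class-conditional Normals, cf. (7)."""
    y = (rng.random(n) < pi).astype(int)
    x = rng.normal(np.where(y == 1, mu1, mu0), sigma)
    return x, y

def auc_pr(x, y, pi0=None):
    """Step-wise AUC-PR; if pi0 is given, the precision is calibrated to pi0."""
    order = np.argsort(-x)                 # sweep thresholds from high to low
    ys = y[order]
    tp = np.cumsum(ys)
    fp = np.cumsum(1 - ys)
    pi = y.mean()
    w = 1.0 if pi0 is None else ((1 - pi0) / pi0) * (pi / (1 - pi))
    prec = tp / np.maximum(tp + w * fp, 1e-12)
    rec = tp / tp[-1]
    # area via right-endpoint steps over the recall axis
    return float(np.sum(np.diff(np.concatenate(([0.0], rec))) * prec))

for pi in (0.5, 0.1, 0.01):
    x, y = sample(pi, 200_000)
    print(f"pi={pi}: AUC-PR={auc_pr(x, y):.3f}  calibrated={auc_pr(x, y, pi0=0.5):.3f}")
```

The regular AUC-PR degrades as π shrinks while the calibrated value stays roughly constant, mirroring Figure 4; note that passing pi0 equal to the observed ratio recovers the regular AUC-PR.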

Let us remark that for test datasets in which π = π₀, Prec_c is equal to the regular precision since the reweighting factor in (6) equals 1 (the intersection of the metrics in Figure 4 at π = π₀ reflects that). The new metrics essentially take the value that the original ones would have if the positive class ratio were equal to π₀. This is analyzed in depth in appendix B, where we compare the proposed calibration formula with the most promising proposal from the past [10]: a heuristic-based calibration. Their approach consists in randomly undersampling the test set so that the class ratio becomes π₀ and then computing the regular metrics. Because of the randomness, sampling can remove more hard examples than easy ones, so the performance can be over-estimated, and vice versa. To avoid that, they perform several runs and compute the mean performance. The appendix shows and explains why our formula directly computes a value equal to their estimation. Therefore, while similar in spirit, our proposal can be seen as an improvement: it provides a closed-form solution, is deterministic, and is less computationally expensive than their Monte-Carlo approximation.

3.3 Assessment of the model quality

Besides the robustness of the calibrated metrics to changes in π, we also want them to be sensitive to the quality of the model: if the latter decreases, regardless of the value of π, we expect all metrics, calibrated ones included, to decrease in value. Let us consider an experiment where we use the same synthetic dataset as defined in the previous section. However, instead of changing the value of π only, we change P(X|Y) to make the problem harder and harder and thus worsen the optimal model's performance. This can be done by reducing the distance between the two normal distributions in (7). As a distance, we consider the KL-divergence, which here boils down to (μ₁ − μ₀)²/(2σ²).

Figure 5: Evolution of AUC-PR, AUC-PR Gain, F1-score and their calibrated versions as the KL-divergence tends to 0 and as π randomly varies. This curve was obtained by averaging results over 30 runs.

Figure 5 shows how the values of the metrics evolve as the KL-divergence gets closer to zero. For each run, π is randomly chosen in a fixed interval. As expected, all metrics globally decrease as the problem gets harder. However, we can notice an important difference: the variations of the calibrated metrics are smooth compared to those of the original ones, which are affected by the random changes in π. In that sense, variations of the calibrated metrics across the different generated datasets are much easier to interpret.

4 Link with the original metrics

Calibrated metrics are robust to variations in π and improve the interpretability of the model's performance. But a question remains: are they still linked to the original metrics, and does π₀ play a role in that link? We tackle this question from the model selection point of view by empirically analyzing the correlation between the metrics in terms of model ordering. We use OpenML [16] to select the 614 supervised binary classification datasets on which at least 30 models have been evaluated with a 10-fold cross-validation. For each one, we randomly choose 30 models, fetch their predictions, and evaluate their performance with several metrics. This leaves us with 30 values per dataset for each metric. To analyze whether the metrics rank the models in the same order, we compute the Spearman rank correlation coefficient between them for each of the 614 problems. Most datasets have roughly balanced classes (see the cumulative distribution function in appendix C). We also run the same experiment on the subset of highly imbalanced datasets. A third experiment, on the 21 datasets with a moderately small positive class ratio, is available in appendix C.

The compared metrics are AUC-ROC, AUC-PR, AUC-PR Gain and the F1-score. The threshold for the latter is fixed on a holdout validation set. We also add the calibrated versions of the last three. In order to understand the impact of π₀ on the calibration, we use two different values: the arbitrary π₀ = 0.5 and a value chosen closer to the actual positive class ratios of the datasets (for the second experiment, where π is very small, we go further and pick a π₀ that remains much closer to π than 0.5 is). The obtained correlation matrices are shown in Figure 6. The individual entries correspond to the average Spearman correlation over all datasets between the row metric and the column metric.
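The per-dataset rank-correlation computation can be sketched with a minimal Spearman implementation (our own helpers, assuming no tied metric values):

```python
import numpy as np

def rank(v):
    """Ranks 0..n-1 of the values in v (assumes no ties)."""
    r = np.empty(len(v), dtype=float)
    r[np.argsort(v)] = np.arange(len(v))
    return r

def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    ra, rb = rank(np.asarray(a, float)), rank(np.asarray(b, float))
    ra -= ra.mean()
    rb -= rb.mean()
    return float(np.sum(ra * rb) / np.sqrt(np.sum(ra**2) * np.sum(rb**2)))
```

Given, for one dataset, the 30 models' scores under two metrics, spearman returns 1.0 when both metrics order the models identically and -1.0 when they fully disagree; averaging these coefficients over datasets yields one entry of the matrices in Figure 6.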

Figure 6: Spearman rank correlation matrices between 10 metrics over 30 models and 614 datasets for the left figure and 30 models and 4 highly imbalanced datasets for the right figure. The used supervised binary classification datasets and model predictions are all available in OpenML.

In general, we observe that metrics are less correlated when classes are imbalanced (right figure). We also note that the F1-score is more correlated with the PR-based metrics than with AUC-ROC, and that it agrees more with AUC-PR than with AUC-PR Gain. The latter two have a high correlation, especially in the balanced case (left matrix in Figure 6). Also in the balanced case, we can see that the metrics defined as areas under curves are more correlated with each other than with the threshold-sensitive classification metric F1-score.

Let us now analyze calibration. As expected, when π₀ = π, the calibrated metrics behave very closely to the original ones because the reweighting factor equals 1 and therefore Prec_c = Prec. In the balanced case (left), since π is close to π₀ = 0.5, the calibrated metrics with π₀ = 0.5 are also highly correlated with the original metrics. In the imbalanced case (right matrix of Figure 6), when π₀ is arbitrarily set to 0.5, the calibrated metrics seem to disagree with the original ones; in fact, they are even less correlated with the original metrics than with AUC-ROC. This can be explained by the relative weights given to FP and TP by each of these metrics. The original precision gives the same weight to a true positive as to a false positive, although false positives are (1 − π)/π times more likely to occur ((1 − π)/π is larger than 100 if π < 0.01). The calibrated precision with the arbitrary value π₀ = 0.5 boils down to TP/(TP + (π/(1 − π)) · FP) and thus gives a weight (1 − π)/π times smaller to false positives, which counterbalances their higher likelihood. ROC also implicitly gives less weight to FP because it is computed from FPR and TPR, which are linked to FP and TP through counts of very different sizes: FPR = FP/N⁻ and TPR = TP/N⁺, with N⁻ = ((1 − π)/π) · N⁺.

To conclude this analysis, we first emphasize that the choice of the metric is much less sensitive when datasets are rather balanced than in the extremely imbalanced case. Indeed, in the balanced case the least correlated metrics are the F1-score and AUC-ROC, with a correlation coefficient that remains high. For the imbalanced datasets, on the other hand, many metrics are uncorrelated, which means that most of the time they disagree on the best model. The choice of the metric is very important here, and our experiment reflects that it is a matter of how much weight we are willing to give to each type of error. With calibration, in order to preserve the nature of the original metrics, π₀ has to be fixed to a value close to π and not arbitrarily. π₀ can also be fixed to a different value if the user has another purpose.

5 Guidelines and use-cases

Calibration could benefit ML practitioners when comparing the performance of a single model on different datasets/time periods. Without being exhaustive, we give four use-cases where it is beneficial (the strategy for setting π₀ depends on the target use-case):

  • Model performance monitoring in an industrial context: in systems where performance is monitored over time with the F1-score, using calibration in addition to the regular metrics makes it easier to analyze the drift (i.e. distinguish between variations linked to π and variations linked to P(X|Y) — see Appendix D) and to design adapted solutions: either updating the threshold or completely retraining the model. To avoid denaturing the F1-score too much, π₀ can here be fixed based on expert knowledge (e.g. the average π in historical data).

  • Comparing the performance of a model on two populations: if the prior differs from one population to the other, the calibrated metric will evaluate each population against the same reference. It will provide a balanced point of view and make the analysis richer. This might be useful to study fairness [1], for instance. Here, π₀ can be chosen as the prior of one of the two populations.

  • Establishing agreements with clients: the positive label ratio can vary extremely on particular events (e.g. fraudster attacks), which significantly affects the measured F1-score (see Figure 4), making it difficult to guarantee a standard to a client. Calibration helps provide a more controlled guarantee such as a minimal level of precision at a given reference ratio π₀. Here π₀ can be a norm fixed by both parties based on expert knowledge.

  • Anticipating the deployment of an algorithm in a real-world system: if the prior in the data collected for offline development differs from reality, non-calibrated metrics measured during development might give pessimistic or optimistic estimations of the post-deployment performance. In particular, this can be harmful when industry has constraints on the performance (e.g. precision has to be strictly above 1/2). Calibration with, for instance, a value of π₀ equal to the minimal prior envisioned for the application at hand allows anticipating the worst-case scenario.

6 Conclusion

In this paper, we provided a formula and guidelines to calibrate metrics in order to make the variation of their value across different datasets more interpretable. As opposed to the regular metrics, the new ones are shown to be robust to random variation in the class prior. This property can be useful to both academic and industrial applications. On the one hand, in a research study that involves incremental learning on streams, having a metric which is robust to virtual concept drift [17] can help to better focus on other sources of drift and design adapted solutions. On the other hand, in an industrial context, calibrated metrics will give more stable results, help prevent false conclusions about a deployed model, and allow for more interpretable performance indicators that are easier to guarantee and report. Calibration relies on a reference positive class ratio π₀ and transforms the metric such that its value is the one that would be obtained if π in the evaluated test set were equal to π₀. π₀ has a simple interpretation and should be chosen with caution. If one's goal is to preserve the nature of the original metric, π₀ has to be close to the real π. But choosing a different π₀ also allows, if needed, situating the algorithm's performance in a reference situation. Because π₀ directly controls the importance given to false positives relative to true positives, calibration draws an interesting perspective for future work: investigating a generalized metric in which the cost associated to each type of error (FP, FN) appears as a parameter and from which we can retrieve the definitions of existing popular metrics such as TPR, FPR or Precision.


  • [1] S. Barocas, M. Hardt, and A. Narayanan (2017) Fairness in machine learning. NIPS Tutorial. Cited by: 2nd item.
  • [2] P. Branco, L. Torgo, and R. P. Ribeiro (2016) A survey of predictive modeling on imbalanced domains. ACM Computing Surveys (CSUR) 49 (2), pp. 31. Cited by: §1.
  • [3] D. Brzezinski, J. Stefanowski, R. Susmaga, and I. Szczech (2019) On the dynamics of classification measures for imbalanced and streaming data. IEEE Transactions on Neural Networks and Learning Systems. Cited by: §1, §1.
  • [4] A. Dal Pozzolo, G. Boracchi, O. Caelen, C. Alippi, and G. Bontempi (2018) Credit card fraud detection: a realistic modeling and a novel learning strategy. IEEE transactions on neural networks and learning systems 29 (8), pp. 3784–3797. Cited by: §1, §3.2.
  • [5] J. Davis and M. Goadrich (2006) The relationship between precision-recall and roc curves. In Proceedings of the 23rd international conference on Machine learning, pp. 233–240. Cited by: §1.
  • [6] T. Fawcett (2006) An introduction to roc analysis. Pattern recognition letters 27 (8), pp. 861–874. Cited by: §1.
  • [7] P. Flach and M. Kull (2015) Precision-recall-gain curves: pr analysis done right. In Advances in Neural Information Processing Systems, pp. 838–846. Cited by: §1, §2, §2, §3.1.
  • [8] V. Garcıa, J. S. Sánchez, and R. A. Mollineda (2012) On the suitability of numerical performance measures for class imbalance problems. In International Conference in Pattern Recognition Applications and Methods, pp. 310–313. Cited by: §1.
  • [9] J. A. Hanley and B. J. McNeil (1982) The meaning and use of the area under a receiver operating characteristic (roc) curve.. Radiology 143 (1), pp. 29–36. Cited by: §1.
  • [10] L. A. Jeni, J. F. Cohn, and F. De La Torre (2013) Facing imbalanced data–recommendations for the use of performance metrics. In 2013 Humaine Association Conference on Affective Computing and Intelligent Interaction, pp. 245–251. Cited by: Appendix B, Appendix B, §1, §3.2.
  • [11] J. Neyman and E. S. Pearson (1933) IX. on the problem of the most efficient tests of statistical hypotheses. Philosophical Transactions of the Royal Society of London. Series A, Containing Papers of a Mathematical or Physical Character 231 (694-706), pp. 289–337. Cited by: §1.
  • [12] T. Saito and M. Rehmsmeier (2015) The precision-recall plot is more informative than the roc plot when evaluating binary classifiers on imbalanced datasets. PloS one 10 (3), pp. e0118432. Cited by: §1.
  • [13] M. S. Sajjadi, O. Bachem, M. Lucic, O. Bousquet, and S. Gelly (2018) Assessing generative models via precision and recall. In Advances in Neural Information Processing Systems, pp. 5228–5237. Cited by: §1.
  • [14] G. Santafe, I. Inza, and J. A. Lozano (2015) Dealing with the evaluation of supervised classification algorithms. Artificial Intelligence Review 44 (4), pp. 467–508. Cited by: §1.
  • [15] N. Tatbul, T. J. Lee, S. Zdonik, M. Alam, and J. Gottschlich (2018) Precision and recall for time series. In Advances in Neural Information Processing Systems, pp. 1920–1930. Cited by: §1.
  • [16] J. Vanschoren, J. N. Van Rijn, B. Bischl, and L. Torgo (2014) OpenML: networked science in machine learning. ACM SIGKDD Explorations Newsletter 15 (2), pp. 49–60. Cited by: item 4, §4.
  • [17] G. Widmer and M. Kubat (1993) Effective learning in dynamic environments by explicit context tracking. In European Conference on Machine Learning, pp. 227–243. Cited by: §6.
  • [18] Y. Yan, T. Yang, Y. Yang, and J. Chen (2017) A framework of online learning with imbalanced streaming data. In Thirty-First AAAI Conference on Artificial Intelligence, Cited by: §1.

Appendix A Robustness to variations on real-world data

In this appendix, we study again the behavior of the F1-score, AUC-PR, AUC-PR Gain and their calibrated versions as π varies, but on real-world data instead of synthetic data. We arbitrarily set π₀ = 0.5 for the calibrated metrics. The experiment is carried out on the highly unbalanced (π ≈ 0.002) credit card fraud detection dataset available on Kaggle, as follows:

  1. Data are split into train/test sets.

  2. A logistic regression model is trained with scikit-learn default parameters.

  3. The model makes predictions on the test set and the metrics are evaluated. This gives us the reference performance at the original ratio π.

  4. Next, π is artificially increased in the test set by randomly undersampling the majority class. For each sampled test set, we use the same model for predictions and evaluate the metrics. We perform 1000 runs to reduce the variance of our estimations.
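Step 4's majority-class undersampling can be sketched as follows (the helper name is ours):

```python
import numpy as np

def undersample_majority(x, y, target_pi, rng):
    """Randomly drop majority-class (y == 0) rows until P(y = 1) = target_pi.

    Keeps all positives and a random subset of negatives whose size is chosen
    so that the resulting positive class ratio equals target_pi.
    """
    pos = np.flatnonzero(y == 1)
    neg = np.flatnonzero(y == 0)
    n_neg = int(round(len(pos) * (1 - target_pi) / target_pi))
    keep = np.concatenate([pos, rng.choice(neg, size=n_neg, replace=False)])
    return x[keep], y[keep]
```

Evaluating the metrics on many such resampled test sets, for a grid of target ratios, and averaging over runs produces curves like those of Figure 7.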

Figure 7 displays the evolution of the metrics as varies.

Figure 7: Evolution of AUC-PR, AUC-PR Gain, F1-score and their calibrated versions on real-world data as π varies. This curve was obtained by averaging results over 1000 runs.

The conclusions are the same as with synthetic data: the calibrated metrics are robust to changes in π, even for extreme values.

Appendix B Comparison between the proposed calibrated metrics and a previously published heuristic for calibration

In [10], the authors propose a heuristic-based calibration which consists in randomly undersampling the test set so that the class ratio becomes π₀ and then computing the regular metrics. Because of the randomness, sampling can remove more hard examples than easy ones, so the performance can be over-estimated, and vice versa. To avoid that, they perform several runs and compute the mean value of the metrics. In this appendix, we experimentally compare the results obtained with our calibration formula and with their calibration heuristic. We use the same train/test data and model as in Appendix A. Figure 8 displays the AUC-PR on the test set calibrated with our formula (blue dots) and with the heuristic from [10] (red line) at several reference ratios π₀. The red shadow represents the standard deviation over 1000 runs when applying the heuristic.

Figure 8: Comparison between heuristic-based calibration and closed-form calibration applied to AUC-PR on real-world data

We can observe that our formula and the heuristic provide the same value, and this can be confirmed theoretically (Proposition 1). This is an important result as it confirms that the formula proposed in the paper really plays the role of calibrating the metric to the value that it would have if π were equal to π₀.

Note that the closed-form calibration can be seen as an improvement of the heuristic because it directly provides the targeted value. It is deterministic and computes the metric a single time, which is therefore roughly K times faster (where K is the number of runs in the heuristic).

Proposition 1.

Let $\hat{y}$ be the predictions of a classifier on a test set of $n$ samples and $y$ the ground truth label vector with $n_+$ non-zero values. Let $(\hat{y}', y')$ be the prediction and ground truth vectors of a sampled test set where we keep all the negative examples and a random sample of positive examples so that the positive class ratio becomes $\pi_0$. If we denote by $P$ (resp. $P'$) and $R$ (resp. $R'$) the precision and recall of $(\hat{y}, y)$ (resp. $(\hat{y}', y')$), then:

$$\mathbb{E}[R'] = R \quad \text{and} \quad \mathbb{E}[P'] \approx \frac{TP}{TP + \frac{\pi(1-\pi_0)}{\pi_0(1-\pi)}\,FP},$$

i.e., the expected precision on the sampled test set is the calibrated precision.

Proof. Let us first introduce some notations:

  • $I_+$ (resp. $I'_+$): the set of indices of the positive examples in $y$ (resp. $y'$)

  • $S$: a random sample, of size $n'_+$, without replacement of $I_+$

  • $TP$, $FP$, $FN$, $\pi$ (resp. $TP'$, $FP'$, $FN'$, $\pi_0$): the number of true positives, the number of false positives, the number of false negatives and the positive class ratio in $(\hat{y}, y)$ (resp. in $(\hat{y}', y')$)

By definition, we have:

$$P' = \frac{TP'}{TP' + FP'} \tag{12}$$

and, since each positive example of $I_+$ is kept in $S$ with probability $\frac{n'_+}{n_+}$,

$$\mathbb{E}[TP'] = \frac{n'_+}{n_+}\,TP. \tag{13}$$

By injecting (13) in (12), we obtain:

$$\mathbb{E}[P'] \approx \frac{\frac{n'_+}{n_+}\,TP}{\frac{n'_+}{n_+}\,TP + FP'}.$$

Since only positive examples are sampled from $y$ to $y'$, the number of false positives remains unchanged, so $FP' = FP$. Moreover, if $n_-$ denotes the number of negative examples, the positive class ratios give $n_+ = \frac{\pi}{1-\pi}\,n_-$ and $n'_+ = \frac{\pi_0}{1-\pi_0}\,n_-$, so that:

$$\frac{n'_+}{n_+} = \frac{\pi_0(1-\pi)}{\pi(1-\pi_0)}.$$

Note that $R' = \frac{TP'}{n'_+}$, hence $\mathbb{E}[R'] = \frac{\mathbb{E}[TP']}{n'_+} = \frac{TP}{n_+} = R$.

Therefore, if $\pi_0 \leq \pi$, then:

$$\mathbb{E}[P'] \approx \frac{TP}{TP + \frac{\pi(1-\pi_0)}{\pi_0(1-\pi)}\,FP}.$$
Note: Proposition 1 reflects the case where we sample the positive class in the test set, so it explains why the heuristic in [10] provides results close to our calibration formula when the target reference ratio $\pi_0$ is lower than the original ratio $\pi$. If the target reference ratio is higher than $\pi$, the heuristic has to sample the negative class. In that case, we can show a similar property (with a similar proof) in which we simply replace the recall $R$ (resp. $R'$) with the false positive rate $FPR$ (resp. $FPR'$).

Appendix C Additional experiment with OpenML

Here we display the CDF graph of the minority class ratio in OpenML binary classification datasets (Figure 9).

Figure 9: CDF of the minority class ratio in OpenML binary classification datasets.

We also present a third experiment on OpenML. The protocol is the same as explained in the paper, except that we select the 21 datasets for which . The correlation matrix in Figure 10 presents the results.

Figure 10: Spearman rank correlation matrices between 10 metrics over 30 models and 21 datasets. The supervised binary classification datasets and model predictions used are all available on OpenML.
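Rank correlation matrices like the one in Figure 10 can be computed with a short sketch (our own code, not the experiment scripts; it assumes a `scores` array with one row per (model, dataset) pair, one column per metric, and no ties):

```python
import numpy as np

def spearman_matrix(scores):
    """Spearman rank correlation between metric columns (no ties assumed):
    rank each column, then take the Pearson correlation of the ranks."""
    ranks = scores.argsort(axis=0).argsort(axis=0).astype(float)
    return np.corrcoef(ranks, rowvar=False)
```

Because Spearman correlation only depends on rankings, it captures whether two metrics order the models the same way, regardless of their scales.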

Appendix D Real-world use-case of calibration: fraud detection systems

In this appendix, we consider a real dataset from a credit card fraud detection system which fits the scenario described in the paper. We reuse the example presented in Figure 1, where we observed that both the model’s performance in terms of AUC-PR and the fraud ratio decrease. This example is from a private dataset similar to the credit card fraud detection dataset available on Kaggle. Let us take a look at the calibrated metric. To set $\pi_0$, we recommend avoiding an arbitrary value, in order to preserve the behavior of precision. $\pi_0$ has to be fixed to a value that makes sense for the application at hand. A straightforward strategy is to fix it with expert knowledge, such as the average proportion of fraudulent transactions in historical data (as done in our case).
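The monitoring scenario can be illustrated with a simplified sketch (calibrated precision instead of AUC-PR for brevity; the per-period confusion counts and the function name are hypothetical, and the false-positive reweighting follows the calibration formula):

```python
def monitor_periods(periods, pi0):
    """For each period given as (tp, fp, fn, tn) counts, report the observed
    positive ratio pi, the raw precision, and the calibrated precision at the
    reference ratio pi0, so that prior drift is separated from model drift."""
    report = []
    for tp, fp, fn, tn in periods:
        pi = (tp + fn) / (tp + fp + fn + tn)   # observed fraud ratio
        raw = tp / (tp + fp)
        weight = (pi * (1 - pi0)) / (pi0 * (1 - pi))
        report.append((pi, raw, tp / (tp + weight * fp)))
    return report
```

When the observed ratio equals $\pi_0$, the two values coincide; when it drops below $\pi_0$, calibration raises the precision back to its reference-prior value, which is the kind of correction visible in Figure 11.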

Figure 11: On the left, the figure shows the performance of a model over time on the fraud detection task in terms of AUC-PR and calibrated AUC-PR. On the right, the figure presents the normalized difference between calibrated AUC-PR and AUC-PR against the value of $\pi$ in the test sets.

Figure 11 presents the comparison between AUC-PR and calibrated AUC-PR. On the left figure, we can see that the two metrics differ when the fraud ratio $\pi$ is far from the reference $\pi_0$. As expected, when $\pi$ is higher (resp. lower) than $\pi_0$, calibration reduces (resp. increases) the value of the precision (see how the difference between calibrated AUC-PR and AUC-PR correlates with the variation of $\pi$ in the right figure). That being said, although it could not be concluded with certainty from the original metric, the calibrated metric shows that the model is getting worse over time, independently of $\pi$. With this in mind, we can start making hypotheses on the reasons for such behavior and take proper actions to correct future predictions.