Performance Metrics for Probabilistic Ordinal Classifiers

09/15/2023
by Adrian Galdran, et al.

Ordinal classification models assign higher penalties to predictions that fall further from the true class. This makes them well suited to diagnostic tasks such as disease progression prediction or medical image grading. The consensus for assessing their categorical predictions is to use distance-sensitive metrics like the Quadratic-Weighted Kappa score or the Expected Cost. However, there has been little discussion of how to measure the performance of probabilistic predictions for ordinal classifiers. In conventional classification, common measures for probabilistic predictions are Proper Scoring Rules (PSR) like the Brier score, or Calibration Errors like the Expected Calibration Error (ECE), yet these are not optimal choices for ordinal classification. A PSR named the Ranked Probability Score (RPS), widely used in the forecasting field, is more suitable for this task, but it has received no attention in the image analysis community. This paper advocates the use of the RPS for image grading tasks. In addition, we demonstrate a counter-intuitive and questionable behavior of this score, and propose a simple fix for it. Comprehensive experiments on four large-scale biomedical image grading problems over three different datasets show that the RPS is a more suitable performance metric for probabilistic ordinal predictions. Code to reproduce our experiments can be found at https://github.com/agaldran/prob_ord_metrics.
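
To make the contrast between the Brier score and the RPS concrete, the following is a minimal sketch, not the authors' implementation (their code is at the repository linked above). It assumes the textbook definition of the RPS as the squared difference between the cumulative predicted distribution and the cumulative one-hot label, divided by K-1; that normalization is one common convention and may differ from the one used in the paper. The function names and the toy probability vectors are hypothetical, and the fix proposed in the paper is not reproduced here because the abstract does not describe it.

```python
import numpy as np


def brier_score(probs, label):
    """Multi-class Brier score for one sample: squared error against the one-hot label."""
    probs = np.asarray(probs, dtype=float)
    onehot = np.zeros_like(probs)
    onehot[label] = 1.0
    return float(np.sum((probs - onehot) ** 2))


def ranked_probability_score(probs, label):
    """RPS for one sample, assuming classes are ordered 0..K-1.

    Compares cumulative distributions, so probability mass placed far from
    the true class is penalized more. Dividing by K-1 is an assumed convention.
    """
    probs = np.asarray(probs, dtype=float)
    onehot = np.zeros_like(probs)
    onehot[label] = 1.0
    cdf_pred = np.cumsum(probs)
    cdf_true = np.cumsum(onehot)
    return float(np.sum((cdf_pred - cdf_true) ** 2) / (len(probs) - 1))


# Toy example with four ordered grades; the true grade is 0.
true_label = 0
near_miss = [0.1, 0.9, 0.0, 0.0]  # most mass on the adjacent grade
far_miss = [0.1, 0.0, 0.0, 0.9]   # most mass on the most distant grade

print("Brier:", brier_score(near_miss, true_label), brier_score(far_miss, true_label))
print("RPS:  ", ranked_probability_score(near_miss, true_label),
      ranked_probability_score(far_miss, true_label))
```

In this toy case the Brier score is the same for both predictions, because it ignores class order, while the RPS is roughly three times larger for the prediction that puts its mass on the most distant grade (approximately 0.81 versus 0.27).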


