The Effect of Class Imbalance on Precision-Recall Curves

In this note I study how the precision of a classifier depends on the ratio r of positive to negative cases in the test set, as well as on the classifier's true and false positive rates. This relationship allows prediction of how the precision-recall curve will change with r, which seems not to be well known. It also allows prediction of how F_β and the Precision Gain and Recall Gain measures of Flach and Kull (2015) vary with r.
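
The dependence described in the abstract can be illustrated from the standard definitions alone. With P positives and N negatives, TP = TPR·P and FP = FPR·N, so precision = TP / (TP + FP) = r·TPR / (r·TPR + FPR), where r = P/N. The sketch below is a minimal illustration under that assumption. The function names `precision_from_rates` and `f_beta` are hypothetical, and the note itself may use different notation or a different parameterisation.

```python
def precision_from_rates(tpr: float, fpr: float, r: float) -> float:
    """Precision implied by a fixed operating point (tpr, fpr) and class ratio r.

    r is the ratio of positive to negative cases in the test set (P / N).
    Derived from precision = TP / (TP + FP) with TP = tpr * P and FP = fpr * N.
    """
    denom = r * tpr + fpr
    if denom == 0:
        # No case is predicted positive, so precision is undefined.
        raise ValueError("precision is undefined when r * tpr + fpr == 0")
    return r * tpr / denom


def f_beta(precision: float, recall: float, beta: float = 1.0) -> float:
    """Standard F_beta score from precision and recall (recall equals TPR)."""
    b2 = beta ** 2
    denom = b2 * precision + recall
    if denom == 0:
        return 0.0
    return (1 + b2) * precision * recall / denom


def pr_point(tpr: float, fpr: float, r: float) -> tuple[float, float]:
    """Map one ROC operating point to a (recall, precision) point at ratio r."""
    return tpr, precision_from_rates(tpr, fpr, r)
```

Holding the true and false positive rates fixed and lowering r lowers the precision at every recall level, so each point on the precision-recall curve moves downward while its recall coordinate is unchanged. Applying `pr_point` to each operating point of a ROC curve at two values of r gives a quick way to compare the resulting curves.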


