Population and Empirical PR Curves for Assessment of Ranking Algorithms

The ROC curve is widely used to assess the quality of prediction/classification/ranking algorithms, and its properties have been extensively studied. The precision-recall (PR) curve has become the de facto replacement for the ROC curve in the presence of imbalance, that is, when one class is far more likely than the other. Although the PR and ROC curves tend to be used interchangeably, they have some very different properties. Properties of the PR curve are the focus of this paper. We consider: (1) population PR curves, where complete distributional assumptions are specified for scores from both classes; and (2) empirical estimators of the PR curve, where we observe scores and make no distributional assumptions. These properties have direct consequences for how the PR curve should, and should not, be used. For example, the empirical PR curve is not consistent when scores in the class of primary interest come from discrete distributions. On the other hand, a normal approximation can fit quite well for points on the empirical PR curve computed from continuously defined scores, but convergence can be heavily influenced by the distributional setting, the amount of imbalance, and the point of interest on the PR curve.
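To make the notion of an empirical PR curve concrete, the following is a minimal sketch of one common way to compute it from observed scores and binary labels, with no distributional assumptions. It is not taken from the paper: the function name `empirical_pr_curve`, the convention that a point is called positive when its score is at or above the threshold, and the choice to collapse tied scores into a single curve point are all assumptions made for illustration.

```python
import numpy as np


def empirical_pr_curve(scores, labels):
    """Return (recall, precision) arrays, one entry per distinct score threshold.

    Assumed convention (not from the paper): an observation is predicted
    positive when its score is >= the threshold, and label 1 marks the
    class of primary interest.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    n_pos = labels.sum()
    if n_pos == 0:
        raise ValueError("at least one positive label is required")

    # Sort observations from highest to lowest score.
    order = np.argsort(-scores, kind="stable")
    s = scores[order]
    y = labels[order]

    # Running counts of true and false positives as the threshold is lowered.
    tp = np.cumsum(y)
    fp = np.cumsum(1 - y)

    # Tied scores cannot be separated by any threshold, so keep only the
    # last index of each run of equal scores.
    last = np.r_[np.nonzero(np.diff(s))[0], len(s) - 1]
    tp, fp = tp[last], fp[last]

    recall = tp / n_pos
    precision = tp / (tp + fp)
    return recall, precision


if __name__ == "__main__":
    # Hypothetical toy data: two positives and two negatives.
    r, p = empirical_pr_curve([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
    for ri, pi in zip(r, p):
        print(f"recall={ri:.2f} precision={pi:.2f}")
```

With discrete scores, many observations share the same value, so the curve produced this way has only as many points as there are distinct scores.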


