Evaluating the incremental value of a new model: Area under the ROC curve or under the PR curve

10/19/2020
by Qian M. Zhou, et al.

Incremental value (IncV) measures the performance improvement from an existing risk model to a new model. In this paper, we compare the IncV of the area under the receiver operating characteristic curve (IncV-AUC) and the IncV of the area under the precision-recall curve (IncV-AP). Since both are semi-proper scoring rules, we also compare them with a strictly proper scoring rule: the IncV of the scaled Brier score (IncV-sBrS). The comparisons are demonstrated via a numerical study under various event rates. The results show that the IncV-AP and the IncV-sBrS are highly consistent, but the IncV-AUC and the IncV-sBrS are negatively correlated at a low event rate. The IncV-AUC and the IncV-AP are the least consistent of the three pairs, and their differences become more pronounced as the event rate decreases. To investigate this phenomenon, we derive expressions for these two metrics. Both are weighted averages of the changes (from the existing model to the new one) in the separation of the risk score distributions between events and non-events. However, the IncV-AP assigns heavier weights to the changes in the higher-risk group, while the IncV-AUC weights the entire population equally. We further illustrate this point via a data example of two risk models for predicting acute ovarian failure. The new model has a slightly lower AUC but increases the AP by 48% [...] group, the IncV-AP is a more appropriate metric, especially when the event rate is low.
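To make the three incremental values concrete, here is a minimal sketch in plain Python. It is not the authors' code: the function names and the ten-subject toy data are hypothetical, and the scaled Brier score is assumed to be 1 - BrS / (π(1 - π)), with π the observed event rate. The toy scores are chosen so that the new model moves one event to the very top of the ranking and pushes the other one down.

```python
def auc(y, p):
    """Probability that a random event outscores a random non-event (ties count 1/2)."""
    pos = [s for s, t in zip(p, y) if t == 1]
    neg = [s for s, t in zip(p, y) if t == 0]
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y, p):
    """Mean precision at the rank of each event, scanning scores from high to low.

    Assumes no tied scores (ties would need an explicit convention).
    """
    ranked = sorted(zip(p, y), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for k, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / k
    return total / hits


def scaled_brier(y, p):
    """Assumed definition: 1 - BrS / BrS_null, where the null model predicts the event rate."""
    n = len(y)
    rate = sum(y) / n
    brs = sum((pi - yi) ** 2 for pi, yi in zip(p, y)) / n
    return 1 - brs / (rate * (1 - rate))


def incremental_values(y, p_old, p_new):
    """IncV of each metric: value under the new model minus value under the existing one."""
    metrics = {"AUC": auc, "AP": average_precision, "sBrS": scaled_brier}
    return {name: f(y, p_new) - f(y, p_old) for name, f in metrics.items()}


# Hypothetical toy data: 10 subjects, 2 events (event rate 0.2).
y = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
# Existing model ranks the two events 3rd and 4th.
p_old = [0.7, 0.6, 0.9, 0.8, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
# New model ranks one event 1st and the other 7th.
p_new = [0.95, 0.3, 0.9, 0.8, 0.7, 0.6, 0.5, 0.2, 0.1, 0.05]

inc = incremental_values(y, p_old, p_new)
for name, value in inc.items():
    print(f"IncV-{name}: {value:+.4f}")
```

On this toy data the AUC drops from 0.75 to 0.6875 while the AP rises from about 0.417 to about 0.643, so the IncV-AUC is negative and the IncV-AP is positive. This reflects the weighting described above: the AP rewards the improvement at the top of the ranking more heavily than the AUC does.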


