Computational Efficient Approximations of the Concordance Probability in a Big Data Setting

05/21/2021
by   Robin Van Oirbeek, et al.

Performance measurement is an essential task once a statistical model has been built. The Area Under the receiver operating characteristic Curve (AUC) is the most popular measure for evaluating the quality of a binary classifier. In this case, the AUC is equal to the concordance probability, a frequently used measure of the discriminatory power of a model. Contrary to the AUC, the concordance probability can also be extended to the situation with a continuous response variable. Because of the staggering size of today's data sets, determining this discriminatory measure requires a very large number of costly computations and is hence very time consuming, particularly in the case of a continuous response variable. Therefore, we propose two estimation methods that calculate the concordance probability quickly and accurately and that can be applied in both the discrete and the continuous setting. Extensive simulation studies show the excellent performance and fast computing times of both estimators. Finally, experiments on two real-life data sets confirm the conclusions of the simulation studies.
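To make the quantity in the abstract concrete, the following is a minimal sketch of the exact, brute-force pairwise computation, not the fast estimators proposed in the paper. The function name `concordance_probability` is hypothetical, and two conventions are assumptions made here for illustration: pairs with equal observed responses are skipped, and tied predictions count as half a concordant pair (which makes the binary case coincide with the usual AUC).

```python
from itertools import combinations


def concordance_probability(y_true, y_pred):
    """Brute-force estimate over all pairs with different observed responses.

    Assumption: tied predictions count as 0.5; equal responses are skipped.
    The loop visits every pair, so the cost grows quadratically with the
    number of observations.
    """
    concordant = 0.0
    comparable = 0
    for (yi, pi), (yj, pj) in combinations(zip(y_true, y_pred), 2):
        if yi == yj:
            continue  # pairs with equal responses are not comparable
        comparable += 1
        # Order the pair so that p_hi belongs to the larger observed response.
        p_hi, p_lo = (pi, pj) if yi > yj else (pj, pi)
        if p_hi > p_lo:
            concordant += 1.0
        elif p_hi == p_lo:
            concordant += 0.5  # assumption: a tie counts as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Binary response: the result coincides with the AUC.
    print(concordance_probability([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
    # Continuous response: the same function applies unchanged.
    print(concordance_probability([1.0, 2.0, 3.0], [0.2, 0.1, 0.9]))
```

Because every pair of observations is compared, the running time is quadratic in the sample size, which illustrates why an exact computation becomes impractical on large data sets.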


research
11/14/2019

Concordance probability in a big data setting: application in non-life insurance

The concordance probability or C-index is a popular measure to capture t...
research
02/18/2020

A Distributionally Robust Area Under Curve Maximization Model

Area under ROC curve (AUC) is a widely used performance measure for clas...
research
06/19/2019

Model-free posterior inference on the area under the receiver operating characteristic curve

The area under the receiver operating characteristic curve (AUC) serves ...
research
10/16/2018

An empirical evaluation of imbalanced data strategies from a practitioner's point of view

This research tested the following well known strategies to deal with bi...
research
03/28/2022

AUC Maximization in the Era of Big Data and AI: A Survey

Area under the ROC curve, a.k.a. AUC, is a measure of choice for assessi...
research
09/04/2020

The Area Under the ROC Curve as a Measure of Clustering Quality

The Area Under the Receiver Operating Characteristics (ROC) Curve, r...
research
10/31/2019

Connecting population-level AUC and latent scale-invariant R^2 via Semiparametric Gaussian Copula and rank correlations

Area Under the Curve (AUC) is arguably the most popular measure of class...
