The f-divergence and Loss Functions in ROC Curve
Given two data distributions and a test score function, the Receiver Operating Characteristic (ROC) curve shows how well the score separates the two distributions. However, can the ROC curve be used as a measure of discrepancy between two distributions? This paper shows that when the data likelihood ratio is used as the test score, the arc length of the ROC curve gives rise to a novel f-divergence measuring the difference between the two data distributions. Approximating this arc length using a variational objective and empirical samples leads to empirical risk minimization with previously unknown loss functions. We provide a Lagrangian dual objective and introduce kernel models into the estimation problem. We study the non-parametric convergence rate of this estimator and show that, under mild smoothness conditions on the real arctangent density ratio function, the rate of convergence is O_p(n^{-β/4}), where β ∈ (0,1] depends on the smoothness.
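To make the link between ROC arc length and an f-divergence concrete, here is a minimal numerical sketch. It is an illustration, not the paper's variational estimator. It relies on the standard fact that, under a likelihood-ratio score, the slope of the ROC curve equals the density ratio r = p/q, so the arc length can be written as the integral of sqrt(p(x)^2 + q(x)^2) dx, which equals E_q[sqrt(1 + r^2)]. The two unit-variance Gaussians, the function names, and the grid settings are all assumptions chosen for illustration.

```python
import math
import numpy as np


def gaussian_pdf(x, mean):
    """Density of a unit-variance Gaussian (illustrative choice of distribution)."""
    z = x - mean
    return np.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def gaussian_cdf(x, mean):
    """CDF of a unit-variance Gaussian, evaluated elementwise with math.erf."""
    return np.array([0.5 * (1.0 + math.erf((v - mean) / math.sqrt(2.0))) for v in x])


def _grid(mean_p, mean_q, num_points):
    # Cover both distributions well into their tails.
    lo = min(mean_p, mean_q) - 12.0
    hi = max(mean_p, mean_q) + 12.0
    return np.linspace(lo, hi, num_points)


def arc_length_from_densities(mean_p, mean_q, num_points=20001):
    """Trapezoid-rule estimate of the integral of sqrt(p^2 + q^2) dx."""
    x = _grid(mean_p, mean_q, num_points)
    integrand = np.sqrt(gaussian_pdf(x, mean_p) ** 2 + gaussian_pdf(x, mean_q) ** 2)
    dx = x[1] - x[0]
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1])) * dx)


def arc_length_from_roc(mean_p, mean_q, num_points=20001):
    """Polyline length of the exact ROC curve traced over a threshold grid.

    For equal-variance Gaussians the likelihood ratio is monotone in x,
    so thresholding x is equivalent to thresholding the likelihood ratio.
    """
    t = _grid(mean_p, mean_q, num_points)
    tpr = 1.0 - gaussian_cdf(t, mean_p)  # P-mass above the threshold
    fpr = 1.0 - gaussian_cdf(t, mean_q)  # Q-mass above the threshold
    return float(np.sum(np.hypot(np.diff(fpr), np.diff(tpr))))


if __name__ == "__main__":
    for gap in (0.0, 1.0, 3.0):
        a = arc_length_from_densities(gap, 0.0)
        b = arc_length_from_roc(gap, 0.0)
        print(f"mean gap {gap}: integral form {a:.6f}, ROC polyline {b:.6f}")
```

The arc length lies between sqrt(2) (identical distributions, where the ROC curve is the diagonal) and 2 (fully separated distributions, where the curve hugs the axes). Subtracting sqrt(2) gives a quantity that is zero when the distributions coincide, corresponding to the convex function f(t) = sqrt(1 + t^2) - sqrt(2); this normalization is an assumption made here and may differ from the paper's convention. Note that the staircase ROC curve built directly from finite samples always has length 2, which is one reason a direct plug-in estimate is not useful and an estimator such as the variational one described in the abstract is needed.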