Novel and Efficient Approximations for Zero-One Loss of Linear Classifiers

02/28/2019
by   Hiva Ghanbari, et al.

The predictive quality of machine learning models is typically measured in terms of their (approximate) expected prediction accuracy or the so-called Area Under the Curve (AUC). Maximizing these measures, or equivalently minimizing their complements, the expected error and the expected ranking loss, is the goal of supervised learning. However, when models are constructed by means of empirical risk minimization (ERM), surrogate functions such as the logistic loss or hinge loss are optimized instead. In this work, we show that in the case of linear predictors, the expected error and the expected ranking loss can be effectively approximated by smooth functions whose closed-form expressions, and those of their first (and second) order derivatives, depend only on the first and second moments of the data distribution, which can be precomputed. Hence, the complexity of an optimization algorithm applied to these functions does not depend on the size of the training data. These approximation functions are derived under the assumption that the output of the linear classifier for a given data set has an approximately normal distribution. We argue that this assumption is significantly weaker than a Gaussian assumption on the data itself, and we support this claim by demonstrating that our new approximation is quite accurate on data sets that are not necessarily Gaussian. We present computational results showing that our proposed approximations and related optimization algorithms can produce linear classifiers with test accuracy or AUC similar to or better than those obtained using state-of-the-art approaches, in a fraction of the time.
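To make the idea concrete, here is a minimal sketch of the kind of moment-based approximation the abstract describes, for the expected zero-one loss of a linear classifier x ↦ sign(w·x + b). It assumes binary classes whose scores w·x + b are approximately normal, so each class-conditional error becomes a Gaussian CDF of precomputed class moments. The function names and the equal-prior default are illustrative, not taken from the paper.

```python
import numpy as np
from scipy.stats import norm

def moments(X):
    """First and second moments of a class sample; precomputed once."""
    return X.mean(axis=0), np.cov(X, rowvar=False)

def approx_zero_one_loss(w, b, mom_pos, mom_neg, prior_pos=0.5):
    """Smooth approximation of the expected 0-1 loss of x -> sign(w.x + b).

    Assumes the score w.x + b is approximately normal within each class,
    so each class-conditional error is a normal CDF evaluated at the
    standardized class-mean score. Cost is independent of the number of
    training points once the moments are cached.
    """
    (mu_p, S_p), (mu_n, S_n) = mom_pos, mom_neg
    sd_p = np.sqrt(w @ S_p @ w)                # std of scores, class +1
    sd_n = np.sqrt(w @ S_n @ w)                # std of scores, class -1
    err_p = norm.cdf(-(w @ mu_p + b) / sd_p)   # P(w.x + b < 0 | y = +1)
    err_n = norm.cdf((w @ mu_n + b) / sd_n)    # P(w.x + b > 0 | y = -1)
    return prior_pos * err_p + (1 - prior_pos) * err_n
```

Because the expression is a smooth function of w and b, its gradient (and Hessian) are available in closed form from the same cached moments, which is what makes per-iteration cost independent of the training-set size.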

Related research:

02/07/2018 - Directly and Efficiently Optimizing Prediction Error and AUC of Linear Classifiers
09/30/2020 - First-order Optimization for Superquantile-based Supervised Learning
04/16/2018 - A Univariate Bound of Area Under ROC
08/23/2019 - Bayesian Receiver Operating Characteristic Metric for Linear Classifiers
11/07/2022 - Highly over-parameterized classifiers generalize since bad solutions are rare
04/14/2023 - Performative Prediction with Neural Networks
01/29/2022 - A Stochastic Bundle Method for Interpolating Networks
