Improving selective classification performance of deep neural networks through post-hoc logit normalization and temperature scaling
This paper addresses the problem of selective classification for deep neural networks, where a model is allowed to abstain from low-confidence predictions to avoid potential errors. Specifically, we tackle the problem of optimizing the confidence estimator of a fixed classifier, aiming to enhance its misclassification detection performance, i.e., its ability to discriminate between correct and incorrect predictions by assigning higher confidence values to the correct ones. Previous work has found that different classifiers exhibit varying levels of misclassification detection performance, particularly when using the maximum softmax probability (MSP) as a measure of confidence. However, we argue that these findings are mainly due to a sub-optimal confidence estimator being used for each model. To overcome this issue, we propose a simple and efficient post-hoc confidence estimator, named p-NormSoftmax, which consists of transforming the logits through p-norm normalization and temperature scaling, followed by taking the MSP, where p and the temperature are optimized based on a hold-out set. This estimator can be easily applied on top of an already trained model and, in many cases, can significantly improve its selective classification performance. When applied to 84 pretrained ImageNet classifiers, our method yields an average improvement of 16% in the area under the risk-coverage curve (AURC). Furthermore, after applying p-NormSoftmax, we observe that these models exhibit approximately the same level of misclassification detection performance, implying that a model's selective classification performance is almost entirely determined by its accuracy at full coverage.
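As an illustration of the estimator described above, here is a minimal NumPy sketch: logits are normalized by their p-norm, scaled by a temperature, and the MSP of the result is taken as the confidence. The function names, the grid-search tuning loop, and the use of AUROC for misclassification detection as the hold-out selection metric are assumptions for illustration; the abstract only states that p and the temperature are optimized on a hold-out set.

```python
import numpy as np
from scipy.special import softmax
from sklearn.metrics import roc_auc_score


def p_norm_softmax_confidence(logits, p, temperature, eps=1e-12):
    """p-NormSoftmax-style confidence (sketch): normalize each logit vector
    by its p-norm, apply temperature scaling, then take the MSP."""
    norms = np.linalg.norm(logits, ord=p, axis=-1, keepdims=True)
    scaled = logits / (np.maximum(norms, eps) * temperature)
    return softmax(scaled, axis=-1).max(axis=-1)


def tune_on_holdout(logits, labels, p_grid, t_grid):
    """Hypothetical tuning loop: pick the (p, T) pair whose confidences best
    separate correct from incorrect predictions on the hold-out set,
    here scored by AUROC (an assumed choice of selection metric)."""
    correct = (logits.argmax(axis=-1) == labels).astype(int)
    return max(
        ((p, t) for p in p_grid for t in t_grid),
        key=lambda pt: roc_auc_score(
            correct, p_norm_softmax_confidence(logits, pt[0], pt[1])
        ),
    )


# Example usage on synthetic hold-out data:
rng = np.random.default_rng(0)
holdout_logits = rng.normal(size=(1000, 10))
holdout_labels = rng.integers(0, 10, size=1000)
p, T = tune_on_holdout(holdout_logits, holdout_labels,
                       p_grid=[1, 2, 3, 4, 5], t_grid=[0.5, 1.0, 2.0])
conf = p_norm_softmax_confidence(holdout_logits, p, T)
```

Note that this sketch uses a plain grid search for simplicity; because the estimator is post-hoc, tuning only requires the hold-out logits and labels, not the trained model itself.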