
On Mixup Training: Improved Calibration and Predictive Uncertainty for Deep Neural Networks

by Sunil Thulasidasan, et al.

Mixup (Zhang et al., 2017) is a recently proposed method for training deep neural networks in which additional samples are generated during training by convexly combining random pairs of images and their associated labels. While simple to implement, mixup has been shown to be a surprisingly effective method of data augmentation for image classification: DNNs trained with mixup show noticeable gains in classification performance on a number of image classification benchmarks. In this work, we discuss a hitherto untouched aspect of mixup training: the calibration and predictive uncertainty of models trained with mixup. We find that DNNs trained with mixup are significantly better calibrated than DNNs trained in the regular fashion; that is, their predicted softmax scores are much better indicators of the actual likelihood of a correct prediction. We confirm this across a number of image classification architectures and datasets, including large-scale datasets such as ImageNet. Additionally, we find that merely mixing the input features does not yield the same calibration benefit, and that the label smoothing inherent in mixup training plays a significant role in improving calibration. Finally, we observe that mixup-trained DNNs are less prone to over-confident predictions on out-of-distribution and random-noise data. We conclude that the overconfidence typically seen in neural networks, even on in-distribution data, is likely a consequence of training with hard labels, and we suggest that mixup training be employed for classification tasks where predictive uncertainty is a significant concern.
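The core operation described above, convexly combining random pairs of inputs and their one-hot labels, can be sketched in a few lines. The following PyTorch sketch is illustrative only; the function name, the Beta parameter default of 0.4, and sampling one mixing coefficient per batch are our assumptions, not details taken from the paper.

    import torch
    import torch.nn.functional as F

    def mixup_batch(x: torch.Tensor, y: torch.Tensor, num_classes: int, alpha: float = 0.4):
        """Convexly combine a batch with a shuffled copy of itself.

        Returns mixed inputs and mixed (soft) label targets:
            x_mix = lam * x + (1 - lam) * x[perm]
            y_mix = lam * onehot(y) + (1 - lam) * onehot(y[perm])
        """
        # Mixing coefficient drawn from Beta(alpha, alpha), one per batch.
        lam = torch.distributions.Beta(alpha, alpha).sample().item()
        perm = torch.randperm(x.size(0))
        x_mix = lam * x + (1.0 - lam) * x[perm]
        y_onehot = F.one_hot(y, num_classes).float()
        y_mix = lam * y_onehot + (1.0 - lam) * y_onehot[perm]
        return x_mix, y_mix

    # Training step: cross-entropy against the soft mixed labels, e.g.
    # loss = -(y_mix * F.log_softmax(model(x_mix), dim=1)).sum(dim=1).mean()

Because the targets y_mix are soft rather than one-hot, the loss never pushes the network toward full confidence on a single class, which is the label-smoothing effect the abstract credits for much of the calibration improvement.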
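The calibration claim, that softmax scores should track the actual likelihood of a correct prediction, is commonly quantified with the expected calibration error (ECE): predictions are binned by confidence, and the gap between each bin's average confidence and its accuracy is averaged, weighted by bin mass. A minimal NumPy sketch of that metric, with the 15-bin choice as an illustrative assumption rather than the paper's setting:

    import numpy as np

    def expected_calibration_error(probs: np.ndarray, labels: np.ndarray, n_bins: int = 15) -> float:
        """probs: (N, C) softmax outputs; labels: (N,) integer ground truth."""
        confidences = probs.max(axis=1)      # winning-class softmax score
        predictions = probs.argmax(axis=1)
        accuracies = (predictions == labels).astype(float)
        ece = 0.0
        bin_edges = np.linspace(0.0, 1.0, n_bins + 1)
        for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
            in_bin = (confidences > lo) & (confidences <= hi)
            if in_bin.any():
                # |accuracy - mean confidence| in the bin, weighted by bin mass
                ece += in_bin.mean() * abs(accuracies[in_bin].mean() - confidences[in_bin].mean())
        return ece

A perfectly calibrated model has ECE near zero; an overconfident model trained with hard labels typically shows confidences that exceed its per-bin accuracy.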



