AdaFocal: Calibration-aware Adaptive Focal Loss

11/21/2022
by Arindam Ghosh, et al.

Much recent work has been devoted to the problem of ensuring that a neural network's confidence scores match the true probability of being correct, i.e., the calibration problem. Of note, it was found that training with focal loss leads to better calibration than cross-entropy while achieving a similar level of accuracy <cit.>. This success stems from focal loss regularizing the entropy of the model's predictions (controlled by the parameter γ), thereby reining in the model's overconfidence. Further improvement is expected if γ is selected independently for each training sample (Sample-Dependent Focal Loss (FLSD-53) <cit.>). However, FLSD-53 is based on heuristics and does not generalize well. In this paper, we propose a calibration-aware adaptive focal loss called AdaFocal that utilizes the calibration properties of focal (and inverse-focal) loss and adaptively modifies γ_t for different groups of samples based on γ_{t-1} from the previous step and knowledge of the model's under/over-confidence on the validation set. We evaluate AdaFocal on various image-recognition tasks and one NLP task, covering a wide variety of network architectures, to confirm the improvement in calibration while achieving similar levels of accuracy. Additionally, we show that models trained with AdaFocal achieve a significant boost in out-of-distribution detection.
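The abstract does not give the exact update rule, but a minimal sketch of the idea might look like the following: a focal loss that accepts a per-sample (or per-bin) γ, plus an illustrative multiplicative update that raises γ for validation bins where the model is over-confident and lowers it where it is under-confident. The function names, the exponential form of the update, the `lam` scale, and the clamping range are assumptions made for illustration, not the paper's algorithm; the switch to an inverse-focal loss for strongly under-confident bins is omitted.

```python
import math
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma):
    """Focal loss with a per-sample focusing parameter gamma.

    logits: (N, C) unnormalized scores; targets: (N,) class indices;
    gamma: scalar or (N,) tensor, e.g. one value per confidence bin.
    """
    log_p = F.log_softmax(logits, dim=-1)
    log_pt = log_p.gather(1, targets.unsqueeze(1)).squeeze(1)  # log-prob of the true class
    pt = log_pt.exp()
    return ((1.0 - pt) ** gamma * (-log_pt)).mean()

def update_gamma(gamma_prev, val_confidence, val_accuracy,
                 lam=1.0, gamma_min=1e-2, gamma_max=20.0):
    """Illustrative multiplicative update of gamma for one validation bin.

    Over-confidence (confidence > accuracy) increases gamma, strengthening the
    focal regularization; under-confidence decreases it. The exponential form,
    lam, and the clamping range are assumptions, not the authors' exact rule.
    """
    calib_error = val_confidence - val_accuracy  # > 0 means over-confident
    return min(max(gamma_prev * math.exp(lam * calib_error), gamma_min), gamma_max)
```

In such a scheme, one would periodically measure confidence and accuracy per bin on the validation set and call update_gamma for each bin before continuing training, so that γ_t tracks the model's current miscalibration rather than following a fixed schedule.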


