Regularization via Adaptive Pairwise Label Smoothing

12/02/2020
by   Hongyu Guo, et al.

Label Smoothing (LS) is an effective regularizer for improving the generalization of state-of-the-art deep models. For each training sample, LS smooths the one-hot encoded training signal by redistributing part of its probability mass over the non-ground-truth classes, which penalizes the network for generating overconfident output distributions. This paper introduces a novel label smoothing technique called Pairwise Label Smoothing (PLS). PLS takes a pair of samples as input. Smoothing with a pair of ground-truth labels enables PLS to preserve the relative distance between the two truth labels while further softening the distance between the truth labels and the other targets, resulting in models that produce much less confident predictions than LS. Also, unlike current LS methods, which typically require finding a global smoothing distribution mass through a cross-validation search, PLS automatically learns the distribution mass for each input pair during training. We empirically show that PLS significantly outperforms LS and the baseline models, achieving up to a 30% relative reduction in classification error. We also show visually that, while achieving such accuracy gains, PLS tends to produce very low winning softmax scores.
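To make the contrast between the two kinds of training target concrete, here is a minimal NumPy sketch. The first function is standard uniform label smoothing. The second is only an illustration of a pairwise target and is not taken from the paper: it assumes the two ground-truth classes differ and share the retained mass equally, that the smoothed mass is spread uniformly over the remaining classes, and that `eps` is a fixed number, whereas the abstract says PLS learns the distribution mass for each input pair. The names `smooth_labels`, `pairwise_smooth_labels`, and `eps` are hypothetical.

```python
import numpy as np


def smooth_labels(label, num_classes, eps=0.1):
    """Uniform label smoothing: move eps of the mass off the true class
    and spread it evenly over the other num_classes - 1 classes."""
    target = np.full(num_classes, eps / (num_classes - 1))
    target[label] = 1.0 - eps
    return target


def pairwise_smooth_labels(label_a, label_b, num_classes, eps=0.1):
    """Illustrative pairwise target (an assumption, not the paper's exact rule).

    The two distinct true classes keep equal mass (1 - eps) / 2 each, so
    their relative distance is preserved; eps is spread evenly over the
    remaining num_classes - 2 classes.
    """
    if label_a == label_b:
        raise ValueError("this sketch assumes two different ground-truth labels")
    if num_classes < 3:
        raise ValueError("need at least one class besides the two true labels")
    target = np.full(num_classes, eps / (num_classes - 2))
    target[[label_a, label_b]] = (1.0 - eps) / 2.0
    return target


ls_target = smooth_labels(2, num_classes=5, eps=0.1)
pls_target = pairwise_smooth_labels(0, 3, num_classes=5, eps=0.2)
print(ls_target)
print(pls_target)
```

Both targets are valid probability distributions. In the pairwise sketch the largest target entry is at most 0.5, which is consistent with the abstract's observation that PLS-trained models tend to produce low winning softmax scores.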

research
06/26/2021

Midpoint Regularization: from High Uncertainty Training to Conservative Classification

Label Smoothing (LS) improves model generalization through penalizing mo...
research
12/15/2020

Weakly Supervised Label Smoothing

We study Label Smoothing (LS), a widely used regularization technique, i...
research
01/07/2020

Regularization via Structural Label Smoothing

Regularization is an effective way to promote the generalization perform...
research
01/29/2023

Confidence-Aware Calibration and Scoring Functions for Curriculum Learning

Despite the great success of state-of-the-art deep neural networks, seve...
research
03/29/2022

Improving Generalization of Deep Neural Network Acoustic Models with Length Perturbation and N-best Based Label Smoothing

We introduce two techniques, length perturbation and n-best based label ...
research
06/25/2020

Epoch-evolving Gaussian Process Guided Learning

In this paper, we propose a novel learning scheme called epoch-evolving ...
research
07/23/2021

Similarity Based Label Smoothing For Dialogue Generation

Generative neural conversational systems are generally trained with the ...