ReSmooth: Detecting and Utilizing OOD Samples when Training with Data Augmentation

05/25/2022
by   Chenyang Wang, et al.

Data augmentation (DA) is a widely used technique for enhancing the training of deep neural networks. Recent DA techniques that achieve state-of-the-art performance rely on a high diversity of augmented training samples. However, a highly diverse augmentation strategy usually introduces out-of-distribution (OOD) augmented samples, and these samples impair performance. To alleviate this issue, we propose ReSmooth, a framework that first detects OOD samples among the augmented samples and then leverages them. Specifically, we first fit a Gaussian mixture model to the loss distribution of both the original and the augmented samples and, based on the fit, split these samples into in-distribution (ID) samples and OOD samples. We then start a new training run in which ID and OOD samples are assigned different smoothed labels. By treating ID and OOD samples unequally, we can make better use of the diverse augmented data. Further, we combine the ReSmooth framework with negative data augmentation strategies. By properly handling their intentionally created OOD samples, the classification performance of negative data augmentations is substantially improved. Experiments on several classification benchmarks show that ReSmooth can be easily extended to existing augmentation strategies (such as RandAugment, rotate, and jigsaw) and improves on them.
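The two stages described in the abstract (fit a mixture model to per-sample losses, then assign different smoothed labels to ID and OOD samples) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the plain EM routine, the 0.5 posterior threshold, and the smoothing strengths `eps_id` and `eps_ood` are all assumptions made for illustration, as is the choice to smooth OOD samples more strongly than ID samples.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def fit_two_component_gmm(losses, iters=50):
    """Plain EM for a 1-D, two-component Gaussian mixture (illustrative only)."""
    lo, hi = min(losses), max(losses)
    means = [lo, hi]                       # start one component at each extreme
    var0 = max((hi - lo) ** 2 / 4.0, 1e-6)
    variances = [var0, var0]
    weights = [0.5, 0.5]
    n = len(losses)
    for _ in range(iters):
        # E-step: responsibility of each component for each loss value
        resp = []
        for x in losses:
            p = [weights[k] * gaussian_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(p) or 1e-300
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weight, mean, and variance of each component
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue
            means[k] = sum(r[k] * x for r, x in zip(resp, losses)) / nk
            variances[k] = max(
                sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, losses)) / nk, 1e-6
            )
            weights[k] = nk / n
    return means, variances, weights

def split_id_ood(losses, threshold=0.5):
    """Flag a sample as OOD when the high-loss component explains it best.

    The threshold is a hypothetical choice, not a value from the paper.
    """
    means, variances, weights = fit_two_component_gmm(losses)
    high = 0 if means[0] > means[1] else 1  # component with the larger mean loss
    flags = []
    for x in losses:
        p = [weights[k] * gaussian_pdf(x, means[k], variances[k]) for k in range(2)]
        total = sum(p) or 1e-300
        flags.append(p[high] / total > threshold)
    return flags

def smooth_label(true_class, num_classes, eps):
    """Standard label smoothing: (1 - eps) * one-hot + eps / num_classes."""
    base = eps / num_classes
    return [base + (1.0 - eps if c == true_class else 0.0) for c in range(num_classes)]

def resmooth_targets(losses, classes, num_classes, eps_id=0.0, eps_ood=0.2):
    """Build per-sample targets; eps_id and eps_ood are assumed values."""
    flags = split_id_ood(losses)
    return [
        smooth_label(c, num_classes, eps_ood if is_ood else eps_id)
        for c, is_ood in zip(classes, flags)
    ]
```

In this sketch, `losses` would hold the per-sample training losses of the original and augmented samples from a first training run, and the targets returned by `resmooth_targets` would be used in the second training run. A pure-Python EM loop is used only to keep the example dependency-free; a library mixture model would serve the same purpose.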

research
03/17/2022

When Chosen Wisely, More Data Is What You Need: A Universal Sample-Efficient Strategy For Data Augmentation

Data Augmentation (DA) is known to improve the generalizability of deep ...
research
11/05/2021

Increasing Data Diversity with Iterative Sampling to Improve Performance

As a part of the Data-Centric AI Competition, we propose a data-centric ...
research
06/09/2022

OOD Augmentation May Be at Odds with Open-Set Recognition

Despite advances in image classification methods, detecting the samples ...
research
04/04/2023

PartMix: Regularization Strategy to Learn Part Discovery for Visible-Infrared Person Re-identification

Modern data augmentation using a mixture-based technique can regularize ...
research
04/07/2022

Multi-Sample ζ-mixup: Richer, More Realistic Synthetic Samples from a p-Series Interpolant

Modern deep learning training procedures rely on model regularization te...
research
06/09/2020

Towards Good Practices for Data Augmentation in GAN Training

Recent successes in Generative Adversarial Networks (GAN) have affirmed ...
research
07/18/2022

Rethinking Data Augmentation for Robust Visual Question Answering

Data Augmentation (DA) – generating extra training samples beyond origin...
