Multi-Sample ζ-mixup: Richer, More Realistic Synthetic Samples from a p-Series Interpolant

04/07/2022
by Kumar Abhishek, et al.

Modern deep learning training procedures rely on model regularization techniques such as data augmentation, which generates training samples that increase the diversity of the data and the richness of the label information. A popular recent method, mixup, generates new samples from convex combinations of pairs of original samples. However, as we show in our experiments, mixup can produce undesirable synthetic samples that lie off the data manifold and can carry incorrect labels. We propose ζ-mixup, a generalization of mixup with provably and demonstrably desirable properties. It uses a p-series interpolant to form convex combinations of N ≥ 2 samples, leading to more realistic and diverse outputs that incorporate information from N original samples. We show that, compared to mixup, ζ-mixup better preserves the intrinsic dimensionality of the original datasets, which is a desirable property for training generalizable models. Furthermore, we show that our implementation of ζ-mixup is faster than mixup, and extensive evaluation on controlled synthetic datasets and 24 real-world natural and medical image classification datasets shows that ζ-mixup outperforms mixup and traditional data augmentation techniques.
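The idea of mixing N ≥ 2 samples with p-series weights can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions made for illustration: that the weights are the normalized terms i^(-γ) of a p-series, that each synthetic sample assigns those weights to the originals through a fresh random permutation, and that labels are one-hot vectors mixed with the same weights. The names `p_series_weights` and `zeta_mixup_sketch` and the default `gamma=2.8` are hypothetical.

```python
import random


def p_series_weights(n, gamma):
    """Normalized p-series weights: w_i proportional to i**(-gamma), i = 1..n.

    Assumption: this normalized form is one plausible reading of
    "p-series interpolant"; the abstract does not spell it out.
    """
    raw = [i ** (-gamma) for i in range(1, n + 1)]
    total = sum(raw)
    return [r / total for r in raw]


def zeta_mixup_sketch(samples, labels, gamma=2.8, rng=None):
    """Return N synthetic (sample, label) pairs from N originals.

    samples: list of N feature vectors (lists of floats, equal length)
    labels:  list of N one-hot label vectors (lists of floats, equal length)
    gamma:   hypothetical exponent; larger values concentrate the weight
             on a single original sample
    """
    rng = rng or random.Random()
    n = len(samples)
    weights = p_series_weights(n, gamma)
    mixed_x, mixed_y = [], []
    for _ in range(n):
        # A fresh permutation decides which original gets which weight.
        order = list(range(n))
        rng.shuffle(order)
        x = [
            sum(w * samples[j][d] for w, j in zip(weights, order))
            for d in range(len(samples[0]))
        ]
        y = [
            sum(w * labels[j][c] for w, j in zip(weights, order))
            for c in range(len(labels[0]))
        ]
        mixed_x.append(x)
        mixed_y.append(y)
    return mixed_x, mixed_y
```

Because the weights are non-negative and sum to one, every output is a convex combination of the inputs, so each mixed label remains a valid probability vector. Under these assumptions, N = 2 gives a pairwise convex combination with fixed rather than randomly drawn weights.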


research
01/09/2018

Data Augmentation by Pairing Samples for Images Classification

Data augmentation is a widely used technique in many machine learning ta...
research
03/24/2023

Towards Diverse and Coherent Augmentation for Time-Series Forecasting

Time-series data augmentation mitigates the issue of insufficient traini...
research
05/25/2022

ReSmooth: Detecting and Utilizing OOD Samples when Training with Data Augmentation

Data augmentation (DA) is a widely used technique for enhancing the trai...
research
07/25/2022

Efficient Classification with Counterfactual Reasoning and Active Learning

Data augmentation is one of the most successful techniques to improve th...
research
08/06/2021

Ensemble Augmentation for Deep Neural Networks Using 1-D Time Series Vibration Data

Time-series data are one of the fundamental types of raw data representa...
research
06/27/2020

Stochastic Batch Augmentation with An Effective Distilled Dynamic Soft Label Regularizer

Data augmentation has been intensively used in training deep neural net...
research
02/16/2023

Defect Transfer GAN: Diverse Defect Synthesis for Data Augmentation

Data-hunger and data-imbalance are two major pitfalls in many deep learn...
