On Mixup Regularization

06/10/2020
by Luigi Carratino, et al.

Mixup is a data augmentation technique that creates new examples as convex combinations of training points and labels. This simple technique has been shown empirically to improve the accuracy of many state-of-the-art models across settings and applications, but the reasons behind this empirical success remain poorly understood. In this paper we take a substantial step toward explaining the theoretical foundations of Mixup by clarifying its regularization effects. We show that Mixup can be interpreted as a standard empirical risk minimization estimator subject to a combination of data transformation and random perturbation of the transformed data. We further show that these transformations and perturbations induce multiple known regularization schemes, including label smoothing and reduction of the Lipschitz constant of the estimator, and that these schemes interact synergistically, resulting in a self-calibrated and effective regularization effect that prevents overfitting and overconfident predictions. We illustrate our theoretical analysis with experiments that empirically support our conclusions.
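Since the paper analyzes the regularization effects of Mixup, a minimal sketch of the Mixup construction itself may help fix ideas. The snippet below follows the standard recipe introduced by Zhang et al. (2018): a mixing weight is drawn from a Beta distribution and used to form convex combinations of inputs and one-hot labels. The function name `mixup_batch` and the default `alpha=0.2` are illustrative choices, not taken from this paper.

```python
import numpy as np

def mixup_batch(x, y, alpha=0.2, rng=None):
    """Create Mixup examples as convex combinations of inputs and one-hot labels.

    A minimal sketch of the standard Mixup recipe; `alpha` controls the
    Beta(alpha, alpha) distribution the mixing weight is drawn from.
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)           # mixing coefficient in [0, 1]
    perm = rng.permutation(len(x))         # random pairing within the batch
    x_mix = lam * x + (1 - lam) * x[perm]  # convex combination of inputs
    y_mix = lam * y + (1 - lam) * y[perm]  # same combination of labels
    return x_mix, y_mix

# Usage: mix a toy batch of four 2-d points with one-hot labels.
x = np.array([[0., 0.], [1., 0.], [0., 1.], [1., 1.]])
y = np.eye(2)[[0, 0, 1, 1]]
x_mix, y_mix = mixup_batch(x, y, alpha=0.2)
```

Pairing each example with a random permutation of the same batch, and drawing a single mixing weight per batch, matches the common single-batch implementation of Mixup.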


Related research

10/24/2022  Provably Learning Diverse Features in Multi-View Data with Midpoint Mixup
Mixup is a data augmentation technique that relies on training using ran...

05/02/2020  On the Generalization Effects of Linear Transformations in Data Augmentation
Data augmentation is a powerful technique to improve performance in appl...

10/09/2020  How Does Mixup Help With Robustness and Generalization?
Mixup is a popular data augmentation technique based on taking convex co...

02/24/2022  Sample Efficiency of Data Augmentation Consistency Regularization
Data augmentation is popular in the training of large neural networks; c...

04/13/2023  Understanding Overfitting in Adversarial Training via Kernel Regression
Adversarial training and data augmentation with noise are widely adopted...

01/07/2020  Regularization via Structural Label Smoothing
Regularization is an effective way to promote the generalization perform...

12/19/2018  Max-Diversity Distributed Learning: Theory and Algorithms
We study the risk performance of distributed learning for the regulariza...
