Towards Understanding the Data Dependency of Mixup-style Training

10/14/2021
by Muthu Chidambaram et al.

In the Mixup training paradigm, a model is trained on convex combinations of data points and their associated labels. Despite seeing very few true data points during training, models trained with Mixup appear to still minimize the original empirical risk and to exhibit better generalization and robustness on various tasks than models trained in the standard way. In this paper, we investigate how these benefits of Mixup training depend on properties of the data in the context of classification. For minimizing the original empirical risk, we compute a closed form for the Mixup-optimal classification, which allows us to construct a simple dataset on which minimizing the Mixup loss can provably lead to learning a classifier that does not minimize the empirical loss on the data. On the other hand, we also give sufficient conditions under which Mixup training does minimize the original empirical risk. For generalization, we characterize the margin of a Mixup classifier, and use this to understand why the decision boundary of a Mixup classifier can adapt better to the full structure of the training data than that of a standard classifier. In contrast, we also show that, for a large class of linear models and linearly separable datasets, Mixup training learns the same classifier as standard training.


