Sample Efficiency of Data Augmentation Consistency Regularization

02/24/2022
by Shuo Yang, et al.

Data augmentation is popular in the training of large neural networks; currently, however, there is no clear theoretical comparison between different algorithmic choices on how to use augmented data. In this paper, we take a step in this direction. We first present a simple and novel analysis for linear regression, demonstrating that data augmentation consistency (DAC) is intrinsically more efficient than empirical risk minimization on augmented data (DA-ERM). We then propose a new theoretical framework for analyzing DAC, which reframes DAC as a way to reduce function class complexity. The new framework characterizes the sample efficiency of DAC for various non-linear models (e.g., neural networks). Further, we perform experiments that make a clean, apples-to-apples comparison (i.e., with no extra modeling or data tweaks) between ERM and consistency regularization using CIFAR-100 and WideResNet; together, these results demonstrate the superior efficacy of DAC.
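
To make the linear-regression comparison concrete, the following is a minimal sketch rather than the paper's exact formulation, which the abstract does not spell out. It assumes that DA-ERM fits least squares on the original and augmented samples with labels copied over, and that DAC fits least squares on the original samples while penalizing the prediction difference between each sample and its augmentation. The helper names (`make_data`, `da_erm`, `dac`), the penalty weight `lam`, and the synthetic setup where augmentations perturb only label-irrelevant coordinates are all illustrative assumptions.

```python
import numpy as np


def make_data(n, d, k, noise, rng):
    """Synthetic regression task (illustrative assumption, not from the paper).

    Only the first k coordinates affect the label; the augmentation perturbs
    the remaining d - k coordinates, so it leaves the clean label unchanged.
    """
    X = rng.standard_normal((n, d))
    theta = np.zeros(d)
    theta[:k] = rng.standard_normal(k)
    y = X @ theta + noise * rng.standard_normal(n)
    X_aug = X.copy()
    X_aug[:, k:] += rng.standard_normal((n, d - k))
    return X, X_aug, y, theta


def da_erm(X, X_aug, y):
    """Assumed DA-ERM: least squares on original + augmented rows, labels reused."""
    X_all = np.vstack([X, X_aug])
    y_all = np.concatenate([y, y])
    theta_hat, *_ = np.linalg.lstsq(X_all, y_all, rcond=None)
    return theta_hat


def dac(X, X_aug, y, lam=1e4):
    """Assumed DAC: min ||X t - y||^2 + lam * ||(X_aug - X) t||^2.

    Solved as one stacked least-squares problem; a large lam approximates a
    hard consistency constraint (X_aug - X) t = 0.
    """
    D = X_aug - X
    X_stack = np.vstack([X, np.sqrt(lam) * D])
    y_stack = np.concatenate([y, np.zeros(D.shape[0])])
    theta_hat, *_ = np.linalg.lstsq(X_stack, y_stack, rcond=None)
    return theta_hat


# Compare parameter errors averaged over a few random draws.
rng = np.random.default_rng(0)
errs = {"DA-ERM": [], "DAC": []}
for _ in range(50):
    X, X_aug, y, theta = make_data(n=30, d=20, k=5, noise=0.5, rng=rng)
    errs["DA-ERM"].append(np.linalg.norm(da_erm(X, X_aug, y) - theta))
    errs["DAC"].append(np.linalg.norm(dac(X, X_aug, y) - theta))
for name, vals in errs.items():
    print(f"{name}: mean parameter error = {np.mean(vals):.3f}")
```

In this sketch, the consistency penalty pushes the weights on the perturbed coordinates toward zero, which shrinks the effective hypothesis space; this is one way to read the abstract's claim that DAC reduces function class complexity, though the actual gap will depend on the data and augmentation chosen.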

research
09/19/2019

Data Augmentation Revisited: Rethinking the Distribution Gap between Clean and Augmented Data

Data augmentation has been widely applied as an effective methodology to...
research
05/25/2022

Augmentation-induced Consistency Regularization for Classification

Deep neural networks have become popular in many supervised learning tas...
research
10/21/2021

DAIR: Data Augmented Invariant Regularization

While deep learning through empirical risk minimization (ERM) has succee...
research
11/25/2020

Squared ℓ_2 Norm as Consistency Loss for Leveraging Augmented Data to Learn Robust and Invariant Representations

Data augmentation is one of the most popular techniques for improving th...
research
05/03/2021

Consistency and Monotonicity Regularization for Neural Knowledge Tracing

Knowledge Tracing (KT), tracking a human's knowledge acquisition, is a c...
research
06/10/2020

On Mixup Regularization

Mixup is a data augmentation technique that creates new examples as conv...
research
03/13/2022

On Data Augmentation in Point Process Models Based on Thinning

Many models for point process data are defined through a thinning proced...