DAIR: Data Augmented Invariant Regularization

10/21/2021
by Tianjian Huang, et al.

While deep learning through empirical risk minimization (ERM) has achieved human-level performance on a variety of complex tasks, ERM generalizes poorly under distribution shift. This is partly explained by overfitting to spurious features such as backgrounds in images or named entities in natural language. Synthetic data augmentation followed by empirical risk minimization (DA-ERM) is a simple yet powerful remedy for this problem. In this paper, we propose data augmented invariant regularization (DAIR). DAIR is based on the observation that the model's performance (loss) should be consistent between an augmented sample and the original one. DAIR adds a regularizer to DA-ERM that penalizes such loss inconsistency. Both theoretically and empirically, we show that a particular form of the DAIR regularizer consistently performs well in a variety of settings. We apply it to multiple real-world learning problems involving domain shift, namely robust regression, visual question answering, robust deep neural network training, and task-oriented dialog modeling. Our experiments show that DAIR consistently outperforms ERM and DA-ERM at little marginal cost and sets new state-of-the-art results on several benchmarks.
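
To make the idea of a loss-consistency regularizer concrete, here is a minimal sketch in plain Python. The abstract does not state the exact form of the penalty, so the squared difference of square-rooted losses used below is an assumption made for illustration, as are the function name `dair_objective` and the weight `lam`. The sketch takes per-sample losses that have already been computed on each original sample and on its augmented counterpart.

```python
import math


def dair_objective(losses_orig, losses_aug, lam=1.0):
    """Return a DA-ERM loss plus a loss-consistency penalty (illustrative).

    losses_orig[i] and losses_aug[i] are non-negative losses on the i-th
    original sample and on its augmented counterpart.
    """
    if len(losses_orig) != len(losses_aug):
        raise ValueError("each original sample needs one augmented pair")
    n = len(losses_orig)
    pairs = list(zip(losses_orig, losses_aug))

    # DA-ERM term: average the loss over original and augmented samples.
    da_erm = sum(0.5 * (lo + la) for lo, la in pairs) / n

    # Consistency penalty (assumed form, not taken from the abstract):
    # squared difference between the square roots of the two losses.
    penalty = sum((math.sqrt(lo) - math.sqrt(la)) ** 2 for lo, la in pairs) / n

    return da_erm + lam * penalty
```

With `lam=0` this reduces to plain DA-ERM. When the loss on every augmented sample equals the loss on its original, the penalty vanishes and the objective again equals the DA-ERM loss, whatever the value of `lam`.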

research
02/16/2022

A Data-Augmentation Is Worth A Thousand Samples: Exact Quantification From Analytical Augmented Sample Moments

Data-Augmentation (DA) is known to improve performance across tasks and ...
research
02/24/2022

Sample Efficiency of Data Augmentation Consistency Regularization

Data augmentation is popular in the training of large neural networks; c...
research
07/18/2022

Rethinking Data Augmentation for Robust Visual Question Answering

Data Augmentation (DA) – generating extra training samples beyond origin...
research
03/17/2022

When Chosen Wisely, More Data Is What You Need: A Universal Sample-Efficient Strategy For Data Augmentation

Data Augmentation (DA) is known to improve the generalizability of deep ...
research
09/19/2023

Prominent Roles of Conditionally Invariant Components in Domain Adaptation: Theory and Algorithms

Domain adaptation (DA) is a statistical learning problem that arises whe...
research
02/03/2023

Contrastive Learning with Consistent Representations

Contrastive learning demonstrates great promise for representation learn...
research
05/25/2023

Visualizing data augmentation in deep speaker recognition

Visualization is of great value in understanding the internal mechanisms...
