ARCH: Efficient Adversarial Regularized Training with Caching

09/15/2021
by   Simiao Zuo, et al.

Adversarial regularization can improve model generalization in many natural language processing tasks. However, conventional approaches are computationally expensive, since they must generate a perturbation for each sample in each epoch. We propose a new adversarial regularization method, ARCH (adversarial regularization with caching), in which perturbations are generated and cached once every several epochs. Because caching all the perturbations raises memory concerns, we adopt a K-nearest-neighbors-based strategy that requires caching only a small number of perturbations, without adding training time. We evaluate the proposed method on a set of neural machine translation and natural language understanding tasks. We observe that ARCH significantly eases the computational burden, saving up to 70% of training time compared with conventional approaches. More surprisingly, by reducing the variance of the stochastic gradients, ARCH yields notably better (on most tasks) or comparable model generalization. Our code is publicly available.
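The abstract only sketches the mechanism, but the two key ideas are concrete: refresh adversarial perturbations only once every several epochs, and keep the cache small by storing perturbations for a handful of anchor points and reusing the nearest anchor's perturbation for each sample. The toy script below illustrates that scheme on a logistic-regression model with FGSM-style perturbations; the refresh schedule, anchor selection, and one-step attack are all simplifying assumptions for illustration, not the paper's exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: a 2-class problem on 16-dimensional "embeddings",
# nearly separable along the first coordinate.
X = rng.normal(size=(200, 16))
y = (X[:, 0] + 0.1 * rng.normal(size=200) > 0).astype(float)
w = np.zeros(16)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grad_wrt_input(w, x, label):
    # Gradient of the logistic loss w.r.t. the input x: (sigmoid(w.x) - y) * w
    return (sigmoid(x @ w) - label) * w

EPS = 0.1        # perturbation radius (assumed)
REFRESH = 5      # regenerate the cache every REFRESH epochs (assumed schedule)
N_ANCHORS = 20   # cache size, much smaller than the dataset

anchors, cached_delta = None, None

for epoch in range(20):
    if epoch % REFRESH == 0:
        # Refresh the cache: pick a few anchor samples and compute a
        # one-step (FGSM-style) adversarial perturbation for each one.
        idx = rng.choice(len(X), size=N_ANCHORS, replace=False)
        anchors = X[idx]
        cached_delta = np.stack(
            [EPS * np.sign(grad_wrt_input(w, X[i], y[i])) for i in idx]
        )
    for i in rng.permutation(len(X)):
        # 1-NN lookup: reuse the cached perturbation of the nearest anchor
        # instead of generating a fresh perturbation for this sample.
        nn = np.argmin(np.linalg.norm(anchors - X[i], axis=1))
        x_adv = X[i] + cached_delta[nn]
        # Adversarially regularized SGD step: clean loss + perturbed loss.
        g = (sigmoid(X[i] @ w) - y[i]) * X[i] \
            + (sigmoid(x_adv @ w) - y[i]) * x_adv
        w -= 0.1 * g

acc = np.mean((sigmoid(X @ w) > 0.5) == y.astype(bool))
```

Per-sample perturbations are generated only during cache-refresh epochs; in all other epochs the inner loop does a single nearest-neighbor lookup, which is where the claimed training-time savings come from.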


Related research

04/11/2021
Adversarial Training as Stackelberg Game: An Unrolled Optimization Approach
Adversarial training has been shown to improve the generalization perfor...

08/29/2021
DropAttack: A Masked Weight Adversarial Training Method to Improve Generalization of Neural Networks
Adversarial training has been proven to be a powerful regularization met...

09/29/2020
A Simple but Tough-to-Beat Data Augmentation Approach for Natural Language Understanding and Generation
Adversarial training has been shown effective at endowing the learned re...

07/13/2020
Generating Fluent Adversarial Examples for Natural Languages
Efficiently building an adversarial attacker for natural language proces...

04/30/2020
TextAT: Adversarial Training for Natural Language Understanding with Token-Level Perturbation
Adversarial training is effective in improving the robustness of neural ...

02/20/2020
MaxUp: A Simple Way to Improve Generalization of Neural Network Training
We propose MaxUp, an embarrassingly simple, highly effective technique f...

06/24/2016
Efficient Parallel Learning of Word2Vec
Since its introduction, Word2Vec and its variants are widely used to lea...
