Efficient Adversarial Training with Robust Early-Bird Tickets

11/14/2022
by Zhiheng Xi, et al.

Adversarial training is one of the most powerful methods for improving the robustness of pre-trained language models (PLMs). However, it is typically more expensive than standard fine-tuning because adversarial examples must be generated with extra gradient computations. Delving into the optimization process of adversarial training, we find that robust connectivity patterns emerge in the early training phase (typically within 0.15∼0.3 epochs), far before the parameters converge. Inspired by this finding, we extract robust early-bird tickets (i.e., subnetworks) to build an efficient adversarial training method: (1) search for robust tickets with structured sparsity in the early stage; (2) fine-tune only the robust tickets for the remaining training budget. To extract the robust tickets as early as possible, we design a ticket convergence metric that automatically terminates the search. Experiments show that the proposed method achieves up to 7×∼13× training speedups while maintaining comparable or even better robustness than the most competitive state-of-the-art adversarial training methods.
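The abstract does not spell out the search procedure, but its two key ideas (a structured ticket mask scored during the first fraction of adversarial training, and a convergence metric that stops the search once the mask stabilizes) can be illustrated with a minimal sketch. The names below (`binary_mask`, `mask_distance`, `search_robust_ticket`, `sparsity`, `eps`, `patience`) are assumptions for illustration, not taken from the paper.

```python
# Minimal sketch of an early-bird ticket search with a mask-stability
# stopping criterion (illustrative only; not the paper's implementation).
import torch


def binary_mask(scores: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Keep the top-(1 - sparsity) fraction of structured units by score."""
    k = max(1, int(scores.numel() * (1.0 - sparsity)))
    keep = torch.topk(scores, k).indices
    mask = torch.zeros_like(scores, dtype=torch.bool)
    mask[keep] = True
    return mask


def mask_distance(m1: torch.Tensor, m2: torch.Tensor) -> float:
    """Normalized Hamming distance between two successive ticket masks."""
    return (m1 ^ m2).float().mean().item()


def search_robust_ticket(score_stream, sparsity=0.5, eps=0.01, patience=5):
    """Stop the ticket search once consecutive masks stop changing.

    `score_stream` is assumed to yield importance scores collected while
    running a few adversarial-training steps (e.g., accumulated scores per
    attention head or FFN dimension); only the early-stopping logic of the
    search is shown here.
    """
    prev_mask, stable_steps = None, 0
    for scores in score_stream:
        mask = binary_mask(scores, sparsity)
        if prev_mask is not None and mask_distance(mask, prev_mask) < eps:
            stable_steps += 1
            if stable_steps >= patience:  # ticket has converged early
                return mask
        else:
            stable_steps = 0
        prev_mask = mask
    return prev_mask
```

In such a pipeline, the returned mask would be used to prune the PLM to the subnetwork it selects, and adversarial fine-tuning would then continue on that smaller, sparse model; stopping the search early and training only the ticket is how this kind of approach saves compute relative to adversarially training the full model.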

Related research

MAT: Mixed-Strategy Game of Adversarial Training in Fine-tuning (06/27/2023)
Fine-tuning large-scale pre-trained language models has been demonstrate...

Attacks Which Do Not Kill Training Make Adversarial Learning Stronger (02/26/2020)
Adversarial training based on the minimax formulation is necessary for o...

Hyper-parameter Tuning for Adversarially Robust Models (04/05/2023)
This work focuses on the problem of hyper-parameter tuning (HPT) for rob...

UPB at SemEval-2021 Task 5: Virtual Adversarial Training for Toxic Spans Detection (04/17/2021)
The real-world impact of polarization and toxicity in the online sphere ...

A Simple Fine-tuning Is All You Need: Towards Robust Deep Learning Via Adversarial Fine-tuning (12/25/2020)
Adversarial Training (AT) with Projected Gradient Descent (PGD) is an ef...

Adversarial Training and Provable Robustness: A Tale of Two Objectives (08/13/2020)
We propose a principled framework that combines adversarial training and...

Improving robustness of language models from a geometry-aware perspective (04/28/2022)
Recent studies have found that removing the norm-bounded projection and ...
