ROSE: Robust Selective Fine-tuning for Pre-trained Language Models

10/18/2022
by Lan Jiang, et al.

Even though large-scale language models have achieved excellent performance, they suffer from various adversarial attacks. A large body of defense methods has been proposed, but they remain limited by redundant attack search spaces and the inability to defend against various types of attacks. In this work, we present a novel fine-tuning approach called RObust SElective fine-tuning (ROSE) to address this issue. ROSE conducts selective updates when adapting pre-trained models to downstream tasks, filtering out unimportant and unrobust parameter updates. Specifically, we propose two strategies, first-order and second-order ROSE, for selecting the target robust parameters. Experimental results show that ROSE achieves significant improvements in adversarial robustness on various downstream NLP tasks, and an ensemble of the two strategies surpasses both individual variants. Furthermore, ROSE can be easily incorporated into existing fine-tuning methods to further improve their adversarial robustness. Empirical analysis confirms that ROSE eliminates unrobust spurious updates during fine-tuning, leading to solutions that correspond to flatter and wider optima than those of conventional fine-tuning. Code is available at <https://github.com/jiangllan/ROSE>.
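To make the idea of "selective updates" concrete, the PyTorch sketch below shows the general shape of a fine-tuning step in which gradients are masked so that only a chosen subset of parameter updates is applied. The selection score used here (per-tensor gradient magnitude with a `keep_ratio` cutoff) is purely an assumption for illustration; it is not ROSE's actual first-order or second-order criterion, which the abstract does not spell out.

```python
import torch

def selective_update_step(model, loss, optimizer, keep_ratio=0.3):
    """One fine-tuning step that applies only a selected subset of
    parameter updates and masks out the rest.

    NOTE: minimal sketch. The selection rule here (keep the top
    `keep_ratio` fraction of gradient entries per tensor, by magnitude)
    is a stand-in, not ROSE's published first-/second-order criterion.
    """
    optimizer.zero_grad()
    loss.backward()
    with torch.no_grad():
        for param in model.parameters():
            if param.grad is None:
                continue
            g = param.grad
            k = max(1, int(keep_ratio * g.numel()))
            # Threshold at the k-th largest |gradient| entry; zero out the rest.
            threshold = g.abs().flatten().kthvalue(g.numel() - k + 1).values
            mask = (g.abs() >= threshold).to(g.dtype)
            param.grad.mul_(mask)
    optimizer.step()
```

In a full training loop, `loss` would come from a batch of a downstream task (e.g., cross-entropy over a classifier head on a pre-trained encoder), and this function would replace the usual `loss.backward(); optimizer.step()` pair.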


Related research:

- Fine-Tuning Pre-Trained Language Models Effectively by Optimizing Subnetworks Adaptively (11/03/2022): Large-scale pre-trained language models have achieved impressive results...
- How Should Pre-Trained Language Models Be Fine-Tuned Towards Adversarial Robustness? (12/22/2021): The fine-tuning of pre-trained language models has a great success in ma...
- MSDT: Masked Language Model Scoring Defense in Text Domain (11/10/2022): Pre-trained language models allowed us to process downstream tasks with ...
- Virtual Data Augmentation: A Robust and General Framework for Fine-tuning Pre-trained Models (09/13/2021): Recent works have shown that powerful pre-trained language models (PLM) ...
- Are Large Language Models Really Robust to Word-Level Perturbations? (09/20/2023): The swift advancement in the scale and capabilities of Large Language Mo...
- RoChBert: Towards Robust BERT Fine-tuning for Chinese (10/28/2022): Despite the superb performance on a wide range of tasks, pre-trained ...
- Model-tuning Via Prompts Makes NLP Models Adversarially Robust (03/13/2023): In recent years, NLP practitioners have converged on the following pract...
