Fine-Tuning Pre-Trained Language Models Effectively by Optimizing Subnetworks Adaptively

11/03/2022
by Haojie Zhang, et al.

Large-scale pre-trained language models have recently achieved impressive results on a wide range of downstream tasks. However, fine-tuning an extremely large pre-trained language model on a limited target dataset is often plagued by overfitting and representation degradation. In this paper, we propose a Dynamic Parameter Selection (DPS) algorithm for large-scale pre-trained models during fine-tuning, which adaptively selects a more promising subnetwork to update at each stage, based on the gradients computed during back-propagation. Experiments on the GLUE benchmark show that DPS outperforms previous fine-tuning methods in both overall performance and stability, and consistently achieves better results across various pre-trained language models. In addition, DPS yields substantial improvements in out-of-domain transfer experiments and low-resource scenarios, showing that it maintains stable general contextual features and mitigates representation collapse. We release our code at https://github.com/ZhangHaojie077/DPS.
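The abstract only sketches the selection mechanism, so the snippet below is a minimal, hedged illustration in PyTorch of what gradient-based subnetwork selection during fine-tuning could look like. It is not the authors' implementation (see their repository for that); the selection criterion (per-tensor gradient norm), the `update_ratio` parameter, and the function name `dps_style_step` are all illustrative assumptions.

```python
import torch
import torch.nn as nn

def dps_style_step(model: nn.Module, loss: torch.Tensor,
                   optimizer: torch.optim.Optimizer,
                   update_ratio: float = 0.3) -> None:
    """One fine-tuning step that updates only a gradient-selected subnetwork.

    Illustrative sketch of the idea in the abstract, NOT the authors' code:
    after back-propagation, keep updates for the `update_ratio` fraction of
    parameter tensors with the largest gradient norms and zero out the rest.
    With plain SGD (no momentum or weight decay), a zeroed gradient leaves
    the corresponding parameters exactly unchanged this step.
    """
    optimizer.zero_grad()
    loss.backward()

    # Score each parameter tensor by the norm of its gradient.
    scored = [(p.grad.norm().item(), p)
              for p in model.parameters() if p.grad is not None]
    scored.sort(key=lambda pair: pair[0], reverse=True)

    # Zero the gradients of the less promising tensors so the optimizer
    # skips them on this step.
    keep = max(1, int(len(scored) * update_ratio))
    for _, p in scored[keep:]:
        p.grad.zero_()

    optimizer.step()


# Toy usage: a linear classifier stands in for a pre-trained model.
model = nn.Linear(16, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
x, y = torch.randn(8, 16), torch.randint(0, 2, (8,))
loss = nn.functional.cross_entropy(model(x), y)
dps_style_step(model, loss, optimizer)
```

The paper's actual criterion for choosing the "more promising" subnetwork and its staging schedule may differ; the sketch only conveys the general pattern of masking gradient updates so that only a selected subnetwork changes per step.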
