Recall and Learn: Fine-tuning Deep Pretrained Language Models with Less Forgetting

04/27/2020
by   Sanyuan Chen, et al.

Deep pretrained language models have achieved great success through the paradigm of pretraining first and then fine-tuning. However, this sequential transfer learning paradigm often suffers from catastrophic forgetting and leads to sub-optimal performance. To fine-tune with less forgetting, we propose a recall-and-learn mechanism, which adopts the idea of multi-task learning and jointly learns pretraining tasks and downstream tasks. Specifically, we propose a Pretraining Simulation mechanism to recall the knowledge from pretraining tasks without their data, and an Objective Shifting mechanism to gradually shift the focus of learning onto the downstream tasks. Experiments show that our method achieves state-of-the-art performance on the GLUE benchmark. Our method also enables BERT-base to achieve better performance than directly fine-tuned BERT-large. Further, we provide the open-source RecAdam optimizer, which integrates the proposed mechanisms into the Adam optimizer, to facilitate the NLP community.
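The following is a minimal sketch, not the authors' released RecAdam implementation, of how the two mechanisms described in the abstract can be combined in a fine-tuning loop, assuming PyTorch. Pretraining Simulation is approximated by a quadratic penalty pulling the parameters toward their pretrained values, and Objective Shifting by a sigmoid annealing coefficient that gradually moves the objective from that penalty to the downstream loss. The names `gamma`, `k`, and `t0` are illustrative hyperparameters, not values from the paper.

```python
import math
import torch

def recall_and_learn_loss(task_loss, model, pretrained_params,
                          step, gamma=0.01, k=0.1, t0=250):
    """Blend the downstream loss with a quadratic "recall" penalty.

    Pretraining Simulation: (gamma / 2) * ||theta - theta_pretrained||^2
    stands in for the pretraining objective, so no pretraining data is needed.
    Objective Shifting: lam rises from ~0 to ~1 with the training step,
    gradually shifting the focus from recalling pretrained knowledge
    to learning the downstream task.
    """
    lam = 1.0 / (1.0 + math.exp(-k * (step - t0)))  # annealing coefficient
    penalty = sum(((p - p0) ** 2).sum()
                  for p, p0 in zip(model.parameters(), pretrained_params))
    return lam * task_loss + (1.0 - lam) * 0.5 * gamma * penalty

# Usage: snapshot the pretrained weights once, then optimize the blended loss.
model = torch.nn.Linear(8, 2)  # stand-in for a pretrained language model
pretrained_params = [p.detach().clone() for p in model.parameters()]
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(500):
    x, y = torch.randn(16, 8), torch.randint(0, 2, (16,))
    task_loss = torch.nn.functional.cross_entropy(model(x), y)
    loss = recall_and_learn_loss(task_loss, model, pretrained_params, step)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

In the paper's released optimizer, the same penalty and annealing schedule are folded directly into the Adam update rather than added to the loss as above; this sketch only illustrates the idea of jointly recalling and learning.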


