Curricular Transfer Learning for Sentence Encoded Tasks

Fine-tuning language models on a downstream task is the standard approach behind many state-of-the-art methods in NLP. However, when the distribution of the source task drifts away from that of the target task, as in conversational environments, these gains tend to diminish. This article proposes a sequence of pre-training steps (a curriculum), guided by "data hacking" and grammar analysis, that enables gradual adaptation between pre-training distributions. In our experiments, our method achieves a considerable improvement over other known pre-training approaches on the MultiWoZ task.
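The abstract describes the method only at a high level; the sketch below illustrates one way such a curriculum of pre-training stages could be set up, continuing masked-language-model training over corpora ordered from generic written text toward conversational text before fine-tuning on the downstream task. It assumes the Hugging Face Transformers and Datasets libraries; the corpus file names, starting checkpoint, and hyperparameters are placeholder assumptions, not the authors' exact pipeline.

    # A minimal sketch of curricular pre-training, assuming the Hugging Face
    # Transformers/Datasets stack. Corpus paths, the starting checkpoint, and
    # all hyperparameters are illustrative assumptions, not the authors' setup.
    from datasets import load_dataset
    from transformers import (
        AutoModelForMaskedLM,
        AutoTokenizer,
        DataCollatorForLanguageModeling,
        Trainer,
        TrainingArguments,
    )

    checkpoint = "bert-base-uncased"  # generic starting point (assumption)
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForMaskedLM.from_pretrained(checkpoint)
    collator = DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15)

    # Curriculum: intermediate corpora ordered so each stage lies closer to the
    # target (conversational) distribution than the previous one.
    # File names are placeholders.
    curriculum = [
        "general_written_text.txt",
        "informal_forum_text.txt",
        "task_oriented_dialogue_text.txt",
    ]

    def tokenize(batch):
        return tokenizer(batch["text"], truncation=True, max_length=128)

    for stage, path in enumerate(curriculum, start=1):
        ds = load_dataset("text", data_files=path, split="train")
        ds = ds.filter(lambda ex: ex["text"].strip())  # drop empty lines
        ds = ds.map(tokenize, batched=True, remove_columns=ds.column_names)
        trainer = Trainer(
            model=model,
            args=TrainingArguments(
                output_dir=f"mlm-stage-{stage}",
                per_device_train_batch_size=16,
                num_train_epochs=1,
            ),
            train_dataset=ds,
            data_collator=collator,
        )
        trainer.train()  # continue masked-LM pre-training on this stage's corpus

    # The staged model would then be fine-tuned on the downstream task
    # (e.g., MultiWoZ) in the usual way.
    model.save_pretrained("curricular-pretrained")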
