Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

by Colin Raffel et al.

Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts every language problem into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new "Colossal Clean Crawled Corpus", we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our dataset, pre-trained models, and code.
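The core idea of the unified framework is that every task, whether translation, summarization, or classification, is expressed as feeding the model input text and training it to generate output text. A minimal sketch of this task casting is shown below; the helper function name is our own, while the task prefixes (e.g. "translate English to German:", "summarize:", "cola sentence:") follow the conventions described in the paper.

```python
# Sketch of the text-to-text framing: every NLP task becomes an
# (input_text, target_text) pair, so one model and one loss cover all tasks.
# The helper name `to_text_to_text` is hypothetical; the prefixes mirror
# those used for T5.

def to_text_to_text(task: str, **fields) -> tuple[str, str]:
    """Return an (input_text, target_text) pair for a given task."""
    if task == "translation":
        src = f"translate English to German: {fields['sentence']}"
        tgt = fields["translation"]
    elif task == "summarization":
        src = f"summarize: {fields['article']}"
        tgt = fields["summary"]
    elif task == "classification":  # e.g. CoLA acceptability judgments
        src = f"cola sentence: {fields['sentence']}"
        tgt = fields["label"]  # class labels are emitted as literal text
    else:
        raise ValueError(f"unknown task: {task}")
    return src, tgt

src, tgt = to_text_to_text(
    "translation", sentence="That is good.", translation="Das ist gut."
)
print(src)  # translate English to German: That is good.
print(tgt)  # Das ist gut.
```

Because classification labels are also produced as text, no task-specific output layers are needed; the same maximum-likelihood objective trains the model on every task.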

