AMMUS : A Survey of Transformer-based Pretrained Models in Natural Language Processing

Transformer-based pretrained language models (T-PTLMs) have achieved great success in almost every NLP task. The evolution of these models started with GPT and BERT. These models are built on top of transformers, self-supervised learning, and transfer learning. T-PTLMs learn universal language representations from large volumes of text data using self-supervised learning and transfer this knowledge to downstream tasks. These models provide good background knowledge for downstream tasks, which avoids training downstream models from scratch. In this comprehensive survey paper, we first give a brief overview of self-supervised learning. Next, we explain various core concepts such as pretraining, pretraining methods, pretraining tasks, embeddings, and downstream adaptation methods. We then present a new taxonomy of T-PTLMs and give a brief overview of various benchmarks, both intrinsic and extrinsic. We also summarize various useful libraries for working with T-PTLMs. Finally, we highlight some future research directions that will further improve these models. We strongly believe that this comprehensive survey will serve as a good reference for learning the core concepts as well as staying updated with recent developments in T-PTLMs.
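The pretrain-then-adapt workflow described above can be illustrated with a minimal sketch using the Hugging Face transformers library (one example of the kind of library the survey summarizes). The checkpoint name, two-label classification task, and example sentence below are illustrative assumptions, not details from the paper:

```python
# Minimal sketch of the "pretrain, then adapt" workflow: load a T-PTLM whose
# weights were learned via self-supervised pretraining, then fine-tune it on a
# downstream task instead of training from scratch.
# The checkpoint and binary-classification setup are illustrative assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=2,  # fresh classification head added on top of the pretrained encoder
)

# Downstream adaptation step: one labeled example, one gradient computation.
inputs = tokenizer("Transformer-based PTLMs transfer well.", return_tensors="pt")
labels = torch.tensor([1])
outputs = model(**inputs, labels=labels)
outputs.loss.backward()  # gradients flow into both the new head and the pretrained encoder
```

In practice this backward pass would sit inside a training loop over task-specific labeled data; the point of the sketch is that the pretrained encoder supplies the universal language representations, so only the adaptation (fine-tuning) remains to be done for the downstream task.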
