Pre-Trained Models: Past, Present and Future

06/14/2021
by Xu Han, et al.

Large-scale pre-trained models (PTMs) such as BERT and GPT have recently achieved great success and become a milestone in the field of artificial intelligence (AI). Owing to sophisticated pre-training objectives and huge numbers of model parameters, large-scale PTMs can effectively capture knowledge from massive labeled and unlabeled data. By storing knowledge in their huge parameters and fine-tuning on specific tasks, the rich knowledge implicitly encoded in those parameters can benefit a variety of downstream tasks, as has been extensively demonstrated through experiments and empirical analysis. It is now the consensus of the AI community to adopt PTMs as the backbone for downstream tasks rather than learning models from scratch. In this paper, we take a deep look into the history of pre-training, especially its close relationship with transfer learning and self-supervised learning, to reveal the crucial position of PTMs in the AI development spectrum. We then comprehensively review the latest breakthroughs of PTMs, which, driven by the surge of computational power and the increasing availability of data, advance four important directions: designing effective architectures, utilizing rich contexts, improving computational efficiency, and conducting interpretation and theoretical analysis. Finally, we discuss a series of open problems and research directions for PTMs, in the hope that our view can inspire and advance their future study.
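As a concrete illustration of the pretrain-then-fine-tune paradigm the abstract describes, the minimal sketch below loads a pre-trained BERT backbone and adapts it to a downstream task instead of training from scratch. It assumes the Hugging Face transformers library and PyTorch; the model name, toy data, and hyperparameters are illustrative choices, not drawn from the paper.

```python
# Minimal sketch of the pretrain-then-fine-tune paradigm: reuse a
# pre-trained backbone (BERT) as the starting point for a downstream
# task (binary sentiment classification) rather than learning a model
# from scratch. Assumes Hugging Face `transformers` and PyTorch are
# installed; data and hyperparameters are illustrative only.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2  # fresh task head on a pre-trained backbone
)

texts = ["a delightful read", "a tedious slog"]  # toy downstream data
labels = torch.tensor([1, 0])

batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
for _ in range(3):  # a few fine-tuning steps on the tiny batch
    outputs = model(**batch, labels=labels)  # loss computed internally
    outputs.loss.backward()
    optimizer.step()
    optimizer.zero_grad()

model.eval()
with torch.no_grad():
    preds = model(**batch).logits.argmax(dim=-1)
print(preds)  # knowledge encoded in pre-trained parameters aids the new task
```

The key design point, echoed in the abstract, is that only a small task-specific head is initialized randomly; the bulk of the parameters arrive already trained on massive data, which is why fine-tuning typically needs far fewer labeled examples than training from scratch.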


