Effective Pre-Training Objectives for Transformer-based Autoencoders

10/24/2022
by Luca Di Liello, et al.

In this paper, we study trade-offs between efficiency, cost, and accuracy when pre-training Transformer encoders with different pre-training objectives. For this purpose, we analyze the features of common objectives and combine them to create new, effective pre-training approaches. Specifically, we design light token generators based on a straightforward statistical approach, which can replace ELECTRA's computationally heavy generators and thus greatly reduce cost. Our experiments also show that (i) there are more efficient alternatives to BERT's MLM, and (ii) it is possible to efficiently pre-train Transformer-based models using lighter generators without a significant drop in performance.
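The abstract does not detail how the statistical token generators work, but the general idea of replacing ELECTRA's trained generator with a corpus-statistics sampler can be illustrated. Below is a minimal sketch, assuming a unigram-frequency sampler feeding an ELECTRA-style replaced-token-detection corruption step; the function names, the 15% replacement rate, and the choice of unigram statistics are illustrative assumptions, not details taken from the paper.

```python
# Hypothetical sketch: corrupt a token sequence for replaced-token detection,
# sampling replacements from corpus unigram frequencies instead of running a
# trained generator network. All names and hyperparameters are illustrative.
import random
from collections import Counter

def build_unigram_sampler(corpus_token_ids):
    """Estimate a unigram distribution over token ids from a tokenized corpus."""
    counts = Counter(tok for sent in corpus_token_ids for tok in sent)
    tokens, freqs = zip(*counts.items())
    return lambda: random.choices(tokens, weights=freqs, k=1)[0]

def corrupt_for_rtd(token_ids, sample_token, replace_prob=0.15):
    """Replace a fraction of tokens with statistically sampled ones.

    Returns the corrupted sequence and binary labels (1 = replaced) that a
    discriminator would be trained to predict, as in replaced-token detection.
    """
    corrupted, labels = [], []
    for tok in token_ids:
        if random.random() < replace_prob:
            new_tok = sample_token()
            corrupted.append(new_tok)
            labels.append(int(new_tok != tok))  # accidental match counts as original
        else:
            corrupted.append(tok)
            labels.append(0)
    return corrupted, labels

# Example usage with toy token ids:
corpus = [[5, 7, 7, 9], [5, 9, 9, 2], [7, 2, 5, 5]]
sampler = build_unigram_sampler(corpus)
corrupted, labels = corrupt_for_rtd([5, 7, 9, 2], sampler)
```

Because the sampler is just a weighted draw from precomputed counts, corruption costs a table lookup per token rather than a forward pass through a generator network, which is the kind of cost reduction the abstract refers to.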

Related research

Efficient pre-training objectives for Transformers (04/20/2021)
The Transformer architecture deeply changed the natural language process...

An Empirical Study of Training End-to-End Vision-and-Language Transformers (11/03/2021)
Vision-and-language (VL) pre-training has proven to be highly effective ...

MUX-PLMs: Pre-training Language Models with Data Multiplexing (02/24/2023)
Data multiplexing is a recently proposed method for improving a model's ...

Structural Self-Supervised Objectives for Transformers (09/15/2023)
This thesis focuses on improving the pre-training of natural language mo...

Benchmarking down-scaled (not so large) pre-trained language models (05/11/2021)
Large Transformer-based language models are pre-trained on corpora of va...

On Efficient Transformer and Image Pre-training for Low-level Vision (12/19/2021)
Pre-training has marked numerous state of the arts in high-level compute...

ViR: the Vision Reservoir (12/27/2021)
The most recent year has witnessed the success of applying the Vision Tr...