Relative Position Prediction as Pre-training for Text Encoders

Meaning is defined by the company it keeps. That company, however, is two-fold: it is determined both by the identity of tokens and by their position (topology). We argue that a position-centric perspective is more general and useful. The classic MLM and CLM objectives in NLP are easily phrased as position predictions over the whole vocabulary. By adapting the relative position encoding paradigm from NLP to create relative labels for self-supervised learning, we seek to show superior pre-training as judged by performance on downstream tasks.
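
The abstract does not spell out the labeling scheme, but the Python sketch below illustrates one plausible way to turn the relative position encoding paradigm into self-supervised targets: clipped, signed offsets between token pairs become class labels that an encoder could be trained to predict. The function name, the clipping window, and the pairing with an encoder are illustrative assumptions, not the paper's exact objective.

```python
import torch

def relative_position_labels(seq_len: int, max_distance: int = 8) -> torch.Tensor:
    """Return a (seq_len, seq_len) matrix of clipped relative offsets,
    shifted to be non-negative so they can serve as classification labels."""
    positions = torch.arange(seq_len)
    # Signed offset j - i between every pair of positions.
    rel = positions[None, :] - positions[:, None]
    # Clip to a fixed window, as in standard relative position encodings.
    rel = rel.clamp(-max_distance, max_distance)
    # Shift into [0, 2 * max_distance] so each offset is a valid class index.
    return rel + max_distance

# Hypothetical use: pair an encoder's token-pair representations with these
# labels and train a classifier to recover the (clipped) relative offset,
# analogously to how MLM recovers the identity of masked tokens.
labels = relative_position_labels(seq_len=5, max_distance=2)
print(labels)
```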
