Fast, Effective, and Self-Supervised: Transforming Masked Language Models into Universal Lexical and Sentence Encoders

09/28/2021
by Fangyu Liu, et al.

Pretrained Masked Language Models (MLMs) have revolutionised NLP in recent years. However, previous work has indicated that off-the-shelf MLMs are not effective as universal lexical or sentence encoders without further task-specific fine-tuning on NLI, sentence similarity, or paraphrasing tasks using annotated task data. In this work, we demonstrate that it is possible to turn MLMs into effective universal lexical and sentence encoders without any additional data and without any supervision. We propose an extremely simple, fast and effective contrastive learning technique, termed Mirror-BERT, which converts MLMs (e.g., BERT and RoBERTa) into such encoders in 20-30 seconds without any additional external knowledge. Mirror-BERT relies on fully identical or slightly modified string pairs as positive (i.e., synonymous) fine-tuning examples, and aims to maximise their similarity during identity fine-tuning. We report huge gains over off-the-shelf MLMs with Mirror-BERT in both lexical-level and sentence-level tasks, across different domains and different languages. Notably, in the standard sentence semantic similarity (STS) tasks, our self-supervised Mirror-BERT model even matches the performance of the task-tuned Sentence-BERT models from prior work. Finally, we delve deeper into the inner workings of MLMs, and provide evidence for why this simple approach can yield effective universal lexical and sentence encoders.
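To make the recipe concrete, below is a minimal self-supervised sketch of Mirror-BERT-style identity fine-tuning; it is not the authors' released code. Each raw string is encoded twice, the encoder's own dropout makes the two views differ slightly, and an InfoNCE loss with in-batch negatives pulls the two views of the same string together. The model name, temperature, learning rate, and mean-pooling choice are illustrative assumptions, and the paper additionally applies random span masking to the input, which this sketch omits.

```python
# Minimal sketch of Mirror-BERT-style identity fine-tuning (illustrative, not the
# authors' implementation). Positives are two dropout-perturbed encodings of the
# same string; negatives are the other strings in the batch (InfoNCE loss).
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "bert-base-uncased"   # any MLM checkpoint works in principle
TAU = 0.04                         # softmax temperature (illustrative value)

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME)
model.train()                      # keep dropout active: it creates the two "views"
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

def encode(texts):
    """Mean-pool token embeddings into one vector per string."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    hidden = model(**batch).last_hidden_state        # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1)     # (B, T, 1)
    return (hidden * mask).sum(1) / mask.sum(1)      # (B, H)

def mirror_step(texts):
    """One contrastive update on fully identical string pairs (self-supervised)."""
    z1 = F.normalize(encode(texts), dim=-1)          # view 1
    z2 = F.normalize(encode(texts), dim=-1)          # view 2 (different dropout)
    logits = z1 @ z2.t() / TAU                       # (B, B) cosine similarities
    labels = torch.arange(len(texts))                # positive pairs on the diagonal
    loss = F.cross_entropy(logits, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Usage: a handful of raw, unlabelled strings is all that is needed.
print(mirror_step(["bank", "river bank", "financial institution", "credit union"]))
```

Note that no labels, NLI pairs, or external knowledge enter the loss; only raw strings are consumed, which is what makes the procedure fully self-supervised and fast enough for the short fine-tuning runs the abstract describes.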


research · 09/19/2021
MirrorWiC: On Eliciting Word-in-Context Representations from Pretrained Language Models
Recent work indicated that pretrained language models (PLMs) such as BER...

research · 09/21/2021
ConvFiT: Conversational Fine-Tuning of Pretrained Language Models
Transformer-based language models (LMs) pretrained on large text collect...

research · 01/27/2021
On the Evolution of Syntactic Information Encoded by BERT's Contextualized Representations
The adaptation of pretrained language models to solve supervised tasks h...

research · 05/12/2021
OCHADAI-KYODAI at SemEval-2021 Task 1: Enhancing Model Generalization and Robustness for Lexical Complexity Prediction
We propose an ensemble model for predicting the lexical complexity of wo...

research · 02/17/2022
SGPT: GPT Sentence Embeddings for Semantic Search
GPT transformers are the largest language models available, yet semantic...

research · 04/27/2022
UBERT: A Novel Language Model for Synonymy Prediction at Scale in the UMLS Metathesaurus
The UMLS Metathesaurus integrates more than 200 biomedical source vocabu...
