DeCLUTR: Deep Contrastive Learning for Unsupervised Textual Representations

06/05/2020
by John M. Giorgi, et al.

We present DeCLUTR: Deep Contrastive Learning for Unsupervised Textual Representations, a self-supervised method for learning universal sentence embeddings that transfer to a wide variety of natural language processing (NLP) tasks. Our objective leverages recent advances in deep metric learning (DML) and has the advantage of being conceptually simple and easy to implement, requiring no specialized architectures or labelled training data. We demonstrate that our objective can be used to pretrain transformers to state-of-the-art performance on SentEval, a popular benchmark for evaluating universal sentence embeddings, outperforming existing supervised, semi-supervised and unsupervised methods. We perform extensive ablations to determine which factors contribute to the quality of the learned embeddings. Our code will be publicly available and can be easily adapted to new datasets or used to embed unseen text.
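The contrastive objective the abstract refers to belongs to the InfoNCE/NT-Xent family common in deep metric learning: anchor and positive embeddings are pulled together while other in-batch embeddings act as negatives. Below is a minimal NumPy sketch of that loss under those assumptions; the function name, batch layout, and temperature value are illustrative, not the paper's actual implementation.

```python
import numpy as np

def info_nce_loss(anchors, positives, temperature=0.05):
    """Contrastive (InfoNCE/NT-Xent-style) loss over a batch of embeddings.

    Each anchor's positive is the row at the same index in `positives`;
    every other row in the batch serves as a negative.
    """
    # L2-normalize so dot products are cosine similarities.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    sims = (a @ p.T) / temperature                # (batch, batch) similarities
    sims -= sims.max(axis=1, keepdims=True)       # numerical stability
    # Row-wise log-softmax; correct (anchor, positive) pairs lie on the diagonal.
    log_probs = sims - np.log(np.exp(sims).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))
```

When anchors and positives are matched (e.g. two spans sampled from the same document, as in DeCLUTR), the loss is driven toward zero; mismatching the pairs drives it up, which is the signal the encoder is trained on.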

Related research:

- 05/05/2017: Supervised Learning of Universal Sentence Representations from Natural Language Inference Data. Many modern NLP systems rely on word embeddings, previously trained in a...
- 05/25/2021: ConSERT: A Contrastive Framework for Self-Supervised Sentence Representation Transfer. Learning high-quality sentence representations benefits a wide range of ...
- 07/14/2023: Composition-contrastive Learning for Sentence Embeddings. Vector representations of natural language are ubiquitous in search appl...
- 05/25/2023: Efficient Document Embeddings via Self-Contrastive Bregman Divergence Learning. Learning quality document embeddings is a fundamental problem in natural...
- 12/11/2020: TabTransformer: Tabular Data Modeling Using Contextual Embeddings. We propose TabTransformer, a novel deep tabular data modeling architectu...
- 11/25/2015: Towards Universal Paraphrastic Sentence Embeddings. We consider the problem of learning general-purpose, paraphrastic senten...
- 10/02/2017: Unsupervised Hypernym Detection by Distributional Inclusion Vector Embedding. Modeling hypernymy, such as poodle is-a dog, is an important generalizat...
