InfoCSE: Information-aggregated Contrastive Learning of Sentence Embeddings

10/08/2022
by Xing Wu, et al.

Contrastive learning has been extensively studied for sentence embedding learning, under the assumption that the embeddings of different views of the same sentence should be closer to each other. The constraint imposed by this assumption is weak, and a good sentence representation should also be able to reconstruct the original sentence fragments. Therefore, this paper proposes an information-aggregated contrastive learning framework for learning unsupervised sentence embeddings, termed InfoCSE. InfoCSE forces the representation at the [CLS] position to aggregate denser sentence information by introducing an additional masked language model (MLM) task and a well-designed network. We evaluate the proposed InfoCSE on several benchmark datasets for the semantic textual similarity (STS) task. Experimental results show that InfoCSE outperforms SimCSE by an average Spearman correlation of 2.60, achieving state-of-the-art results among unsupervised sentence representation learning methods. Our code is available at https://github.com/caskcsg/sentemb/tree/main/InfoCSE.
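
The contrastive assumption in the abstract (two views of the same sentence should be closer to each other than to other sentences) is commonly expressed as an InfoNCE loss with in-batch negatives, as in SimCSE-style training. The abstract does not state the loss itself, so the following is only a minimal NumPy sketch under that assumption. The function name `info_nce_loss`, the temperature value of 0.05, and the toy data are illustrative choices, not taken from the InfoCSE code, and the auxiliary masked language model task is not shown.

```python
import numpy as np


def info_nce_loss(view_a, view_b, temperature=0.05):
    """Mean InfoNCE loss with in-batch negatives (illustrative sketch).

    view_a, view_b: arrays of shape (batch, dim). Row i of each array is
    assumed to be an embedding of a different view of sentence i.
    """
    # L2-normalise so that the dot product equals cosine similarity.
    a = view_a / np.linalg.norm(view_a, axis=1, keepdims=True)
    b = view_b / np.linalg.norm(view_b, axis=1, keepdims=True)

    # Pairwise similarities, scaled by the temperature: shape (batch, batch).
    logits = a @ b.T / temperature

    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # The positive for row i sits on the diagonal; other columns are negatives.
    return float(-np.mean(np.diag(log_probs)))


# Toy data (hypothetical): a second "view" is the first plus small noise.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(8, 16))
second_view = embeddings + 0.01 * rng.normal(size=embeddings.shape)

aligned = info_nce_loss(embeddings, second_view)
misaligned = info_nce_loss(embeddings, second_view[::-1])
print(aligned < misaligned)
```

When the rows of the two inputs correspond to the same sentences, the loss is lower than when the pairing is scrambled, which is the behaviour the contrastive objective rewards.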


research
05/12/2023

Instance Smoothed Contrastive Learning for Unsupervised Sentence Embedding

Contrastive learning-based methods, such as unsup-SimCSE, have achieved ...
research
12/18/2022

On Isotropy and Learning Dynamics of Contrastive-based Sentence Representation Learning

Incorporating contrastive learning objectives in sentence representation...
research
06/10/2022

Unsupervised Sentence Simplification via Dependency Parsing

Text simplification is the task of rewriting a text so that it is readab...
research
03/14/2022

Deep Continuous Prompt for Contrastive Learning of Sentence Embeddings

The performance of sentence representation has been remarkably improved ...
research
02/26/2022

Exploring the Impact of Negative Samples of Contrastive Learning: A Case Study of Sentence Embedding

Contrastive learning is emerging as a powerful technique for extracting ...
research
05/15/2023

Unsupervised Sentence Representation Learning with Frequency-induced Adversarial Tuning and Incomplete Sentence Filtering

Pre-trained Language Model (PLM) is nowadays the mainstay of Unsupervise...
research
09/02/2021

Imposing Relation Structure in Language-Model Embeddings Using Contrastive Learning

Though language model text embeddings have revolutionized NLP research, ...
