Toward Interpretable Semantic Textual Similarity via Optimal Transport-based Contrastive Sentence Learning

02/26/2022
by Seonghyeon Lee, et al.

Recently, fine-tuning a pretrained language model to capture the similarity between sentence embeddings has shown state-of-the-art performance on the semantic textual similarity (STS) task. However, the absence of an interpretation method for sentence similarity makes it difficult to explain the model output. In this work, we explicitly describe the sentence distance as the weighted sum of contextualized token distances on the basis of a transportation problem, and then present an optimal transport-based distance measure, named RCMD, which identifies and leverages semantically aligned token pairs. Finally, we propose CLRCMD, a contrastive learning framework that optimizes the RCMD of sentence pairs, which enhances the quality of both the sentence similarity and its interpretation. Extensive experiments demonstrate that our learning framework outperforms other baselines on both STS and interpretable-STS benchmarks, indicating that it computes effective sentence similarity and also provides interpretations consistent with human judgement.
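To make the idea of a sentence distance built from token distances concrete, the following is a minimal sketch of a relaxed transport distance over token embeddings. It is an illustration under stated assumptions, not the authors' RCMD implementation: the function names (`cosine_distance`, `relaxed_transport_distance`, `symmetric_distance`), the uniform token weights, the cosine distance, and the choice to symmetrize by taking the larger one-sided value are all hypothetical choices made here for clarity.

```python
import math


def cosine_distance(u, v):
    """Cosine distance between two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def relaxed_transport_distance(tokens_a, tokens_b):
    """One-sided relaxed transport from sentence A to sentence B.

    Each token of A carries uniform mass 1/len(A) (an assumption) and sends
    all of it to its nearest token in B. Returns the weighted sum of those
    token distances and the alignment as (index_a, index_b, distance) tuples.
    """
    weight = 1.0 / len(tokens_a)
    total = 0.0
    alignment = []
    for i, u in enumerate(tokens_a):
        dists = [cosine_distance(u, v) for v in tokens_b]
        j = min(range(len(dists)), key=dists.__getitem__)
        alignment.append((i, j, dists[j]))
        total += weight * dists[j]
    return total, alignment


def symmetric_distance(tokens_a, tokens_b):
    """Symmetrize by taking the larger one-sided value (illustrative choice)."""
    d_ab, _ = relaxed_transport_distance(tokens_a, tokens_b)
    d_ba, _ = relaxed_transport_distance(tokens_b, tokens_a)
    return max(d_ab, d_ba)


# Toy 2-dimensional "token embeddings" standing in for contextualized vectors.
sentence_a = [[1.0, 0.0], [0.0, 1.0]]
sentence_b = [[0.0, 1.0], [1.0, 0.0]]
distance, alignment = relaxed_transport_distance(sentence_a, sentence_b)
```

In this sketch the `alignment` list is what makes the score inspectable: each entry names which token pair contributed and how much, so the sentence-level distance can be traced back to individual token matches.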

research
06/16/2023

CMLM-CSE: Based on Conditional MLM Contrastive Learning for Sentence Embeddings

Traditional comparative learning sentence embedding directly uses the en...
research
03/11/2022

A Sentence is Worth 128 Pseudo Tokens: A Semantic-Aware Contrastive Learning Framework for Sentence Embeddings

Contrastive learning has shown great potential in unsupervised sentence ...
research
10/05/2022

Unsupervised Sentence Textual Similarity with Compositional Phrase Semantics

Measuring Sentence Textual Similarity (STS) is a classic task that can b...
research
09/02/2021

Imposing Relation Structure in Language-Model Embeddings Using Contrastive Learning

Though language model text embeddings have revolutionized NLP research, ...
research
01/28/2020

Structural-Aware Sentence Similarity with Recursive Optimal Transport

Measuring sentence similarity is a classic topic in natural language pro...
research
10/30/2021

TransAug: Translate as Augmentation for Sentence Embeddings

While contrastive learning greatly advances the representation of senten...
research
04/18/2023

D2CSE: Difference-aware Deep continuous prompts for Contrastive Sentence Embeddings

This paper describes Difference-aware Deep continuous prompt for Contras...
