Comparison and Combination of Sentence Embeddings Derived from Different Supervision Signals

02/07/2022
by Hayato Tsukagoshi, et al.

Sentence embedding methods have recently seen many successful applications. It is not yet well understood, however, which properties the resulting sentence embeddings capture, depending on the supervision signal. In this paper, we focus on two types of sentence embeddings, obtained from natural language inference (NLI) datasets and from definition sentences in a word dictionary, and investigate their properties by comparing their performance on the semantic textual similarity (STS) task, with the STS data partitioned along two axes: 1) the sources of the sentences and 2) the superficial similarity of the sentence pairs, as well as their performance on downstream and probing tasks. We also demonstrate that combining the two types of embeddings yields substantially better performance than either model alone on unsupervised STS tasks and downstream tasks.
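The abstract does not specify how the two embedding types are combined; a minimal sketch of one common approach, concatenating the vectors from two hypothetical encoders (the model names and dimensions here are assumptions, not the paper's method), might look like this:

```python
import numpy as np

def combine_embeddings(emb_nli: np.ndarray, emb_def: np.ndarray) -> np.ndarray:
    """Combine two sentence embeddings by L2-normalizing each and
    concatenating them, so neither supervision signal dominates.
    (Illustrative only; the paper may use a different combination.)"""
    a = emb_nli / np.linalg.norm(emb_nli)
    b = emb_def / np.linalg.norm(emb_def)
    return np.concatenate([a, b])

def cosine_similarity(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine similarity, the standard unsupervised STS score."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Toy vectors standing in for outputs of an NLI-supervised encoder and a
# definition-supervised encoder for the same sentence pair.
s1 = combine_embeddings(np.array([1.0, 0.0]), np.array([0.0, 1.0]))
s2 = combine_embeddings(np.array([1.0, 0.0]), np.array([0.0, 1.0]))
print(cosine_similarity(s1, s2))  # identical inputs give similarity 1.0
```

Scoring STS with cosine similarity over the concatenated vectors lets each supervision signal contribute equally, which is one plausible reading of the reported gains.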

Related research

05/10/2021
DefSent: Sentence Embeddings using Definition Sentences
Sentence embedding methods using natural language inference (NLI) datase...

10/05/2021
Exploiting Twitter as Source of Large Corpora of Weakly Similar Pairs for Semantic Sentence Embeddings
Semantic sentence embeddings are usually supervisedly built minimizing d...

11/10/2019
A Bilingual Generative Transformer for Semantic Sentence Embedding
Semantic sentence embedding models encode natural language sentences int...

04/03/2019
The Effect of Downstream Classification Tasks for Evaluating Sentence Embeddings
One popular method for quantitatively evaluating the performance of sent...

06/04/2019
Towards Lossless Encoding of Sentences
A lot of work has been done in the field of image compression via machin...

04/30/2019
Model Comparison for Semantic Grouping
We introduce a probabilistic framework for quantifying the semantic simi...

11/15/2017
Pushing the Limits of Paraphrastic Sentence Embeddings with Millions of Machine Translations
We extend the work of Wieting et al. (2017), back-translating a large pa...
