Clustering and Network Analysis for the Embedding Spaces of Sentences and Sub-Sentences

10/02/2021
by Yuan An, et al.

Sentence embedding methods offer a powerful approach for working with short textual constructs or sequences of words. Representing sentences as dense numerical vectors has improved the performance of many natural language processing (NLP) applications. However, relatively little is understood about the latent structure of sentence embeddings. In particular, prior research has not addressed whether the length and structure of sentences affect the geometry and topology of the sentence embedding space. This paper reports a comprehensive set of clustering and network analyses targeting sentence and sub-sentence embedding spaces. Results show that one method generates the most clusterable embeddings. In general, the embeddings of span sub-sentences have better clustering properties than those of the original sentences. The results have implications for future sentence embedding models and applications.
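To make the notion of "clusterable" embeddings concrete, the following is a minimal sketch of one common way to score how well a set of vectors separates into clusters: the mean silhouette coefficient under cosine distance. This is an illustrative assumption, not the paper's procedure; the abstract does not state which clustering metrics or embedding models were used. The function names and the toy three-dimensional vectors are hypothetical stand-ins for real sentence embeddings.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def silhouette_score(vectors, labels):
    """Mean silhouette coefficient over all points (range -1 to 1).

    Higher values mean points sit closer to their own cluster than to
    the nearest other cluster, i.e. the labelling is more "clusterable".
    """
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            # Usual convention: a singleton cluster contributes 0.
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(cosine_distance(vectors[i], vectors[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(cosine_distance(vectors[i], vectors[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy "embeddings": two well-separated groups of three vectors.
toy_embeddings = [
    [1.0, 0.1, 0.0], [0.9, 0.2, 0.0], [1.0, 0.0, 0.1],
    [0.0, 0.1, 1.0], [0.1, 0.0, 0.9], [0.0, 0.2, 1.0],
]
good_labels = [0, 0, 0, 1, 1, 1]   # labelling that matches the two groups
mixed_labels = [0, 1, 0, 1, 0, 1]  # labelling that cuts across the groups

good = silhouette_score(toy_embeddings, good_labels)
mixed = silhouette_score(toy_embeddings, mixed_labels)
print(f"matching labels: {good:.3f}, mixed labels: {mixed:.3f}")
```

Under this kind of score, comparing full-sentence embeddings with sub-sentence embeddings would amount to clustering each set and comparing the resulting values, though the paper's own analysis may rely on different measures.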

research
09/23/2020

A Comparative Study on Structural and Semantic Properties of Sentence Embeddings

Sentence embeddings encode natural language sentences as low-dimensional...
research
05/24/2023

Bridging Continuous and Discrete Spaces: Interpretable Sentence Representation Learning via Compositional Operations

Traditional sentence embedding models encode sentences into vector repre...
research
12/03/2019

COSTRA 1.0: A Dataset of Complex Sentence Transformations

We present COSTRA 1.0, a dataset of complex sentence transformations. Th...
research
02/18/2022

Evaluating the Construct Validity of Text Embeddings with Application to Survey Questions

Text embedding models from Natural Language Processing can map text data...
research
06/04/2019

Towards Lossless Encoding of Sentences

A lot of work has been done in the field of image compression via machin...
research
04/22/2021

Universal Horn Sentences and the Joint Embedding Property

The finite models of a universal sentence Φ are the age of a structure i...
research
05/19/2022

Sentences as connection paths: A neural language architecture of sentence structure in the brain

This article presents a neural language architecture of sentence structu...
