Does the Geometry of Word Embeddings Help Document Classification? A Case Study on Persistent Homology Based Representations

05/31/2017
by Paul Michel, et al.

We investigate the pertinence of methods from algebraic topology for text data analysis. These methods enable the development of mathematically principled, isometry-invariant mappings from a set of vectors to a document embedding that is stable with respect to the geometry of the document in the selected metric space. In this work, we evaluate the utility of these topology-based document representations in traditional NLP tasks, specifically document clustering and sentiment classification. We find that the embeddings do not benefit text analysis. In fact, performance is worse than that of simple techniques such as tf-idf, indicating that the geometry of the document does not provide enough variability for classification on the basis of topic or sentiment in the chosen datasets.
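To make the idea of an isometry-invariant mapping from a set of vectors concrete, here is a minimal sketch in Python. It is not the authors' pipeline: it computes only 0-dimensional persistence (the scales at which connected components merge, which coincide with minimum spanning tree edge lengths under the convention that an edge appears once its length is at most the scale), and the function names and the toy 2-D "document" are assumptions made for illustration. Real word embeddings would have far more dimensions.

```python
import math
from itertools import combinations


def zero_dim_persistence(points):
    """Return the sorted scales at which connected components merge.

    Every point is born at scale 0. A component dies at the length of the
    edge that merges it into another one, found here with Kruskal's
    algorithm on the complete Euclidean graph.
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    deaths = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            deaths.append(length)
    return deaths


def rotate(points, angle):
    """Rotate 2-D points about the origin (an isometry)."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y) for x, y in points]


# Hypothetical toy "document": four 2-D word vectors.
toy_document = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0), (4.0, 2.0)]

original = zero_dim_persistence(toy_document)
# Apply a rotation followed by a translation; distances are preserved.
moved = zero_dim_persistence(
    [(x + 5.0, y - 3.0) for x, y in rotate(toy_document, 0.7)]
)
```

Because the summary depends only on pairwise distances, `original` and `moved` agree up to floating-point error. The same property also means that two documents whose word vectors have similar distance structure receive similar summaries regardless of which words they contain, which fits the abstract's observation that document geometry alone may not vary enough to separate topics or sentiment.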

research
06/01/2020

Hybrid Improved Document-level Embedding (HIDE)

In recent times, word embeddings are taking a significant role in sentim...
research
08/02/2015

Class Vectors: Embedding representation of Document Classes

Distributed representations of words and paragraphs as semantic embeddin...
research
07/29/2015

Document Embedding with Paragraph Vectors

Paragraph Vectors has been recently proposed as an unsupervised method f...
research
09/27/2017

KeyVec: Key-semantics Preserving Document Representations

Previous studies have demonstrated the empirical success of word embeddi...
research
09/06/2023

Synthetic Text Generation using Hypergraph Representations

Generating synthetic variants of a document is often posed as text-to-te...
research
03/29/2020

Topological Data Analysis in Text Classification: Extracting Features with Additive Information

While the strength of Topological Data Analysis has been explored in man...
research
11/04/2019

Spherical Text Embedding

Unsupervised text embedding has shown great power in a wide range of NLP...
