Semantic Structure and Interpretability of Word Embeddings

11/01/2017
by Lutfi Kerem Senel, et al.

Dense word embeddings, which encode the semantic meanings of words in low-dimensional vector spaces, have become very popular in natural language processing (NLP) research due to their state-of-the-art performance in many NLP tasks. Word embeddings are substantially successful in capturing semantic relations among words, so a meaningful semantic structure must be present in the respective vector spaces. However, in many cases, this semantic structure is broadly and heterogeneously distributed across the embedding dimensions, which makes interpretation a major challenge. In this study, we propose a statistical method to uncover the latent semantic structure in dense word embeddings. To perform our analysis, we introduce a new dataset (SEMCAT) that contains more than 6500 words semantically grouped under 110 categories. We further propose a method to quantify the interpretability of word embeddings; the proposed method is a practical alternative to the classical word intrusion test, which requires human intervention.
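
To make the idea of category-based interpretability concrete, the following is a minimal sketch and not the method proposed in the paper, whose details the abstract does not give. It assumes that a dimension can be called interpretable when its highest-valued words mostly belong to one semantic category. The function name `dimension_category_overlap`, the `top_k` parameter, and the toy vocabulary, embeddings, and categories are all hypothetical and invented for illustration; they are not taken from SEMCAT.

```python
import numpy as np

def dimension_category_overlap(embeddings, vocab, categories, top_k=3):
    """For each embedding dimension, find the category that best matches
    the top_k highest-valued words along that dimension.

    Returns a list of (category_name, score) tuples, one per dimension,
    where score is the share of the top_k words that fall in the category
    (normalized by min(top_k, category size)).
    NOTE: illustrative assumption only, not the paper's scoring rule.
    """
    results = []
    for d in range(embeddings.shape[1]):
        # Indices of the words with the largest values in dimension d.
        top_ids = np.argsort(-embeddings[:, d])[:top_k]
        top_words = {vocab[i] for i in top_ids}

        best_name, best_score = None, 0.0
        for name, words in categories.items():
            overlap = len(top_words & set(words))
            score = overlap / min(top_k, len(words))
            if score > best_score:
                best_name, best_score = name, score
        results.append((best_name, best_score))
    return results

# Toy data (hypothetical): 6 words, 2 dimensions.
vocab = ["cat", "dog", "horse", "red", "blue", "green"]
embeddings = np.array([
    [0.9, 0.1],
    [0.8, 0.0],
    [0.7, 0.2],
    [0.1, 0.9],
    [0.0, 0.8],
    [0.2, 0.7],
])
categories = {
    "animals": ["cat", "dog", "horse"],
    "colors": ["red", "blue", "green"],
}

scores = dimension_category_overlap(embeddings, vocab, categories, top_k=3)
for dim, (name, score) in enumerate(scores):
    print(f"dimension {dim}: best category = {name}, overlap = {score:.2f}")
```

In this toy setup each dimension aligns with exactly one category, which is the idealized interpretable case. In real dense embeddings the abstract notes that semantic structure is spread across many dimensions, so a score of this kind would typically be much lower for any single dimension.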


