Analyzing the Surprising Variability in Word Embedding Stability Across Languages

04/30/2020
by   Laura Burdick, et al.
0

Word embeddings are powerful representations that form the foundation of many natural language processing architectures and tasks, both in English and in other languages. To gain further insight into word embeddings in multiple languages, we explore their stability, defined as the overlap between the nearest neighbors of a word in different embedding spaces. We discuss linguistic properties that are related to stability, drawing out insights about how morphological and other features relate to stability. This has implications for the usage of embeddings, particularly in research that uses embeddings to study language trends.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
04/25/2018

Factors Influencing the Surprising Instability of Word Embeddings

Despite the recent popularity of word embedding methods, there is only a...
research
05/08/2020

Comparative Analysis of Word Embeddings for Capturing Word Similarities

Distributed language representation has become the most widely used tech...
research
03/22/2019

LINSPECTOR: Multilingual Probing Tasks for Word Representations

Despite an ever growing number of word representation models introduced ...
research
10/22/2020

On the Effects of Using word2vec Representations in Neural Networks for Dialogue Act Recognition

Dialogue act recognition is an important component of a large number of ...
research
09/19/2020

Word class flexibility: A deep contextualized approach

Word class flexibility refers to the phenomenon whereby a single word fo...
research
10/06/2020

Compositional Demographic Word Embeddings

Word embeddings are usually derived from corpora containing text from ma...
research
11/29/2016

Geometry of Compositionality

This paper proposes a simple test for compositionality (i.e., literal us...

Please sign up or login with your details

Forgot password? Click here to reset