Towards Understanding Linear Word Analogies

10/11/2018
by Kawin Ethayarajh, et al.

A surprising property of word vectors is that vector algebra can often be used to solve word analogies. However, it is unclear why -- and when -- such linear operations hold for vectors produced by non-linear embedding models such as skip-gram with negative sampling (SGNS). We provide a rigorous explanation of this phenomenon without the strong assumptions that past work has made about the vector space and the word distribution. Our theory has several implications. Past work has often conjectured that linear structure exists in vector spaces because relations can be represented as ratios; we prove that this holds for SGNS. We also provide novel theoretical justification for the addition of SGNS word vectors by showing that addition automatically down-weights the more frequent word, which weighting schemes otherwise do ad hoc. Lastly, we offer an information-theoretic interpretation of Euclidean distance in vector spaces, providing rigorous justification for its use in capturing word dissimilarity.
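To make the analogy-solving property concrete, here is a minimal sketch of the standard vector-arithmetic approach (often called 3CosAdd): to answer "a is to b as c is to ?", find the word whose vector is closest by cosine similarity to b - a + c. The tiny hand-made 3-dimensional vectors below are purely illustrative assumptions; real SGNS embeddings are learned from a corpus and typically have hundreds of dimensions.

```python
import numpy as np

# Hypothetical toy embedding table (illustrative only; real SGNS
# vectors would be trained on a large corpus).
emb = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "queen": np.array([0.9, 0.1, 0.8]),
    "man":   np.array([0.1, 0.9, 0.1]),
    "woman": np.array([0.1, 0.1, 0.9]),
    "apple": np.array([0.5, 0.5, 0.5]),
}

def analogy(a, b, c, emb):
    """Solve 'a is to b as c is to ?' by vector arithmetic:
    return the word whose vector has the highest cosine similarity
    to b - a + c, excluding the three query words themselves."""
    target = emb[b] - emb[a] + emb[c]
    best, best_sim = None, -np.inf
    for w, v in emb.items():
        if w in (a, b, c):
            continue
        sim = v @ target / (np.linalg.norm(v) * np.linalg.norm(target))
        if sim > best_sim:
            best, best_sim = w, sim
    return best

print(analogy("man", "king", "woman", emb))  # -> queen
```

On this toy data, king - man + woman lands closest to "queen", mirroring the canonical example the paper's theory explains. Note that excluding the query words from the candidate set is standard practice in analogy benchmarks, since the unshifted vector c is often the nearest neighbor of b - a + c.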


