What the Vec? Towards Probabilistically Grounded Embeddings

05/30/2018
by Carl Allen, et al.

Vector representations, or embeddings, of words are commonly learned with neural network methods, in particular word2vec (W2V). It has been shown that certain word co-occurrence statistics are implicitly captured by properties of W2V vectors, but much remains unknown about them, e.g. whether vector length carries any meaning, or, more generally, how statistics can reliably be framed as vectors at all. By deriving a mathematical link between probabilities and vectors, we justify why W2V works and are able to create embeddings with probabilistically interpretable properties.
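The co-occurrence result the abstract alludes to is usually framed in terms of pointwise mutual information (PMI), e.g. Levy and Goldberg's (2014) observation that skip-gram with negative sampling implicitly factorises a shifted PMI matrix. The sketch below illustrates only that general link, not the construction proposed in this paper: it builds a positive-PMI matrix from a toy corpus and factorises it with SVD so that dot products of the resulting vectors approximate the co-occurrence statistics. The corpus, window size and embedding dimension are made-up placeholders.

```python
import numpy as np

# A minimal, self-contained sketch: co-occurrence statistics (here, positive PMI)
# can be captured by vectors whose dot products approximate those statistics.
# Toy corpus and parameters are arbitrary placeholders, not settings from the paper.
corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "a cat and a dog played".split(),
]
window = 2

vocab = sorted({w for sent in corpus for w in sent})
idx = {w: i for i, w in enumerate(vocab)}

# Symmetric word/context co-occurrence counts within the window.
counts = np.zeros((len(vocab), len(vocab)))
for sent in corpus:
    for i, w in enumerate(sent):
        for j in range(max(0, i - window), min(len(sent), i + window + 1)):
            if j != i:
                counts[idx[w], idx[sent[j]]] += 1

# Positive pointwise mutual information: max(0, log p(w,c) / (p(w) p(c))).
total = counts.sum()
p_wc = counts / total
p_w = counts.sum(axis=1, keepdims=True) / total
p_c = counts.sum(axis=0, keepdims=True) / total
with np.errstate(divide="ignore", invalid="ignore"):
    pmi = np.log(p_wc / (p_w * p_c))
ppmi = np.where(np.isfinite(pmi) & (pmi > 0), pmi, 0.0)

# A low-rank factorisation of the PPMI matrix gives word and context vectors whose
# dot products reconstruct the statistics, analogous to what skip-gram with
# negative sampling does implicitly (Levy & Goldberg, 2014).
d = 2
U, S, Vt = np.linalg.svd(ppmi)
word_vecs = U[:, :d] * np.sqrt(S[:d])
ctx_vecs = Vt[:d].T * np.sqrt(S[:d])

print(vocab)
print(np.round(word_vecs @ ctx_vecs.T, 2))  # approximates the PPMI matrix
```

This only recovers co-occurrence statistics as inner products; the paper's contribution is a probabilistic grounding of such vectors, which the sketch does not attempt to reproduce.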


Related research:

08/02/2015 · Class Vectors: Embedding representation of Document Classes
Distributed representations of words and paragraphs as semantic embeddin...

11/14/2017 · Modeling Semantic Relatedness using Global Relation Vectors
Word embedding models such as GloVe rely on co-occurrence statistics fro...

12/04/2020 · Modelling General Properties of Nouns by Selectively Averaging Contextualised Embeddings
While the success of pre-trained language models has largely eliminated ...

10/18/2019 · Estimator Vectors: OOV Word Embeddings based on Subword and Context Clue Estimates
Semantic representations of words have been successfully extracted from ...

09/05/2018 · Firearms and Tigers are Dangerous, Kitchen Knives and Zebras are Not: Testing whether Word Embeddings Can Tell
This paper presents an approach for investigating the nature of semantic...

02/26/2019 · Context Vectors are Reflections of Word Vectors in Half the Dimensions
This paper takes a step towards theoretical analysis of the relationship...

12/19/2022 · Multi hash embeddings in spaCy
The distributed representation of symbols is one of the key technologies...
