Controlled Experiments for Word Embeddings

10/09/2015
by Benjamin J. Wilson, et al.

An experimental approach to studying the properties of word embeddings is proposed. Controlled experiments, achieved through modifications of the training corpus, make it possible to demonstrate direct relations between word properties and word vector direction and length. The approach is demonstrated using the word2vec CBOW model, with experiments that independently vary word frequency and word co-occurrence noise. The experiments reveal that word vector length depends approximately linearly on both word frequency and the level of noise in the word's co-occurrence distribution, with coefficients of linearity that depend on the word. The vector of the (artificial) word with pure noise in its co-occurrence distribution, a special point in feature space, is found to be small but non-zero.
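The abstract does not spell out how the training corpus is modified, so the following is only a minimal sketch of one way such controlled modifications could look, not the authors' procedure. It assumes the corpus is a flat list of tokens and that an artificial pseudoword stands in for a real word. The function names (`vary_frequency`, `add_cooccurrence_noise`) and the pseudoword tokens are hypothetical. The first function controls frequency by renaming only a fraction of a word's occurrences; the second controls noise by moving a fraction of the pseudoword's occurrences to random positions in the corpus.

```python
import random


def vary_frequency(tokens, word, pseudoword, keep_prob, rng):
    """Rename each occurrence of `word` to `pseudoword` with probability keep_prob.

    The pseudoword inherits the contexts of `word` but appears only about
    keep_prob times as often (assumed scheme, for illustration only).
    """
    return [
        pseudoword if t == word and rng.random() < keep_prob else t
        for t in tokens
    ]


def add_cooccurrence_noise(tokens, word, pseudoword, noise, rng):
    """Create a pseudoword whose contexts are partly random.

    A fraction (1 - noise) of the pseudoword's occurrences replace real
    occurrences of `word`; the remaining fraction overwrites randomly
    chosen other tokens, so those occurrences carry no real context.
    The pseudoword's total count equals the original count of `word`.
    """
    out = list(tokens)
    word_positions = [i for i, t in enumerate(out) if t == word]
    other_positions = [i for i, t in enumerate(out) if t != word]
    n_noise = min(round(noise * len(word_positions)), len(other_positions))
    n_true = len(word_positions) - n_noise
    for i in rng.sample(word_positions, n_true):
        out[i] = pseudoword  # occurrence in a genuine context of `word`
    for i in rng.sample(other_positions, n_noise):
        out[i] = pseudoword  # occurrence at a random position (noise)
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    corpus = "the cat sat on the mat while the dog ran".split() * 10
    noisy = add_cooccurrence_noise(corpus, "cat", "CAT_NOISE_50", 0.5, rng)
    print(noisy.count("CAT_NOISE_50"), noisy.count("cat"))
```

A modified corpus of this kind would then be passed to a CBOW trainer, and the length and direction of the pseudoword's vector compared across settings of `keep_prob` or `noise`. Setting `noise` to 1.0 in this sketch yields a pseudoword that occurs only at random positions, which is one way to approximate the pure-noise word mentioned above.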


