Downsampling Strategies are Crucial for Word Embedding Reliability

08/21/2018
by Johannes Hellrich, et al.

The reliability of word embedding algorithms, i.e., their ability to provide consistent computational judgments of word similarity when trained repeatedly on the same data set, has recently raised concerns. We compared the effect of two downsampling strategies, probabilistic downsampling and deterministic weighting, and found the latter to provide superior reliability while being competitive in accuracy when applied to singular value decomposition-based embeddings.
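To make the contrast concrete, here is a minimal Python sketch (not the authors' implementation; the function names, window size, and 1e-5 threshold are illustrative assumptions, the threshold being word2vec's default). Probabilistic downsampling randomly deletes occurrences of frequent words with word2vec's keep probability sqrt(t / f(w)), so every training run sees a different corpus; the weighting strategy instead keeps all tokens and deterministically scales each co-occurrence count by the keep probabilities of both words, yielding identical input to the SVD on every run.

import numpy as np
from collections import Counter, defaultdict

def keep_probability(freq, total, threshold=1e-5):
    # word2vec-style keep probability: sqrt(t / f(w)), capped at 1
    return min(1.0, np.sqrt(threshold * total / freq))

def probabilistic_downsample(tokens, threshold=1e-5, seed=None):
    # Probabilistic strategy: randomly drop occurrences of frequent
    # words, so repeated runs produce different training corpora.
    rng = np.random.default_rng(seed)
    counts = Counter(tokens)
    total = len(tokens)
    return [t for t in tokens
            if rng.random() < keep_probability(counts[t], total, threshold)]

def weighted_cooccurrence(tokens, window=2, threshold=1e-5):
    # Weighting strategy: keep every token but scale each co-occurrence
    # count by both words' keep probabilities; the result is deterministic.
    counts = Counter(tokens)
    total = len(tokens)
    p = {w: keep_probability(c, total, threshold) for w, c in counts.items()}
    cooc = defaultdict(float)
    for i, target in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                cooc[(target, tokens[j])] += p[target] * p[tokens[j]]
    return cooc

An SVD-based embedding would then be built by turning these (weighted) counts into a PPMI matrix and truncating its SVD; since the weighted counts do not depend on a random seed, the resulting vectors are identical across runs, which is the reliability advantage the abstract describes.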

Related research

11/30/2020
Blind signal decomposition of various word embeddings based on joint and individual variance explained
In recent years, natural language processing (NLP) has become one of the...

11/22/2020
DiaLex: A Benchmark for Evaluating Multidialectal Arabic Word Embeddings
Word embeddings are a core component of modern natural language processi...

09/10/2021
Assessing the Reliability of Word Embedding Gender Bias Measures
Various measures have been proposed to quantify human-like social biases...

06/16/2022
TransDrift: Modeling Word-Embedding Drift using Transformer
In modern NLP applications, word embeddings are a crucial backbone that ...

10/21/2020
PBoS: Probabilistic Bag-of-Subwords for Generalizing Word Embedding
We look into the task of generalizing word embeddings: given a set of pr...

08/16/2015
A Generative Word Embedding Model and its Low Rank Positive Semidefinite Solution
Most existing word embedding methods can be categorized into Neural Embe...

04/27/2022
Extremal GloVe: Theoretically Accurate Distributed Word Embedding by Tail Inference
Distributed word embeddings such as Word2Vec and GloVe have been widely ...
