Are Word Embedding Methods Stable and Should We Care About It?

04/17/2021
by   Angana Borah, et al.

A representation learning method is considered stable if it consistently generates similar representations of the given data across multiple runs. Word Embedding Methods (WEMs) are a class of representation learning methods that generate a dense vector representation for each word in the given text data. The central idea of this paper is to explore the stability measurement of WEMs using intrinsic evaluation based on word similarity. We experiment with three popular WEMs: Word2Vec, GloVe, and fastText. For stability measurement, we investigate the effect of five parameters involved in training these models. We perform experiments using four real-world datasets from different domains: Wikipedia, News, Song lyrics, and European parliament proceedings. We also observe the effect of WEM stability on three downstream tasks: Clustering, POS tagging, and Fairness evaluation. Our experiments indicate that amongst the three WEMs, fastText is the most stable, followed by GloVe and Word2Vec.
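A common way to operationalize this kind of intrinsic, similarity-based stability measure is to train the same embedding method twice and compare, for each word, how much its top-k nearest-neighbor sets overlap across the two runs. The sketch below is an illustration of that general idea, not the paper's exact metric: it takes two embedding matrices (rows indexed by the same vocabulary) and computes the mean top-k neighbor overlap under cosine similarity. The function names and the choice of k are my own assumptions.

```python
import numpy as np

def top_k_neighbors(emb, k):
    """Indices of the k nearest neighbors (cosine similarity) for each row."""
    normed = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sim = normed @ normed.T
    np.fill_diagonal(sim, -np.inf)  # a word is not its own neighbor
    return np.argsort(-sim, axis=1)[:, :k]

def stability(emb_a, emb_b, k=10):
    """Mean fraction of shared top-k neighbors between two embedding runs.

    Assumes both matrices index the same vocabulary in the same row order.
    Returns a value in [0, 1]; 1.0 means identical neighborhoods.
    """
    na, nb = top_k_neighbors(emb_a, k), top_k_neighbors(emb_b, k)
    overlaps = [len(set(a) & set(b)) / k for a, b in zip(na, nb)]
    return float(np.mean(overlaps))

# Toy demonstration with synthetic "runs": the second run is a slightly
# perturbed copy of the first, so stability should be high but below 1.
rng = np.random.default_rng(0)
run1 = rng.normal(size=(100, 50))
run2 = run1 + 0.01 * rng.normal(size=(100, 50))
print(stability(run1, run2, k=10))
```

In practice the two matrices would come from two training runs of Word2Vec, GloVe, or fastText on the same corpus, restricted to the shared vocabulary; the paper additionally varies five training parameters and averages over runs.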


