A Resource-Free Evaluation Metric for Cross-Lingual Word Embeddings Based on Graph Modularity

06/05/2019
by Yoshinari Fujinuma, et al.
Cross-lingual word embeddings encode the meaning of words from different languages into a shared low-dimensional space. An important requirement for many downstream tasks is that word similarity should be independent of language, i.e., word vectors within one language should not be more similar to each other than to words in another language. We quantify this characteristic using modularity, a network measure of the strength of clusters in a graph. Modularity has a moderate to strong correlation with three downstream tasks, even though it is based only on the structure of the embeddings and does not require any external resources. We show through experiments that modularity can serve as an intrinsic validation metric to improve unsupervised cross-lingual word embeddings, particularly on distant language pairs in low-resource settings.
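To make the idea concrete, here is a minimal sketch of how modularity could be computed over an embedding space when each word's language is treated as its cluster label. It is an illustration under stated assumptions, not the authors' implementation: the graph construction (an unweighted k-nearest-neighbor graph over cosine similarity), the function names, and the toy vectors are all hypothetical, and the score is the standard Newman modularity, Q = Σ_c [L_c / m − (d_c / 2m)²], where m is the number of edges, L_c the number of edges inside language c, and d_c the total degree of its words.

```python
import math
from collections import defaultdict

def cosine(u, v):
    """Cosine similarity between two plain-list vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def knn_edges(vectors, k):
    """Assumed graph construction: link each word to its k most similar
    words; edges are undirected and stored as (i, j) pairs with i < j."""
    edges = set()
    for i, u in enumerate(vectors):
        sims = sorted(
            ((cosine(u, v), j) for j, v in enumerate(vectors) if j != i),
            reverse=True,
        )
        for _, j in sims[:k]:
            edges.add((min(i, j), max(i, j)))
    return edges

def modularity(edges, labels):
    """Newman modularity of the partition given by `labels` (here, languages)."""
    m = len(edges)
    if m == 0:
        return 0.0
    within = defaultdict(int)  # edges with both endpoints in the same language
    degree = defaultdict(int)  # summed node degree per language
    for i, j in edges:
        degree[labels[i]] += 1
        degree[labels[j]] += 1
        if labels[i] == labels[j]:
            within[labels[i]] += 1
    return sum(within[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)

# Toy 2-d vectors (invented for illustration).
# Case 1: translation pairs sit together, so neighbors cross languages.
mixed_vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
mixed_labels = ["en", "es", "en", "es"]
mixed_q = modularity(knn_edges(mixed_vectors, k=1), mixed_labels)

# Case 2: same vectors, but each language occupies its own region.
clustered_labels = ["en", "en", "es", "es"]
clustered_q = modularity(knn_edges(mixed_vectors, k=1), clustered_labels)

print(mixed_q, clustered_q)
```

In this toy setup the language-mixed space scores lower than the language-clustered one, which matches the abstract's requirement: when words are no more similar to their own language than to the other, language labels form weak clusters and modularity is low.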

research
11/11/2021

Training Cross-Lingual embeddings for Setswana and Sepedi

African languages still lag in the advances of Natural Language Processi...
research
04/01/2016

Cross-lingual Models of Word Embeddings: An Empirical Comparison

Despite interest in using cross-lingual knowledge to learn word embeddin...
research
05/01/2020

Why Overfitting Isn't Always Bad: Retrofitting Cross-Lingual Word Embeddings to Dictionaries

Cross-lingual word embeddings (CLWE) are often evaluated on bilingual le...
research
06/02/2021

Evaluating Word Embeddings with Categorical Modularity

We introduce categorical modularity, a novel low-resource intrinsic metr...
research
10/06/2017

Low-resource bilingual lexicon extraction using graph based word embeddings

In this work we focus on the task of automatically extracting bilingual ...
research
02/01/2019

How to (Properly) Evaluate Cross-Lingual Word Embeddings: On Strong Baselines, Comparative Analyses, and Some Misconceptions

Cross-lingual word embeddings (CLEs) enable multilingual modeling of mea...
research
11/08/2019

Interactive Refinement of Cross-Lingual Word Embeddings

Cross-lingual word embeddings transfer knowledge between languages: mode...
