The Limitations of Cross-language Word Embeddings Evaluation

06/06/2018
by Amir Bakarov, et al.

This work explores the possible limitations of existing methods for evaluating cross-language word embeddings, addressing the lack of correlation between intrinsic and extrinsic cross-language evaluation methods. To test this hypothesis, we construct English-Russian datasets for extrinsic and intrinsic evaluation tasks and compare the performance of five cross-language models on them. The results show that scores do not correlate with each other even across different intrinsic benchmarks. We conclude that using human references as ground truth for cross-language word embeddings is not appropriate unless one understands how native speakers process semantics in their cognition.
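The abstract does not say which correlation measure the authors use. As a minimal sketch, assuming a rank correlation (Spearman) over per-model scores is an acceptable stand-in, the comparison between benchmarks could look like the following. The benchmark names and all score values are hypothetical placeholders, not results from the paper.

```python
from itertools import combinations
from math import sqrt


def ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns NaN if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return float("nan")
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def pairwise_correlations(scores):
    """Correlate every pair of benchmarks across the same list of models."""
    return {
        (a, b): spearman(scores[a], scores[b])
        for a, b in combinations(scores, 2)
    }


# Hypothetical placeholder scores: one value per model (5 models),
# in the same model order for every benchmark. Not data from the paper.
scores = {
    "intrinsic_a": [0.1, 0.2, 0.3, 0.4, 0.5],
    "intrinsic_b": [0.5, 0.4, 0.3, 0.2, 0.1],
    "extrinsic": [0.3, 0.1, 0.5, 0.2, 0.4],
}

if __name__ == "__main__":
    for pair, rho in pairwise_correlations(scores).items():
        print(pair, round(rho, 3))
```

With only five models, any such coefficient rests on five data points per benchmark, so it is best read as a rough indication of agreement rather than a precise estimate.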

research
06/07/2019

Word Embeddings for the Armenian Language: Intrinsic and Extrinsic Evaluation

In this work, we intrinsically and extrinsically evaluate and compare ex...
research
09/04/2017

Hypothesis Testing based Intrinsic Evaluation of Word Embeddings

We introduce the cross-match test - an exact, distribution free, high-di...
research
01/21/2018

A Survey of Word Embeddings Evaluation Methods

Word embeddings are real-valued word representations able to capture lex...
research
04/23/2018

Can Eye Movement Data Be Used As Ground Truth For Word Embeddings Evaluation?

In recent years a certain success in the task of modeling lexical semant...
research
02/24/2022

Probing BERT's priors with serial reproduction chains

We can learn as much about language models from what they say as we lear...
research
07/01/2016

Evaluating Unsupervised Dutch Word Embeddings as a Linguistic Resource

Word embeddings have recently seen a strong increase in interest as a re...
research
02/10/2017

Using Word Embedding for Cross-Language Plagiarism Detection

This paper proposes to use distributed representation of words (word emb...
