Problems With Evaluation of Word Embeddings Using Word Similarity Tasks

05/08/2016
by   Manaal Faruqui, et al.

Lacking standardized extrinsic evaluation methods for vector representations of words, the NLP community has relied heavily on word similarity tasks as a proxy for intrinsic evaluation of word vectors. Word similarity evaluation, which correlates the distance between vectors with human judgments of semantic similarity, is attractive because it is computationally inexpensive and fast. In this paper, we present several problems associated with the evaluation of word vectors on word similarity datasets and summarize existing solutions. Our study suggests that the use of word similarity tasks for evaluation of word vectors is not sustainable and calls for further research on evaluation methods.
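
To make the evaluation procedure described above concrete, here is a minimal sketch of a word similarity evaluation in plain Python. It assumes cosine similarity as the vector measure and Spearman's rank correlation as the agreement statistic; both are common choices, but the abstract does not name them. The function names, the toy vectors, and the human ratings are all hypothetical and chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def ranks(values):
    """1-based ranks in ascending order; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))


def evaluate(vectors, gold):
    """Correlate model similarities with human ratings.

    Word pairs with a missing vector are skipped (an assumption of this sketch).
    """
    model_scores, human_scores = [], []
    for (w1, w2), rating in gold.items():
        if w1 in vectors and w2 in vectors:
            model_scores.append(cosine(vectors[w1], vectors[w2]))
            human_scores.append(rating)
    return spearman(model_scores, human_scores)


# Hypothetical 2-dimensional "embeddings" and made-up human ratings.
vectors = {
    "cat": [1.0, 0.0],
    "dog": [0.9, 0.1],
    "car": [0.0, 1.0],
    "truck": [0.2, 0.8],
}
gold = {
    ("cat", "dog"): 9.0,
    ("car", "truck"): 8.5,
    ("cat", "car"): 1.5,
    ("dog", "truck"): 2.0,
}

score = evaluate(vectors, gold)
print(f"Spearman correlation: {score:.3f}")
```

The whole evaluation reduces to one similarity computation per word pair plus a rank correlation, which is why it is computationally inexpensive and fast. A single scalar of this kind is also exactly the quantity whose use for evaluating word vectors the paper calls into question.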


