Correlation-based Intrinsic Evaluation of Word Vector Representations

06/21/2016
by Yulia Tsvetkov et al.

We introduce QVEC-CCA, an intrinsic evaluation metric for word vector representations based on correlations of learned vectors with features extracted from linguistic resources. We show that QVEC-CCA scores are an effective proxy for a range of extrinsic semantic and syntactic tasks. We also show that the proposed evaluation obtains higher and more consistent correlations with downstream tasks than existing intrinsic evaluations of word vectors based on word similarity.
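The abstract describes scoring embeddings by their canonical correlation with a matrix of linguistic features. As a rough illustration of that idea (not the paper's exact procedure), here is a minimal numpy sketch that computes the top canonical correlation between a word-embedding matrix and a word-by-linguistic-feature matrix; the function name and the regularization term are our own choices for the sketch:

```python
import numpy as np

def qvec_cca_score(X, Y, reg=1e-8):
    """Top canonical correlation between an embedding matrix X
    (n_words x dim) and a linguistic feature matrix Y
    (n_words x n_features). Illustrative sketch only."""
    # Center each column so covariances are meaningful.
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    # Regularized covariance blocks (reg keeps the inverses stable).
    Sxx = X.T @ X / n + reg * np.eye(X.shape[1])
    Syy = Y.T @ Y / n + reg * np.eye(Y.shape[1])
    Sxy = X.T @ Y / n

    def inv_sqrt(S):
        # Inverse matrix square root via eigendecomposition.
        w, V = np.linalg.eigh(S)
        return V @ np.diag(1.0 / np.sqrt(np.clip(w, reg, None))) @ V.T

    # Singular values of the whitened cross-covariance are the
    # canonical correlations; return the largest one.
    M = inv_sqrt(Sxx) @ Sxy @ inv_sqrt(Syy)
    return float(np.linalg.svd(M, compute_uv=False)[0])
```

If `Y` is a linear function of `X`, the score approaches 1; for unrelated random matrices it stays low, which is the intuition behind using the correlation as an evaluation signal.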


