Just Rank: Rethinking Evaluation with Word and Sentence Similarities

03/05/2022
by Bin Wang, et al.

Word and sentence embeddings are useful feature representations in natural language processing. However, intrinsic evaluation of embeddings lags far behind, with no significant update in the past decade. Word and sentence similarity tasks have become the de facto evaluation method, which leads models to overfit to such evaluations and negatively impacts the development of embedding models. This paper first points out the problems with using semantic similarity as the gold standard for word and sentence embedding evaluation. Further, we propose a new intrinsic evaluation method called EvalRank, which shows a much stronger correlation with downstream tasks. Extensive experiments are conducted on 60+ models and popular datasets to support our claims. Finally, a practical evaluation toolkit is released for future benchmarking.


