A Critique of a Critique of Word Similarity Datasets: Sanity Check or Unnecessary Confusion?

07/12/2017
by Minh Le, et al.

Critical evaluation of word similarity datasets is very important for computational lexical semantics. This short report concerns the sanity check proposed in Batchkarov et al. (2016) to evaluate several popular datasets such as MC, RG and MEN -- the first two reportedly failed. I argue that this test is unstable, offers no added insight, and needs major revision in order to fulfill its purported goal.

