WiC: 10,000 Example Pairs for Evaluating Context-Sensitive Representations
By design, word embeddings are unable to model the dynamic nature of words' semantics, i.e., the property of words to correspond to potentially different meanings depending on the context in which they appear. To address this limitation, dozens of specialized word embedding techniques have been proposed. However, despite the popularity of research on this topic, very few evaluation benchmarks exist that specifically focus on the dynamic semantics of words. In this paper we show that existing models have surpassed the performance ceiling of the de facto standard dataset, namely Stanford Contextual Word Similarity. To address the lack of a suitable benchmark, we put forward a large-scale Word in Context dataset, called WiC, based on annotations curated by experts, for generic evaluation of context-sensitive word embeddings. WiC is available at https://pilehvar.github.io/wic/.