
How to evaluate word embeddings? On importance of data efficiency and simple supervised tasks

by Stanisław Jastrzębski, et al.
Jagiellonian University

Perhaps the single most important goal of representation learning is to make subsequent learning faster. Surprisingly, this is not well reflected in the way embeddings are evaluated. In addition, recent practice in word embeddings points towards the importance of learning specialized representations. We argue that the focus of word representation evaluation should reflect these trends and shift towards evaluating what useful information is easily accessible. Specifically, we propose that evaluation should focus on data efficiency and simple supervised tasks, where the amount of available data is varied and the scores of a supervised model are reported for each subset (as is commonly done in transfer learning). To illustrate the significance of such analysis, we present a comprehensive evaluation of selected word embeddings. The proposed approach yields a more complete picture and brings new insight into performance characteristics. For instance, information about word similarity or analogy tends to be non-linearly encoded in the embedding space, which calls into question cosine-based, unsupervised evaluation methods. All results and analysis scripts are available online.
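The data-efficiency protocol described in the abstract (vary the amount of training data, report a supervised model's score for each subset) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' evaluation code: the "embeddings" are synthetic random vectors, the task is a made-up binary classification problem, and the helper names (`fit_linear_classifier`, `accuracy`, `data_efficiency_curve`) are hypothetical.

```python
import numpy as np


def fit_linear_classifier(X, y, l2=1e-2):
    """Ridge-regularized least-squares classifier for labels in {-1, +1}."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append a bias column
    A = Xb.T @ Xb + l2 * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ y)


def accuracy(w, X, y):
    """Fraction of examples whose predicted sign matches the label."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return float(np.mean(np.sign(Xb @ w) == y))


def data_efficiency_curve(X_train, y_train, X_test, y_test, sizes,
                          n_repeats=5, seed=0):
    """Mean test accuracy for each training-subset size in `sizes`."""
    rng = np.random.default_rng(seed)
    curve = {}
    for n in sizes:
        scores = []
        for _ in range(n_repeats):
            # Draw a random training subset of size n without replacement.
            idx = rng.choice(len(X_train), size=n, replace=False)
            w = fit_linear_classifier(X_train[idx], y_train[idx])
            scores.append(accuracy(w, X_test, y_test))
        curve[n] = float(np.mean(scores))
    return curve


# Synthetic stand-in for word vectors (assumption: real embeddings and a
# real labelled task would be loaded here instead).
rng = np.random.default_rng(42)
dim = 50
direction = rng.normal(size=dim)
X = rng.normal(size=(2000, dim))
y = np.where(X @ direction > 0, 1.0, -1.0)

sizes = [20, 100, 500, 1500]
curve = data_efficiency_curve(X[:1500], y[:1500], X[1500:], y[1500:], sizes)
for n, acc in curve.items():
    print(f"train size {n:5d}: mean test accuracy {acc:.3f}")
```

Reporting the whole curve, rather than a single score at full data, is what lets two embeddings be compared on how quickly a simple model can extract the relevant information from them.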





Code Repositories


Package for evaluating word embeddings
