Pitfalls in the Evaluation of Sentence Embeddings

06/04/2019
by Steffen Eger, et al.

Deep learning models continuously break new records across different NLP tasks. At the same time, their success exposes weaknesses in model evaluation. Here, we compile several key pitfalls in the evaluation of sentence embeddings, a currently very popular NLP paradigm. These pitfalls include comparing embeddings of different sizes, normalizing embeddings, and the low (and diverging) correlations between transfer and probing tasks. Our motivation is to challenge the current evaluation of sentence embeddings and to provide an easy-to-access reference for future research. Based on our insights, we also recommend best practices for future evaluations of sentence embeddings.
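To make the normalization pitfall concrete, here is a toy sketch (not from the paper itself) showing how L2 normalization changes similarity rankings: an unnormalized dot product favors embeddings with large magnitude, while cosine similarity, which equals the dot product of L2-normalized vectors, ignores magnitude entirely. The vectors below are illustrative stand-ins for real sentence embeddings.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length (L2 norm of 1)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(u, v):
    """Plain dot product: sensitive to vector magnitude."""
    return sum(a * b for a, b in zip(u, v))

def cosine(u, v):
    """Cosine similarity: dot product after removing magnitude."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))

# Two toy "sentence embeddings" with very different magnitudes,
# compared against a query embedding q.
a = [3.0, 0.0]   # long vector, pointing away from q
b = [0.5, 0.5]   # short vector, perfectly aligned with q
q = [1.0, 1.0]

# Unnormalized dot product ranks a above b purely because of its length...
print(dot(q, a), dot(q, b))                 # 3.0 vs. 1.0

# ...while cosine similarity ranks b first.
print(round(cosine(q, a), 3), cosine(q, b))  # 0.707 vs. 1.0

# Cosine similarity is exactly the dot product of the normalized vectors,
# so whether embeddings were normalized before evaluation changes results.
print(dot(l2_normalize(q), l2_normalize(a)))  # same as cosine(q, a)
```

This is the kind of detail the abstract warns about: two evaluations of the same embeddings can disagree simply because one normalizes and the other does not.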
