What do you mean, BERT? Assessing BERT as a Distributional Semantics Model

11/13/2019
by Timothée Mickus, et al.

Contextualized word embeddings, i.e., vector representations for words in context, are naturally seen as an extension of previous non-contextual distributional semantic models. In this work, we focus on BERT, a deep neural network that produces contextualized embeddings and has set the state of the art in several semantic tasks, and study the semantic coherence of its embedding space. While showing a tendency towards coherence, BERT does not fully live up to the natural expectations for a semantic vector space. In particular, we find that the position of the sentence in which a word occurs, although it has no meaning correlate, leaves a noticeable trace on the word embeddings and disturbs similarity relationships.
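
To make the reported effect concrete, below is a minimal probe sketch, not the authors' experimental code: it embeds the same sentence once as the first and once as the second segment of a BERT input pair and compares the contextual vectors obtained for a target word across the two runs. The HuggingFace transformers library, the bert-base-uncased checkpoint, the example sentences, and the target word "dog" are all illustrative assumptions.

```python
# A minimal probe sketch, not the authors' experimental code. It embeds the
# same sentence once as segment A and once as segment B of a BERT input pair
# and compares the contextual vector of a target word across the two runs.
# Assumes the HuggingFace `transformers` library and `bert-base-uncased`.
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

context = "The weather was nice."           # filler sentence (illustrative)
target_sentence = "The dog barked loudly."  # sentence containing the target

def embed_target(first, second, target_word="dog"):
    """Return the last-layer vector of `target_word` within a sentence pair."""
    enc = tokenizer(first, second, return_tensors="pt")
    with torch.no_grad():
        out = model(**enc)
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0].tolist())
    idx = tokens.index(target_word)  # locate the target token in the pair
    return out.last_hidden_state[0, idx]

# Same word, same sentence; only the sentence's position in the pair changes.
v_first = embed_target(target_sentence, context)   # target sentence first
v_second = embed_target(context, target_sentence)  # target sentence second

sim = torch.cosine_similarity(v_first, v_second, dim=0).item()
print(f"Cosine similarity of 'dog' across sentence positions: {sim:.4f}")
```

If sentence position carried no information, the two vectors would be (near-)identical and the similarity close to 1; a markedly lower value is the kind of positional trace the abstract describes.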


Related research

07/04/2015
AutoExtend: Extending Word Embeddings to Embeddings for Synsets and Lexemes
We present AutoExtend, a system to learn embeddings for synsets and lexe...

12/30/2020
Deriving Contextualised Semantic Features from BERT (and Other Transformer Model) Embeddings
Models based on the transformer architecture, such as BERT, have marked ...

03/19/2022
From meaning to perception – exploring the space between word and odor perception embeddings
In this paper we propose the use of the Word2vec algorithm in order to o...

10/21/2022
Discovering Differences in the Representation of People using Contextualized Semantic Axes
A common paradigm for identifying semantic differences across social and...

05/29/2023
A Method for Studying Semantic Construal in Grammatical Constructions with Interpretable Contextual Embedding Spaces
We study semantic construal in grammatical constructions using large lan...

02/14/2023
A Psycholinguistic Analysis of BERT's Representations of Compounds
This work studies the semantic representations learned by BERT for compo...

04/17/2021
Frequency-based Distortions in Contextualized Word Embeddings
How does word frequency in pre-training data affect the behavior of simi...
