Contextualized language models for semantic change detection: lessons learned

08/31/2022
by Andrey Kutuzov, et al.

We present a qualitative analysis of the (potentially erroneous) outputs of contextualized embedding-based methods for detecting diachronic semantic change. First, we introduce an ensemble method that outperforms previously described contextualized approaches. We then use this method as the basis for an in-depth analysis of the degrees of semantic change predicted for English words across five decades. Our findings show that contextualized methods often predict high change scores for words that are not undergoing any real diachronic semantic shift in the lexicographic sense of the term (or whose shifts are at least of questionable status). We discuss such challenging cases in detail, with examples, and propose a linguistic categorization for them. We conclude that pre-trained contextualized language models are prone to confounding changes in lexicographic senses with changes in contextual variance. This tendency stems naturally from their distributional nature, but it differs from the types of issues observed in methods based on static embeddings. Additionally, these models often conflate syntactic and semantic aspects of lexical entities. We propose a range of possible future solutions to these issues.
