
How COVID-19 Is Changing Our Language: Detecting Semantic Shift in Twitter Word Embeddings

02/15/2021
by Yanzhu Guo, et al.

Words are malleable: their meanings shift under the influence of events, and these shifts are reflected in written text. Set against the global outbreak of COVID-19, our research aims to detect semantic shifts in social media language triggered by the health crisis. Using a large corpus of COVID-19-related tweets, we train separate word embedding models for different time periods after the outbreak. We employ an alignment-based approach to compare these embeddings with a general-purpose Twitter embedding unrelated to COVID-19. We also compare our trained embeddings with one another to observe diachronic evolution. Through case studies on a set of words chosen by topic detection, we verify that our alignment approach is valid. Finally, we quantify the size of the global semantic shift with a stability measure based on back-and-forth rotational alignment.
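To make the alignment step concrete, here is a minimal sketch of how two embedding matrices over a shared vocabulary might be rotated into the same space and compared word by word. It assumes orthogonal Procrustes as the rotation method, which is a common choice for this kind of alignment; the abstract does not name the exact procedure. The function names (`procrustes_rotation`, `semantic_shift`) and the row-per-word matrix layout are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def procrustes_rotation(source, target):
    """Return the orthogonal matrix R that minimizes ||source @ R - target||_F.

    Both inputs are (n_words, dim) arrays whose rows refer to the same
    words in the same order (an assumption of this sketch).
    """
    u, _, vt = np.linalg.svd(source.T @ target)
    return u @ vt


def cosine_rows(a, b):
    """Row-wise cosine similarity between two equally shaped matrices."""
    num = np.sum(a * b, axis=1)
    den = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    return num / den


def semantic_shift(source, target):
    """Per-word cosine distance after rotating source into target's space.

    Larger values suggest that a word's neighborhood differs more
    between the two embeddings.
    """
    rotation = procrustes_rotation(source, target)
    return 1.0 - cosine_rows(source @ rotation, target)
```

In use, `source` could hold vectors from a general-purpose Twitter embedding and `target` vectors from a model trained on one COVID-19 time period, both restricted to their common vocabulary; sorting words by `semantic_shift` then gives candidates for closer inspection. The stability measure based on back-and-forth rotational alignment is not reproduced here, since the abstract does not give enough detail to sketch it faithfully.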

01/09/2021

Eating Garlic Prevents COVID-19 Infection: Detecting Misinformation on the Arabic Content of Twitter

The rapid growth of social media content during the current pandemic pro...
11/08/2020

Detecting Emerging Symptoms of COVID-19 using Context-based Twitter Embeddings

In this paper, we present an iterative graph-based approach for the dete...
04/17/2021

Combating Temporal Drift in Crisis with Adapted Embeddings

Language usage changes over time, and this can impact the effectiveness ...
05/29/2018

Unsupervised detection of diachronic word sense evolution

Most words have several senses and connotations which evolve in time due...
12/12/2018

The Global Anchor Method for Quantifying Linguistic Shifts and Domain Adaptation

Language is dynamic, constantly evolving and adapting with respect to ti...
05/16/2019

Tracing cultural diachronic semantic shifts in Russian using word embeddings: test sets and baselines

The paper introduces manually annotated test sets for the task of tracin...
09/18/2018

Fighting Redundancy and Model Decay with Embeddings

Every day, hundreds of millions of new Tweets containing over 40 languag...