Neologisms on Facebook

04/13/2018
by Nikita Muravyev, et al.

In this paper, we present a study of neologisms and loanwords that occur frequently in Facebook user posts. We analyzed a dataset of several million publicly available posts written during 2006-2013 by Russian-speaking Facebook users. From these, we built a vocabulary of the most frequent lemmatized words missing from the OpenCorpora dictionary, the assumption being that many such words have entered common use only recently. This assumption is certainly not true for all the words extracted in this way; for that reason, we manually filtered the automatically obtained list to exclude non-Russian or incorrectly lemmatized words, as well as words recorded by other dictionaries or occurring in texts from the Russian National Corpus. The result is a list of 168 words that can potentially be considered neologisms. We present an attempt at an etymological classification of these neologisms (unsurprisingly, most of them have recently been borrowed from English, but there are also quite a few new words composed of previously borrowed stems) and identify various derivational patterns. We also classify the words into several large thematic areas, with "internet", "marketing", and "multimedia" among those containing the largest number of words. We believe that these classifications, together with the word base collected in the process, can serve as a starting point for further studies of neologisms and of the lexical processes that lead to their acceptance into the mainstream language.
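The automatic extraction step described above (count lemmas across posts, keep the frequent ones absent from a reference dictionary) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `candidate_neologisms`, the toy lemma table, and the tiny `known_lemmas` set are all hypothetical stand-ins for a real lemmatizer and the OpenCorpora word list, and the manual filtering stage is not modeled at all.

```python
import re
from collections import Counter

# Cyrillic word tokens, optionally hyphenated (an assumption about tokenization).
TOKEN_RE = re.compile(r"[а-яё]+(?:-[а-яё]+)*")


def candidate_neologisms(posts, known_lemmas, lemmatize, top_n=10):
    """Return the most frequent lemmas that are absent from known_lemmas.

    posts        -- iterable of post texts
    known_lemmas -- set of dictionary lemmas (stand-in for OpenCorpora)
    lemmatize    -- callable mapping a lowercase token to its lemma
    """
    counts = Counter()
    for post in posts:
        for token in TOKEN_RE.findall(post.lower()):
            lemma = lemmatize(token)
            if lemma not in known_lemmas:
                counts[lemma] += 1
    return counts.most_common(top_n)


# Hypothetical toy lemmatizer: a lookup table instead of a real morphological analyzer.
TOY_LEMMAS = {
    "лайкнул": "лайкнуть",
    "лайкнула": "лайкнуть",
    "репосты": "репост",
    "друзья": "друг",
    "читали": "читать",
}


def toy_lemmatize(token):
    return TOY_LEMMAS.get(token, token)


# Hypothetical miniature "dictionary" and corpus, for illustration only.
known = {"я", "она", "друг", "читать"}
posts = ["Я лайкнул репост", "Она лайкнула репосты", "Друзья читали"]

candidates = candidate_neologisms(posts, known, toy_lemmatize)
print(candidates)
```

On real data the output of such a step would still contain typos, foreign words, and lemmatization errors, which is why the abstract describes a subsequent manual filtering pass against other dictionaries and the Russian National Corpus.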
