Linguistic Taboos and Euphemisms in Nepali

07/27/2020
by Nobal B. Niraula, et al.

Languages across the world have words, phrases, and behaviors (the taboos) that speakers avoid in public communication because they are considered obscene or disturbing to the social, religious, and ethical values of society. Nevertheless, people deliberately use these linguistic taboos and other language constructs to make hurtful, derogatory, and obscene comments. It is nearly impossible to construct a universal set of offensive or taboo terms because offensiveness is determined by factors such as socio-physical setting, speaker-listener relationship, and word choice. In this paper, we present a detailed corpus-based study of offensive language in Nepali. We identify and describe more than 18 categories of linguistic offenses, including politics, religion, race, and sex. We discuss 12 common types of euphemism, such as synonym, metaphor, and circumlocution. In addition, we introduce a manually constructed data set of over 1000 offensive and taboo terms popular among contemporary speakers. This in-depth study and the accompanying resource provide a foundation for several downstream tasks such as offensive language detection and language learning.

