Self-Supervised Euphemism Detection and Identification for Content Moderation

03/31/2021
by Wanzheng Zhu, et al.

Fringe groups and organizations have a long history of using euphemisms (ordinary-sounding words with a secret meaning) to conceal what they are discussing. Nowadays, one common use of euphemisms is to evade content moderation policies enforced by social media platforms. Existing tools for enforcing policy automatically rely on keyword searches for words on a "ban list", but these are notoriously imprecise: even when limited to swearwords, they can still cause embarrassing false positives. When a commonly used ordinary word acquires a euphemistic meaning, adding it to a keyword-based ban list is hopeless: consider "pot" (storage container or marijuana?) or "heater" (household appliance or firearm?). The current generation of social media companies instead hire staff to check posts manually, but this is expensive, inhumane, and not much more effective. It is usually apparent to a human moderator that a word is being used euphemistically, but they may not know what the secret meaning is, and therefore whether the message violates policy. Also, when a euphemism is banned, the group that used it need only invent another one, leaving moderators one step behind. This paper demonstrates unsupervised algorithms that, by analyzing words in their sentence-level context, can both detect words being used euphemistically and identify the secret meaning of each word. Compared to the existing state of the art, which uses context-free word embeddings, our algorithm for detecting euphemisms achieves 30-400% higher detection accuracies of unlabeled euphemisms in a text corpus. Our algorithm for revealing euphemistic meanings of words is the first of its kind, as far as we are aware. In the arms race between content moderators and policy evaders, our algorithms may help shift the balance in the direction of the moderators.
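
To illustrate why keyword ban lists are imprecise for words like "pot" and "heater", here is a minimal Python sketch. It is not the paper's algorithm: the ban list, the function name `flag_by_keyword`, and the example posts are all hypothetical, chosen only to show that a context-free match cannot separate the ordinary sense of a word from the euphemistic one.

```python
import re

# Hypothetical ban list, using the two ambiguous words named in the abstract.
BAN_LIST = {"pot", "heater"}


def flag_by_keyword(post, ban_list=BAN_LIST):
    """Return the banned words found in a post, ignoring context entirely."""
    # Lowercase and split into simple word tokens.
    tokens = re.findall(r"[a-z']+", post.lower())
    # A token is flagged purely because it appears on the list.
    return sorted({t for t in tokens if t in ban_list})


# Hypothetical example posts: the first is benign, the second may not be.
posts = [
    "Put the soup back in the pot and turn on the heater.",
    "Anyone selling pot near campus?",
]

for post in posts:
    print(flag_by_keyword(post), "<-", post)
```

Both posts are flagged, although only the second plausibly uses "pot" euphemistically. Telling the two apart requires looking at the surrounding sentence, which is the kind of sentence-level context the abstract says the proposed algorithms analyze.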


