Ceasing hate withMoH: Hate Speech Detection in Hindi-English Code-Switched Language

by   Arushi Sharma, et al.

Social media has become a bedrock for people to voice their opinions worldwide. Due to the greater sense of freedom with the anonymity feature, it is possible to disregard social etiquette online and attack others without facing severe consequences, inevitably propagating hate speech. The current measures to sift the online content and offset the hatred spread do not go far enough. One factor contributing to this is the prevalence of regional languages in social media and the paucity of language flexible hate speech detectors. The proposed work focuses on analyzing hate speech in Hindi-English code-switched language. Our method explores transformation techniques to capture precise text representation. To contain the structure of data and yet use it with existing algorithms, we developed MoH or Map Only Hindi, which means "Love" in Hindi. MoH pipeline consists of language identification, Roman to Devanagari Hindi transliteration using a knowledge base of Roman Hindi words. Finally, it employs the fine-tuned Multilingual Bert and MuRIL language models. We conducted several quantitative experiment studies on three datasets and evaluated performance using Precision, Recall, and F1 metrics. The first experiment studies MoH mapped text's performance with classical machine learning models and shows an average increase of 13 compares the proposed work's scores with those of the baseline models and offers a rise in performance by 6 technique with various data simulations using the existing transliteration library. Here, MoH outperforms the rest by 15 significant improvement in the state-of-the-art scores on all three datasets.


page 18

page 24

page 42


Role of Artificial Intelligence in Detection of Hateful Speech for Hinglish Data on Social Media

Social networking platforms provide a conduit to disseminate our ideas, ...

One to rule them all: Towards Joint Indic Language Hate Speech Detection

This paper is a contribution to the Hate Speech and Offensive Content Id...

Leveraging Language Identification to Enhance Code-Mixed Text Classification

The usage of more than one language in the same text is referred to as C...

Multilingual Hate Speech and Offensive Content Detection using Modified Cross-entropy Loss

The number of increased social media users has led to a lot of people mi...

An Online Multilingual Hate speech Recognition System

The exponential increase in the use of the Internet and social media ove...

The Enemy Among Us: Detecting Hate Speech with Threats Based 'Othering' Language Embeddings

Offensive or antagonistic language targeted at individuals and social gr...

Countering Online Hate Speech: An NLP Perspective

Online hate speech has caught everyone's attention from the news related...

Please sign up or login with your details

Forgot password? Click here to reset