Leveraging Transformers for Hate Speech Detection in Conversational Code-Mixed Tweets

12/18/2021
by Zaki Mustafa Farooqi, et al.

In the current era of the internet, where social media platforms are easily accessible to everyone, people often have to deal with threats, identity attacks, hate, and bullying because of their association with a caste, creed, gender, or religion, or even because they accept or reject a particular notion. Existing work in hate speech detection primarily focuses on individual comment classification as a sequence labeling task and often fails to consider the context of the conversation. That context often plays a substantial role in determining the author's intent and the sentiment behind a tweet. This paper describes the system proposed by team MIDAS-IIITD for HASOC 2021 subtask 2, one of the first shared tasks focusing on detecting hate speech in Hindi-English code-mixed conversations on Twitter. We approach this problem using neural networks, leveraging the transformer's cross-lingual embeddings and further fine-tuning them for low-resource hate speech classification in transliterated Hindi text. Our best-performing system, a hard voting ensemble of Indic-BERT, XLM-RoBERTa, and Multilingual BERT, achieved a macro F1 score of 0.7253, placing us first on the overall leaderboard.
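The abstract names two concrete mechanisms: hard voting across three fine-tuned models and evaluation by macro F1. The following is a minimal sketch of both in plain Python, not the authors' implementation. The function names `hard_vote` and `macro_f1`, the label encoding (1 for hateful or offensive, 0 otherwise), the tie-breaking rule, and the toy predictions are all assumptions made for illustration; the per-model label lists stand in for the outputs of the fine-tuned classifiers.

```python
from collections import Counter
from typing import List, Sequence


def hard_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Majority vote over models; predictions[m][i] is model m's label for example i.

    Ties (impossible with three models and two labels) are broken in favor
    of the earliest-listed model whose label is among the most voted.
    This tie-breaking rule is an assumption, not taken from the paper.
    """
    if not predictions:
        raise ValueError("need predictions from at least one model")
    n = len(predictions[0])
    if any(len(p) != n for p in predictions):
        raise ValueError("all models must label the same number of examples")
    voted = []
    for votes in zip(*predictions):  # one tuple of votes per example
        counts = Counter(votes)
        top = max(counts.values())
        voted.append(next(v for v in votes if counts[v] == top))
    return voted


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy, made-up predictions standing in for the three fine-tuned models.
indic_bert = [1, 0, 1, 1]
xlm_roberta = [1, 1, 0, 1]
mbert = [0, 0, 1, 1]
gold = [1, 0, 0, 1]

ensemble = hard_vote([indic_bert, xlm_roberta, mbert])
score = macro_f1(gold, ensemble)
```

Because macro F1 averages the per-class scores without weighting by class frequency, a model that neglects the minority class is penalized more than it would be under accuracy, which is a common reason for using it on imbalanced hate speech data.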


research
10/18/2021

Contextual Hate Speech Detection in Code Mixed Text using Transformer Based Approaches

In the recent past, social media platforms have helped people in connect...
research
10/21/2020

LT3 at SemEval-2020 Task 9: Cross-lingual Embeddings for Sentiment Analysis of Hinglish Social Media Text

This paper describes our contribution to the SemEval-2020 Task 9 on Sent...
research
05/28/2021

Cisco at SemEval-2021 Task 5: What's Toxic?: Leveraging Transformers for Multiple Toxic Span Extraction from Online Comments

Social network platforms are generally used to share positive, construct...
research
04/19/2022

Optimize_Prime@DravidianLangTech-ACL2022: Abusive Comment Detection in Tamil

This paper tries to address the problem of abusive comment detection in ...
research
02/28/2021

NLP-CUET@DravidianLangTech-EACL2021: Offensive Language Detection from Multilingual Code-Mixed Text using Transformers

The increasing accessibility of the internet facilitated social media us...
research
05/12/2021

Multilingual Offensive Language Identification for Low-resource Languages

Offensive content is pervasive in social media and a reason for concern ...
research
10/12/2019

VAIS Hate Speech Detection System: A Deep Learning based Approach for System Combination

Nowadays, Social network sites (SNSs) such as Facebook, Twitter are comm...
