AlexU-AIC at Arabic Hate Speech 2022: Contrast to Classify

by Ahmad Shapiro, et al.

Online presence on social media platforms such as Facebook and Twitter has become a daily habit for internet users. Despite the vast number of services these platforms offer, users suffer from cyber-bullying, which leads to mental abuse and may escalate to physical harm to individuals or targeted groups. In this paper, we present our submission to the Arabic Hate Speech 2022 Shared Task Workshop (OSACT5 2022) using the associated Arabic Twitter dataset. The shared task consists of three sub-tasks. Sub-task A focuses on detecting whether a tweet is offensive or not. For offensive tweets, sub-task B focuses on detecting whether the tweet is hate speech or not. Finally, for hate speech tweets, sub-task C focuses on detecting the fine-grained type of hate speech among six different classes. Transformer models have proved their efficiency in classification tasks, but they tend to over-fit when fine-tuned on a small or imbalanced dataset. We address this limitation by investigating multiple training paradigms, such as contrastive learning and multi-task learning, along with classification fine-tuning and an ensemble of our top 5 performers. Our proposed solution achieved macro-averaged F1 scores of 0.841, 0.817, and 0.476 in sub-tasks A, B, and C, respectively.
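The reported scores are macro-averaged F1, which weights every class equally regardless of its frequency, so it is sensitive to the class imbalance the abstract mentions. The following is a minimal sketch of how such a score can be computed. It is not the shared task's official scorer, and the function name `macro_f1`, the label strings, and the toy predictions are all hypothetical, chosen only for illustration.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every class seen in y_true or y_pred."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()

    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1   # correct prediction for class t
        else:
            fp[p] += 1   # predicted p, but it was not p
            fn[t] += 1   # missed an instance of class t

    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 if the denominator is 0.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)

    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical toy labels, not drawn from the shared-task data.
y_true = ["OFF", "OFF", "NOT", "NOT", "NOT", "NOT"]
y_pred = ["OFF", "NOT", "NOT", "NOT", "NOT", "NOT"]
score = macro_f1(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

In this toy case the minority class has a lower F1 than the majority class, and because both classes count equally, the macro-averaged score falls below plain accuracy.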




