WOLI at SemEval-2020 Task 12: Arabic Offensive Language Identification on Different Twitter Datasets

09/11/2020
by Yasser Otiefy, et al.

Communicating through social platforms has become one of the principal means of personal communication and interaction. Unfortunately, healthy communication is often disrupted by offensive language that can have damaging effects on users. A key to fighting offensive language on social media is an automatic offensive language detection system. This paper presents the results and the main findings of SemEval-2020 Task 12, OffensEval Sub-task A (Zampieri et al., 2020), on identifying and categorising offensive language in social media. The task was based on the Arabic OffensEval dataset (Mubarak et al., 2020). We describe the system submitted by WideBot AI Lab for the shared task, which ranked 10th out of 52 participants with a Macro-F1 of 86.9 under the username "yasserotiefy". We experimented with various models; the best is a linear SVM that uses a combination of character and word n-grams. We also introduce a neural network approach, comprising CNN, highway network, Bi-LSTM, and attention layers, that enhanced the predictive ability of our system.
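As a rough illustration of what combining character and word n-grams can look like, the sketch below builds one sparse feature dictionary from both views of a text. It is not the authors' implementation: the function names, the n-gram ranges (word 1-2, character 2-4), the "w:"/"c:" key prefixes, and the English example sentence are all assumptions chosen for readability, and the abstract does not state the ranges or weighting actually used. Features of this kind would typically be weighted and passed to a linear classifier such as an SVM.

```python
from collections import Counter


def word_ngrams(text, n_min=1, n_max=2):
    """Yield word n-grams for every n in [n_min, n_max] (ranges are assumed)."""
    tokens = text.split()
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])


def char_ngrams(text, n_min=2, n_max=4):
    """Yield character n-grams for every n in [n_min, n_max] (ranges are assumed)."""
    # Collapse runs of whitespace so n-grams spanning word boundaries are stable.
    normalized = " ".join(text.split())
    for n in range(n_min, n_max + 1):
        for i in range(len(normalized) - n + 1):
            yield normalized[i:i + n]


def combined_features(text):
    """Count word and character n-grams in a single sparse feature dictionary."""
    features = Counter()
    # Prefixes keep the two feature spaces from colliding (e.g. word "so" vs. chars "so").
    features.update("w:" + gram for gram in word_ngrams(text))
    features.update("c:" + gram for gram in char_ngrams(text))
    return features


# Hypothetical example input; the shared task itself used Arabic tweets.
feats = combined_features("you are so rude")
```

Character n-grams tend to be robust to spelling variation and obfuscation, while word n-grams capture whole offensive terms and short phrases, which is one plausible reason to combine the two.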

research
09/22/2020

Ghmerti at SemEval-2019 Task 6: A Deep Word- and Character-based Approach to Offensive Language Identification

This paper presents the models submitted by Ghmerti team for subtasks A ...
research
06/12/2020

SemEval-2020 Task 12: Multilingual Offensive Language Identification in Social Media (OffensEval 2020)

We present the results and main findings of SemEval-2020 Task 12 on Mult...
research
04/17/2019

Amobee at SemEval-2019 Tasks 5 and 6: Multiple Choice CNN Over Contextual Embedding

This article describes Amobee's participation in "HatEval: Multilingual ...
research
06/16/2022

Deep Multi-Task Models for Misogyny Identification and Categorization on Arabic Social Media

The prevalence of toxic content on social media platforms, such as hate ...
research
04/19/2019

Identifying Offensive Posts and Targeted Offense from Twitter

In this paper we present our approach and the system description for Sub...
research
10/15/2019

Language Identification on Massive Datasets of Short Message using an Attention Mechanism CNN

Language Identification (LID) is a challenging task, especially when the...
research
08/10/2020

Question Identification in Arabic Language Using Emotional Based Features

With the growth of content on social media networks, enterprises and ser...