Let-Mi: An Arabic Levantine Twitter Dataset for Misogynistic Language

03/18/2021
by Hala Mulki, et al.

Online misogyny has become a growing concern for Arab women, who experience gender-based online abuse on a daily basis. Automatic misogyny detection systems can help curb anti-women toxic content in Arabic, but their development is hindered by the lack of Arabic misogyny benchmark datasets. In this paper, we introduce an Arabic Levantine Twitter dataset for Misogynistic language (Let-Mi), the first benchmark dataset for Arabic misogyny. We provide a detailed review of the dataset creation and annotation phases. The consistency of the annotations was evaluated using inter-rater agreement measures. Moreover, Let-Mi was used as an evaluation dataset in binary, multi-class, and target classification tasks conducted with several state-of-the-art machine learning systems, along with a Multi-Task Learning (MTL) configuration. The results indicate that the performance of these systems is consistent with state-of-the-art results for languages other than Arabic, and that employing MTL improved performance on the misogyny and target classification tasks.

research
04/05/2020

Arabic Offensive Language on Twitter: Analysis and Experiments

Detecting offensive language on Twitter has many applications ranging fr...
research
08/23/2018

Arap-Tweet: A Large Multi-Dialect Twitter Corpus for Gender, Age and Language Variety Identification

In this paper, we present Arap-Tweet, which is a large-scale and multi-d...
research
11/01/2020

ASAD: A Twitter-based Benchmark Arabic Sentiment Analysis Dataset

This paper provides a detailed description of a new Twitter-based benchm...
research
12/18/2020

A Benchmark Arabic Dataset for Commonsense Explanation

Language comprehension and commonsense knowledge validation by machines ...
research
05/24/2023

Advancements in Arabic Grammatical Error Detection and Correction: An Empirical Investigation

Grammatical error correction (GEC) is a well-explored problem in English...
research
05/12/2015

A Survey of Arabic Dialogues Understanding for Spontaneous Dialogues and Instant Message

Building dialogues systems interaction has recently gained considerable ...
research
01/03/2023

An ensemble-based framework for mispronunciation detection of Arabic phonemes

Determination of mispronunciations and ensuring feedback to users are ma...
