Exploring multi-task multi-lingual learning of transformer models for hate speech and offensive speech identification in social media

by Sudhanshu Mishra, et al.

Hate speech has become a major content moderation issue for online social media platforms. Given the volume and velocity of online content production, it is impossible to manually moderate hate-speech-related content on any platform. In this paper, we use a multi-task and multi-lingual approach based on recently proposed Transformer neural networks to solve three sub-tasks for hate speech. These sub-tasks were part of the 2019 shared task on hate speech and offensive content (HASOC) identification in Indo-European languages. We expand on our submission to that competition by using multi-task models trained with three approaches: a) multi-task learning with separate task heads, b) back-translation, and c) multi-lingual training. Finally, we investigate the performance of the various models and identify instances where the Transformer-based models perform differently and better. We show that different combinations of these approaches yield models that generalize easily across languages and tasks, trading off slight accuracy (in some cases) for a much-reduced inference-time compute cost. We open source an updated version of our HASOC 2019 code with the new improvements at https://github.com/socialmediaie/MTML_HateSpeech.
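
The idea of multi-task learning with separate task heads can be illustrated with a small sketch. This is not the authors' implementation: it is a plain-Python toy in which a single tanh layer stands in for the shared Transformer encoder, and the class name `MultiTaskClassifier`, the task keys `"A"`, `"B"`, `"C"`, and the per-task class counts are all hypothetical choices made for illustration. It shows one plausible reason such a setup can lower inference cost: the shared representation is computed once and reused by every head.

```python
import math
import random


def linear(weights, bias, x):
    """Compute weights @ x + bias for a list-of-lists weight matrix."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(rng, n_out, n_in):
    """Small random weights and zero biases for one linear layer."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


class MultiTaskClassifier:
    """Shared encoder followed by one independent linear head per task."""

    def __init__(self, input_dim, hidden_dim, task_num_classes, seed=0):
        rng = random.Random(seed)
        # Assumption: a single tanh layer stands in for the shared Transformer encoder.
        self.encoder = init_layer(rng, hidden_dim, input_dim)
        # One separate classification head per task (hypothetical task names).
        self.heads = {
            task: init_layer(rng, n_classes, hidden_dim)
            for task, n_classes in task_num_classes.items()
        }

    def encode(self, features):
        weights, bias = self.encoder
        return [math.tanh(h) for h in linear(weights, bias, features)]

    def predict(self, features):
        # The shared representation is computed once and reused by all heads.
        shared = self.encode(features)
        return {
            task: softmax(linear(weights, bias, shared))
            for task, (weights, bias) in self.heads.items()
        }


# Hypothetical setup: three sub-tasks with illustrative class counts.
model = MultiTaskClassifier(input_dim=4, hidden_dim=8,
                            task_num_classes={"A": 2, "B": 3, "C": 2})
probs = model.predict([0.5, -1.0, 0.25, 2.0])
for task, distribution in probs.items():
    print(task, [round(p, 3) for p in distribution])
```

In a real system the stand-in encoder would be a pretrained Transformer and the heads would be trained jointly, typically by summing the per-task losses. The sketch only covers the forward pass.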




