HateCheckHIn: Evaluating Hindi Hate Speech Detection Models

04/30/2022
by Mithun Das, et al.

Due to the sheer volume of online hate, the AI and NLP communities have started building models to detect such hateful content. Multilingual hate, in which social media conversations are code-mixed or use more than one language, has recently emerged as a major challenge for automated detection. Typically, hate speech detection models are evaluated by measuring their performance on held-out test data using metrics such as accuracy and F1-score. While these metrics are useful, they make it difficult to identify where a model is failing and how to resolve the failure. To enable more targeted diagnostic insights into such multilingual hate speech models, we introduce a set of functionalities for the purpose of evaluation. The design of these functionalities is inspired by real-world conversations on social media. Taking Hindi as the base language, we craft test cases for each functionality. We name our evaluation dataset HateCheckHIn. To illustrate the utility of these functionalities, we test the state-of-the-art transformer-based m-BERT model and the Perspective API.
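As a rough illustration of why per-functionality test cases give more diagnostic insight than a single aggregate score, the following minimal sketch groups test cases by functionality and reports accuracy for each group. Everything in it is an assumption made for illustration: the functionality names, the placeholder texts, the labels, and the `toy_predict` stand-in classifier are hypothetical and are not taken from HateCheckHIn, m-BERT, or the Perspective API.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Tuple

# (functionality name, text, gold label); all values below are hypothetical placeholders
TestCase = Tuple[str, str, str]


def accuracy_by_functionality(
    cases: Iterable[TestCase], predict: Callable[[str], str]
) -> Dict[str, float]:
    """Return the fraction of correctly classified cases for each functionality."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for functionality, text, gold in cases:
        total[functionality] += 1
        if predict(text) == gold:
            correct[functionality] += 1
    return {name: correct[name] / total[name] for name in total}


def toy_predict(text: str) -> str:
    """Stand-in classifier (assumption): flags any text containing the placeholder token."""
    return "hateful" if "<SLUR>" in text else "non-hateful"


# Hypothetical test cases: two functionalities, two cases each.
cases = [
    ("explicit_hate", "example with <SLUR>", "hateful"),
    ("explicit_hate", "another <SLUR> example", "hateful"),
    ("counter_speech_quote", "quoting '<SLUR>' to condemn it", "non-hateful"),
    ("counter_speech_quote", "calling someone <SLUR> is wrong", "non-hateful"),
]

scores = accuracy_by_functionality(cases, toy_predict)
overall = sum(toy_predict(text) == gold for _, text, gold in cases) / len(cases)

for name, score in scores.items():
    print(f"{name}: {score:.2f}")
print(f"overall: {overall:.2f}")
```

In this toy setup the overall accuracy is 0.5, which by itself does not say what went wrong. The per-functionality breakdown shows that the keyword-based stand-in handles the explicit cases but fails every case in which the flagged token is quoted in order to condemn it.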
