Hatemoji: A Test Suite and Adversarially-Generated Dataset for Benchmarking and Detecting Emoji-based Hate

by Hannah Rose Kirk, et al.
University of Oxford

Detecting online hate is a complex task, and low-performing models have harmful consequences when used for sensitive applications such as content moderation. Emoji-based hate is a key emerging challenge for automated detection. We present HatemojiCheck, a test suite of 3,930 short-form statements that allows us to evaluate performance on hateful language expressed with emoji. Using the test suite, we expose weaknesses in existing hate detection models. To address these weaknesses, we create the HatemojiTrain dataset using a human-and-model-in-the-loop approach. Models trained on these 5,912 adversarial examples perform substantially better at detecting emoji-based hate, while retaining strong performance on text-only hate. Both HatemojiCheck and HatemojiTrain are made publicly available.
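As a rough illustration of how a functional test suite of this kind can be scored, the sketch below groups labeled test cases by the functionality they probe and reports accuracy per group. It is a minimal sketch under stated assumptions, not the authors' evaluation code: the field names (`text`, `functionality`, `label`), the group names, the placeholder cases, and the `keyword_baseline` classifier are all hypothetical and do not reflect the actual HatemojiCheck schema.

```python
from collections import defaultdict


def accuracy_by_functionality(cases, predict):
    """Return {functionality: accuracy} for a list of labeled test cases.

    `cases` is a list of dicts with hypothetical keys "text",
    "functionality", and "label" (1 = hateful, 0 = not hateful).
    `predict` is any callable mapping a string to 0 or 1.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for case in cases:
        name = case["functionality"]
        totals[name] += 1
        if predict(case["text"]) == case["label"]:
            correct[name] += 1
    return {name: correct[name] / totals[name] for name in totals}


def keyword_baseline(text):
    """Toy stand-in for a real model: flags text containing one keyword."""
    return int("hateful" in text)


# Placeholder cases (assumed for illustration, not drawn from the dataset).
cases = [
    {"text": "placeholder hateful statement", "functionality": "group_a", "label": 1},
    {"text": "placeholder statement with emoji 🙂", "functionality": "group_a", "label": 1},
    {"text": "placeholder neutral statement", "functionality": "group_b", "label": 0},
]

scores = accuracy_by_functionality(cases, keyword_baseline)
for name, acc in sorted(scores.items()):
    print(f"{name}: {acc:.2f}")
```

Reporting accuracy per functionality rather than as a single aggregate makes it easier to see which kinds of input a model handles poorly, which is the sort of weakness a test suite is meant to expose.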



ToxiGen: A Large-Scale Machine-Generated Dataset for Adversarial and Implicit Hate Speech Detection

Toxic language detection systems often falsely flag text that contains m...

Semantic Evaluation for Text-to-SQL with Distilled Test Suites

We propose test suite accuracy to approximate semantic accuracy for Text...

GLTR: Statistical Detection and Visualization of Generated Text

The rapid improvement of language models has raised the specter of abuse...

COCO: The Large Scale Black-Box Optimization Benchmarking (bbob-largescale) Test Suite

The bbob-largescale test suite, containing 24 single-objective functions...

Detecting Offensive Content in Open-domain Conversations using Two Stage Semi-supervision

As open-ended human-chatbot interaction becomes commonplace, sensitive c...

HateCheck: Functional Tests for Hate Speech Detection Models

Detecting online hate is a difficult task that even state-of-the-art mod...

Benchmarking Specialized Databases for High-frequency Data

This paper presents a benchmarking suite designed for the evaluation and...
