Memes in the Wild: Assessing the Generalizability of the Hateful Memes Challenge Dataset

07/09/2021
by Hannah Rose Kirk, et al.

Hateful memes pose a unique challenge for current machine learning systems because their message is derived from both the text and visual modalities. To this end, Facebook released the Hateful Memes Challenge, a dataset of memes with pre-extracted text captions, but it is unclear whether these synthetic examples generalize to 'memes in the wild'. In this paper, we collect hateful and non-hateful memes from Pinterest to evaluate the out-of-sample performance of models pre-trained on the Facebook dataset. We find that memes in the wild differ in two key aspects: 1) captions must be extracted via OCR, which injects noise and diminishes the performance of multimodal models, and 2) memes are more diverse than 'traditional memes', including screenshots of conversations or text on a plain background. This paper thus serves as a reality check for the current benchmark of hateful meme detection and its applicability to detecting real-world hate.

