On Explaining Multimodal Hateful Meme Detection Models

04/04/2022
by   Ming Shan Hee, et al.

Hateful meme detection is a new multimodal task that has gained significant traction in academic and industry research communities. Recently, researchers have applied pre-trained visual-linguistic models to this multimodal classification task, and some of these solutions have yielded promising results. However, what these visual-linguistic models learn for hateful meme classification remains unclear. For instance, it is unclear whether these models can capture derogatory or slur references across the modalities (i.e., image and text) of hateful memes. To fill this research gap, this paper proposes three research questions to improve our understanding of visual-linguistic models performing the hateful meme classification task. We find that the image modality contributes more to hateful meme classification, and that visual-linguistic models can perform visual-text slur grounding to a certain extent. Our error analysis also shows that the visual-linguistic models have acquired biases that result in false-positive predictions.
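To illustrate the kind of modality-contribution analysis the abstract describes, below is a minimal, hypothetical sketch of a modality-ablation probe: zero out one modality's features in a toy late-fusion classifier and measure how much the score drops. The function names, weights, and the late-fusion form are illustrative assumptions, not the paper's actual method or model.

```python
# Hypothetical modality-ablation probe (illustrative only; the weights,
# features, and late-fusion form are assumptions, not the paper's model).

def fused_score(image_feat, text_feat, w_img, w_txt, bias=0.0):
    """Toy late-fusion score: weighted sum over image and text features."""
    score = bias
    score += sum(w * x for w, x in zip(w_img, image_feat))
    score += sum(w * x for w, x in zip(w_txt, text_feat))
    return score

def ablate(image_feat, text_feat, w_img, w_txt, drop):
    """Rescore with one modality ('image' or 'text') zeroed out."""
    img = [0.0] * len(image_feat) if drop == "image" else image_feat
    txt = [0.0] * len(text_feat) if drop == "text" else text_feat
    return fused_score(img, txt, w_img, w_txt)

def contribution(image_feat, text_feat, w_img, w_txt, modality):
    """A modality's contribution = full score minus score without it."""
    full = fused_score(image_feat, text_feat, w_img, w_txt)
    return full - ablate(image_feat, text_feat, w_img, w_txt, modality)
```

For example, with `w_img=[0.8, 0.5]`, `w_txt=[0.2]`, and unit features, the image contribution (1.3) exceeds the text contribution (0.2), mirroring the shape of the paper's finding that the image modality contributes more; real analyses would of course ablate learned features in a trained visual-linguistic model rather than toy weights.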

Related research

- Disentangling Hate in Online Memes (08/09/2021): Hateful and offensive content detection has been extensively explored in...
- Enhance Multimodal Transformer With External Label And In-Domain Pretrain: Hateful Meme Challenge Winning Solution (12/15/2020): Hateful meme detection is a new research area recently brought out that ...
- MMIU: Dataset for Visual Intent Understanding in Multimodal Assistants (10/13/2021): In multimodal assistant, where vision is also one of the input modalitie...
- Toward Multimodal Modeling of Emotional Expressiveness (08/31/2020): Emotional expressiveness captures the extent to which a person tends to ...
- Shifting the Baseline: Single Modality Performance on Visual Navigation & QA (11/01/2018): Language-and-vision navigation and question answering (QA) are exciting ...
- Does language help generalization in vision models? (04/16/2021): Vision models trained on multimodal datasets have recently proved very e...
- Quantifying the Roles of Visual, Linguistic, and Visual-Linguistic Complexity in Verb Acquisition (04/05/2023): Children typically learn the meanings of nouns earlier than the meanings...
