Hateful Memes Detection via Complementary Visual and Linguistic Networks

12/09/2020
by Weibo Zhang, et al.

Hateful memes are widespread on social media and convey negative information. The main challenge of hateful memes detection is that the expressive meaning cannot be well recognized from a single modality. To better integrate modal information, we investigate a candidate solution based on complementary visual and linguistic networks for the Hateful Memes Challenge 2020, so that more comprehensive multi-modal information can be explored in detail. Both contextual-level and sensitive object-level information are considered in the visual and linguistic embeddings to model complex multi-modal scenarios. Specifically, a pre-trained classifier and an object detector are utilized to obtain contextual features and regions of interest (RoIs) from the input, followed by position-representation fusion to form the visual embedding. The linguistic embedding is composed of three components, i.e., the sentence word embedding, the position embedding, and the corresponding Spacy embedding (Sembedding), a symbolic representation built from a vocabulary extracted with Spacy. Both visual and linguistic embeddings are fed into the designed Complementary Visual and Linguistic (CVL) networks to produce the prediction for hateful memes. Experimental results on the Hateful Memes Challenge Dataset demonstrate that CVL provides decent performance, producing scores of 78.48 and 72.95. The code is available at https://github.com/webYFDT/hateful.
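The abstract only outlines the embedding design at a high level; the sketch below is one plausible reading of it rather than the authors' implementation. The module names, all dimensions, and the choice of summation/concatenation as fusion operators are assumptions; only the ingredients (contextual features from a pre-trained classifier, detector RoI features with box positions, and word/position/Spacy embeddings) come from the abstract.

```python
# Minimal sketch (not the authors' code) of the two embedding branches
# described in the abstract. Dimensions and fusion operators are assumed.
import torch
import torch.nn as nn

class VisualEmbedding(nn.Module):
    """Fuse contextual image features, RoI features, and box positions."""
    def __init__(self, ctx_dim=2048, roi_dim=2048, hidden=768):
        super().__init__()
        self.ctx_proj = nn.Linear(ctx_dim, hidden)   # pre-trained classifier features
        self.roi_proj = nn.Linear(roi_dim, hidden)   # object-detector RoI features
        self.pos_proj = nn.Linear(4, hidden)         # normalized box coordinates

    def forward(self, ctx_feat, roi_feats, roi_boxes):
        ctx = self.ctx_proj(ctx_feat).unsqueeze(1)                   # (B, 1, H)
        rois = self.roi_proj(roi_feats) + self.pos_proj(roi_boxes)   # (B, R, H)
        return torch.cat([ctx, rois], dim=1)                         # (B, 1+R, H)

class LinguisticEmbedding(nn.Module):
    """Word + position + Spacy-derived (Sembedding) lookups, summed per token."""
    def __init__(self, vocab=30522, max_len=128, n_spacy=64, hidden=768):
        super().__init__()
        self.word = nn.Embedding(vocab, hidden)
        self.pos = nn.Embedding(max_len, hidden)
        self.spacy = nn.Embedding(n_spacy, hidden)   # ids from a Spacy-extracted vocabulary

    def forward(self, token_ids, spacy_ids):
        positions = torch.arange(token_ids.size(1), device=token_ids.device)
        return self.word(token_ids) + self.pos(positions) + self.spacy(spacy_ids)
```

In the paper's pipeline, both embedding sequences would then be passed to the CVL networks to produce the hateful/non-hateful prediction; that head is omitted here because the abstract does not specify its structure.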
