Rethinking Multimodal Content Moderation from an Asymmetric Angle with Mixed-modality

05/17/2023
by Jialin Yuan et al.

There is a rapidly growing need for multimodal content moderation (CM) as more and more content on social media is multimodal in nature. Existing unimodal CM systems may fail to catch harmful content whose meaning arises only across modalities (e.g., memes or videos), which can lead to severe consequences. In this paper, we present a novel CM model, Asymmetric Mixed-Modal Moderation (AM3), to target both multimodal and unimodal CM tasks. Specifically, to address the semantic asymmetry between vision and language, AM3 adopts a novel asymmetric fusion architecture designed not only to fuse the knowledge shared by the two modalities but also to exploit the information unique to each. Unlike previous works, which focus on fusing the two modalities while overlooking the intrinsic difference between the information conveyed multimodally and unimodally (the asymmetry in modalities), we propose a novel cross-modality contrastive loss that learns the knowledge appearing only in multimodality. This is critical because some harmful intent is conveyed only through the interplay of both modalities. With extensive experiments, we show that AM3 outperforms all existing state-of-the-art methods on both multimodal and unimodal CM benchmarks.
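To make the two ingredients named in the abstract concrete, below is a minimal, hypothetical sketch (not the authors' released code, whose exact architecture and loss are not specified here): a cross-attention fusion module that keeps separate unimodal pathways, and one plausible InfoNCE-style reading of a cross-modality contrastive objective. The class and function names (`AsymmetricFusion`, `cross_modality_contrastive_loss`), dimensions, and pairing scheme are all assumptions for illustration.

```python
# Hypothetical sketch of asymmetric fusion + a cross-modality contrastive loss,
# assuming pre-extracted image and text token features of equal dimension.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AsymmetricFusion(nn.Module):
    """Let vision queries attend to language keys/values (one asymmetric
    direction) while keeping pooled unimodal embeddings for each modality."""

    def __init__(self, dim: int = 256, num_heads: int = 4):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.proj = nn.Linear(dim, dim)

    def forward(self, img_tokens: torch.Tensor, txt_tokens: torch.Tensor):
        # Fused stream: image tokens query the text tokens.
        fused, _ = self.cross_attn(img_tokens, txt_tokens, txt_tokens)
        fused = self.proj(fused).mean(dim=1)   # pooled multimodal embedding
        img_uni = img_tokens.mean(dim=1)       # pooled unimodal vision embedding
        txt_uni = txt_tokens.mean(dim=1)       # pooled unimodal language embedding
        return fused, img_uni, txt_uni


def cross_modality_contrastive_loss(fused, img_uni, txt_uni, temperature=0.07):
    """Assumed InfoNCE-style objective: each fused embedding is contrasted
    against the unimodal embeddings of all samples, with its own image and
    text embeddings treated as positives."""
    fused = F.normalize(fused, dim=-1)
    uni = F.normalize(torch.cat([img_uni, txt_uni], dim=0), dim=-1)
    logits = fused @ uni.t() / temperature          # (B, 2B) similarity matrix
    batch = fused.size(0)
    targets_img = torch.arange(batch)               # positions of image positives
    targets_txt = targets_img + batch               # positions of text positives
    return 0.5 * (F.cross_entropy(logits, targets_img) +
                  F.cross_entropy(logits, targets_txt))


if __name__ == "__main__":
    img = torch.randn(8, 49, 256)   # e.g. 7x7 patch features
    txt = torch.randn(8, 32, 256)   # e.g. 32 token features
    fused, img_uni, txt_uni = AsymmetricFusion()(img, txt)
    print(cross_modality_contrastive_loss(fused, img_uni, txt_uni).item())
```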


