Enhance Multimodal Transformer With External Label And In-Domain Pretrain: Hateful Meme Challenge Winning Solution

12/15/2020
by   Ron Zhu, et al.

Hateful meme detection is a newly introduced research area that requires both visual and linguistic understanding of the meme, as well as background knowledge, to perform well. This technical report summarises the first-place solution of the Hateful Memes Challenge 2020, which extends state-of-the-art visual-linguistic transformers to tackle the problem. At the end of the report, we also point out the shortcomings of the current methodology and possible directions for improvement.
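
The title names two enhancements: external labels and in-domain pretraining. As a rough illustration of the first idea (a minimal sketch, not the authors' released code), externally detected tags can be appended to the meme caption before tokenization, so the transformer attends to them jointly with the image features. The external tagger and the example tags below are assumptions for illustration:

```python
# Hypothetical sketch: inject externally detected labels into the text stream
# of a visual-linguistic transformer. The tags would come from an external
# detector (e.g. an off-the-shelf vision API); they are assumed here.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

def build_text_input(meme_text: str, external_labels: list[str]) -> dict:
    """Append external tags (entities, demographics, ...) to the meme caption
    so the model can reason over caption and tags together."""
    augmented = meme_text + " [SEP] " + " ".join(external_labels)
    return tokenizer(augmented, truncation=True, max_length=128,
                     return_tensors="pt")

inputs = build_text_input(
    "look how many people love you",
    ["tombstone", "graveyard"],  # tags from an assumed external detector
)
# inputs["input_ids"] would feed the text side of a visual-linguistic
# transformer, alongside region features from an object detector.
```

The second idea, in-domain pretraining, would amount to continuing the transformer's self-supervised objectives (e.g. masked language modelling) on the meme corpus itself before fine-tuning for binary hateful/non-hateful classification.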

