Twitter-COMMs: Detecting Climate, COVID, and Military Multimodal Misinformation

12/16/2021
by   Giscard Biamby, et al.

Detecting out-of-context media, such as "miscaptioned" images on Twitter, often requires detecting inconsistencies between the two modalities. This paper describes our approach to the Image-Text Inconsistency Detection challenge of the DARPA Semantic Forensics (SemaFor) Program. First, we collect Twitter-COMMs, a large-scale multimodal dataset with 884k tweets relevant to the topics of Climate Change, COVID-19, and Military Vehicles. We train our approach, based on the state-of-the-art CLIP model, leveraging automatically generated random and hard negatives. Our method is then tested on a hidden human-generated evaluation set. We achieve the best result on the program leaderboard, with 11% detection improvement in a high precision regime over a zero-shot CLIP baseline.
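
The idea of automatically generated random and hard negatives can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the toy two-dimensional vectors standing in for CLIP text embeddings, and the rule of picking the most similar caption from another tweet as the hard negative are all assumptions made for illustration.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def random_negatives(pairs, seed=0):
    """Pair each image with the caption of a different, randomly chosen tweet.

    Needs at least two pairs, otherwise there is nothing to swap with.
    """
    rng = random.Random(seed)
    negatives = []
    for i, (image_id, _) in enumerate(pairs):
        j = rng.choice([k for k in range(len(pairs)) if k != i])
        negatives.append((image_id, pairs[j][1]))
    return negatives


def hard_negatives(pairs, text_embeddings):
    """Pair each image with the most similar caption from another tweet.

    Assumption: similarity between caption embeddings is the hardness
    criterion; the paper may define hard negatives differently.
    """
    negatives = []
    for i, (image_id, _) in enumerate(pairs):
        others = [k for k in range(len(pairs)) if k != i]
        best = max(others, key=lambda k: cosine(text_embeddings[i], text_embeddings[k]))
        negatives.append((image_id, pairs[best][1]))
    return negatives


# Hypothetical toy data: (image id, caption) pairs and stand-in embeddings.
pairs = [
    ("img_a", "Flooded street after the storm"),
    ("img_b", "Flood waters rise downtown"),
    ("img_c", "Tank convoy on the highway"),
]
toy_embeddings = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]]

random_mismatches = random_negatives(pairs)
hard_mismatches = hard_negatives(pairs, toy_embeddings)
```

In this toy setup the two flood captions are swapped with each other as hard negatives, which gives a mismatch that is topically plausible and therefore harder to detect than a random one.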
