Watch Those Words: Video Falsification Detection Using Word-Conditioned Facial Motion

by Shruti Agarwal, et al.

In today's era of digital misinformation, we are increasingly faced with new threats posed by video falsification techniques. Such falsifications range from cheapfakes (e.g., lookalikes or audio dubbing) to deepfakes (e.g., sophisticated AI media synthesis methods), which are becoming perceptually indistinguishable from real videos. To tackle this challenge, we propose a multi-modal semantic forensic approach that looks for clues beyond discrepancies in visual quality, thereby handling both simpler cheapfakes and visually persuasive deepfakes. Our goal is to verify that the person seen in the video is who they are purported to be, by detecting anomalous correspondences between their facial movements and the words they are saying. We leverage the idea of attribution to learn person-specific biometric patterns that distinguish a given speaker from others. We use interpretable Action Units (AUs), as opposed to deep CNN visual features, to capture a person's face and head movements, and we are the first to use word-conditioned facial motion analysis. Unlike existing person-specific approaches, our method is also effective against attacks that focus on lip manipulation. We further demonstrate our method's effectiveness on a range of fakes not seen in training, including those without video manipulation, which were not addressed in prior work.
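The idea of conditioning facial motion on spoken words can be illustrated with a toy sketch. This is not the authors' model: it assumes AU intensities have already been extracted and aligned to a word-level transcript, and it substitutes a simple per-word mean profile with Euclidean distance for whatever learned representation the paper uses. The function names `build_word_profiles` and `anomaly_score`, the two-dimensional AU vectors, and the sample values are all hypothetical.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean


def build_word_profiles(samples):
    """Average the AU vectors observed for each word.

    samples: iterable of (word, au_vector) pairs taken from authentic
    footage of one speaker (hypothetical input format).
    """
    grouped = defaultdict(list)
    for word, au_vector in samples:
        grouped[word].append(au_vector)
    # One mean AU vector per word, computed column by column.
    return {
        word: [mean(column) for column in zip(*vectors)]
        for word, vectors in grouped.items()
    }


def anomaly_score(profiles, clip):
    """Mean Euclidean distance between a clip's AU vectors and the
    speaker's per-word profile. Words without a profile are skipped;
    returns None if no word in the clip has a profile.
    """
    distances = []
    for word, au_vector in clip:
        reference = profiles.get(word)
        if reference is None:
            continue
        distances.append(
            sqrt(sum((a - r) ** 2 for a, r in zip(au_vector, reference)))
        )
    return mean(distances) if distances else None


# Made-up reference data for one speaker (two AUs per word).
reference = [
    ("hello", [0.8, 0.1]),
    ("hello", [0.6, 0.3]),
    ("thanks", [0.2, 0.9]),
]
profiles = build_word_profiles(reference)

genuine = [("hello", [0.7, 0.2]), ("thanks", [0.2, 0.9])]
suspect = [("hello", [0.1, 0.9]), ("thanks", [0.9, 0.1])]

genuine_score = anomaly_score(profiles, genuine)
suspect_score = anomaly_score(profiles, suspect)
```

In this toy setup, a clip whose facial motion departs from the speaker's usual motion for the same words receives a higher score, regardless of whether the pixels themselves were manipulated. A real system would need a decision threshold and far richer temporal features than a single vector per word.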



