Audio-Visual Person-of-Interest DeepFake Detection

04/06/2022
by   Davide Cozzolino, et al.
Face manipulation technology is advancing rapidly, and new methods are proposed almost daily. The aim of this work is to propose a deepfake detector that can cope with the wide variety of manipulation methods and scenarios encountered in the real world. Our key insight is that each person has specific biometric characteristics that a synthetic generator is unlikely to reproduce. Accordingly, we extract high-level audio-visual biometric features that characterize the identity of a person, and use them to create a person-of-interest (POI) deepfake detector. We leverage a contrastive learning paradigm to learn the moving-face and audio segment embeddings that are most discriminative for each identity. As a result, when the video and/or audio of a person is manipulated, its representation in the embedding space becomes inconsistent with the real identity, allowing reliable detection. Training is carried out exclusively on real talking-face videos, so the detector does not depend on any specific manipulation method and yields high generalization ability. In addition, our method can detect both single-modality (audio-only, video-only) and multi-modality (audio-video) attacks and, by building only on high-level semantic features, is robust to low-quality or corrupted videos. Experiments on a wide variety of datasets confirm that our method achieves state-of-the-art (SOTA) performance, with an average improvement in terms of AUC of around 3 respectively.
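The detection idea in the abstract (a manipulated clip lands far from the real identity in the embedding space) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the embeddings are taken as already computed by some encoder, the names `poi_score` and `is_fake` are hypothetical, and the choices of squared Euclidean distance on normalized vectors, nearest-reference matching, minimum aggregation over segments, and the threshold value are ours rather than the paper's.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so distances depend only on direction."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def poi_score(test_segments, reference_segments):
    """Score how inconsistent a test video is with a person of interest.

    Both arguments are lists of embedding vectors (one per segment).
    Each test segment is matched to its nearest reference segment; the
    smallest of those distances is returned (an illustrative choice).
    Higher scores mean the test video looks less like the real identity.
    """
    if not test_segments or not reference_segments:
        raise ValueError("both segment lists must be non-empty")
    refs = [l2_normalize(r) for r in reference_segments]
    nearest = []
    for seg in test_segments:
        seg = l2_normalize(seg)
        nearest.append(min(squared_distance(seg, r) for r in refs))
    return min(nearest)


def is_fake(test_segments, reference_segments, threshold=0.5):
    """Flag the video when its score exceeds a (hypothetical) threshold."""
    return poi_score(test_segments, reference_segments) > threshold


# Toy 2-D embeddings standing in for real audio-visual features.
reference = [[1.0, 0.0], [0.9, 0.1]]   # pristine videos of the person
consistent = [[1.0, 0.05]]             # close to the reference identity
inconsistent = [[0.0, 1.0]]            # far from the reference identity

print(is_fake(consistent, reference))
print(is_fake(inconsistent, reference))
```

Because the score only needs pristine reference videos of the person, a scheme of this shape requires no examples of any particular manipulation method, which matches the training setup described in the abstract. In practice the threshold would have to be calibrated on held-out real videos.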

research
09/07/2021

Evaluation of an Audio-Video Multimodal Deepfake Dataset using Unimodal and Multimodal Detectors

Significant advancements made in the generation of deepfakes have caused...
research
08/11/2021

FakeAVCeleb: A Novel Audio-Video Multimodal Deepfake Dataset

While significant advancements have been made in the generation of deepf...
research
12/04/2020

ID-Reveal: Identity-aware DeepFake Video Detection

State-of-the-art DeepFake forgery detectors are trained in a supervised ...
research
12/21/2021

Watch Those Words: Video Falsification Detection Using Word-Conditioned Facial Motion

In today's era of digital misinformation, we are increasingly faced with...
research
09/28/2022

Deepfake audio detection by speaker verification

Thanks to recent advances in deep learning, sophisticated generation too...
research
03/30/2023

Diff-ID: An Explainable Identity Difference Quantification Framework for DeepFake Detection

Despite the fact that DeepFake forgery detection algorithms have achieve...
research
09/09/2022

Learning Audio-Visual embedding for Person Verification in the Wild

It has already been observed that audio-visual embedding is more robust ...
