Exploring Spatial-Temporal Features for Deepfake Detection and Localization

10/28/2022
by Wu Haiwei, et al.
As research on Deepfake forensics continues, recent studies have attempted to provide fine-grained localization of forgeries in addition to coarse video-level classification. However, the detection and localization performance of existing Deepfake forensic methods still has considerable room for improvement. In this work, we propose a Spatial-Temporal Deepfake Detection and Localization (ST-DDL) network that simultaneously explores spatial and temporal features for detecting and localizing forged regions. Specifically, we design a new Anchor-Mesh Motion (AMM) algorithm that extracts temporal (motion) features by modeling the precise geometric movements of facial micro-expressions. Compared with traditional motion extraction methods (e.g., optical flow), which are designed to model objects with large motion, the proposed AMM better captures small-displacement facial features. The temporal and spatial features are then fused in a Fusion Attention (FA) module based on a Transformer architecture for the final Deepfake forensic tasks. The superiority of our ST-DDL network is verified by experimental comparisons with several state-of-the-art competitors, in terms of both video-level detection and pixel-level localization performance. Furthermore, to promote the future development of Deepfake forensics, we build a public forgery dataset consisting of 6000 videos with several new features: it is produced with widely used commercial software (e.g., After Effects), it provides versions transmitted through online social networks, and it includes videos spliced from multiple sources. The source code and dataset are available at https://github.com/HighwayWu/ST-DDL.
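
To make the idea of attention-based fusion of spatial and temporal features more concrete, the following is a minimal, generic sketch of scaled dot-product cross-attention in plain Python. It is not the authors' FA module: the function name `cross_attention_fuse`, the choice of spatial tokens as queries and temporal tokens as keys and values, the residual addition, and the absence of learned projections are all assumptions made for illustration only.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention_fuse(spatial, temporal):
    """Fuse spatial tokens with temporal tokens (hypothetical helper).

    spatial:  list of d-dimensional vectors, used as queries (assumption).
    temporal: list of d-dimensional vectors, used as keys and values (assumption).
    Returns one fused vector per spatial token: the token itself plus the
    attention-weighted average of the temporal tokens (residual connection).
    No learned projection matrices are applied in this sketch.
    """
    d = len(spatial[0])
    scale = 1.0 / math.sqrt(d)
    fused = []
    for q in spatial:
        # Similarity of this spatial token to every temporal token.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in temporal]
        weights = softmax(scores)
        # Attention-weighted average of the temporal tokens.
        attended = [sum(w * v[j] for w, v in zip(weights, temporal)) for j in range(d)]
        fused.append([qi + ai for qi, ai in zip(q, attended)])
    return fused


# Toy inputs: two spatial tokens and three temporal (motion) tokens.
spatial_tokens = [[1.0, 0.0], [0.0, 1.0]]
temporal_tokens = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
fused_tokens = cross_attention_fuse(spatial_tokens, temporal_tokens)
```

In a real Transformer-based module the queries, keys, and values would pass through learned linear projections and multiple heads; the sketch only shows how each spatial token can weight temporal tokens by similarity before the two are combined.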


