Combining EfficientNet and Vision Transformers for Video Deepfake Detection

07/06/2021
by Davide Coccomini, et al.

Deepfakes are videos that have been digitally manipulated to look credible and deceive the viewer. They are produced with deep learning techniques based on autoencoders or GANs, which become more accessible and accurate every year, so the resulting fakes are very difficult to distinguish from real videos. Traditionally, CNNs have been used for deepfake detection, with the best results obtained by methods based on EfficientNet B7. In this study, we combine various types of Vision Transformers with a convolutional EfficientNet B0 used as a feature extractor, obtaining results comparable to some very recent methods that use Vision Transformers. Unlike the state-of-the-art approaches, we use neither distillation nor ensemble methods. The best model achieved an AUC of 0.951 and an F1 score of 88.0%, close to the state of the art on the DeepFake Detection Challenge (DFDC).
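The two reported metrics, AUC and F1, can be illustrated with a minimal pure-Python sketch. It is not the authors' evaluation code. The function names `roc_auc` and `f1_score`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration; none of the numbers come from the paper.

```python
from typing import Sequence


def f1_score(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """F1 for the positive (fake = 1) class after thresholding the scores.

    The 0.5 default threshold is an assumption, not a value from the paper.
    """
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random fake video scores higher than
    a random real one (ties count as half). Quadratic in the number of
    videos, which is fine for a small illustration."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data, made up for illustration: 1 = fake, 0 = real.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]

print(f"AUC = {roc_auc(toy_labels, toy_scores):.3f}")
print(f"F1  = {f1_score(toy_labels, toy_scores):.3f}")
```

AUC depends only on how the scores rank fake videos against real ones, whereas F1 depends on the chosen threshold, which is why the two are usually reported together.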


