FakeAVCeleb: A Novel Audio-Video Multimodal Deepfake Dataset

08/11/2021
by Hasam Khalid, et al.

While significant advancements have been made in the generation of deepfakes using deep learning technologies, their misuse is now a well-known issue. Deepfakes can cause severe security and privacy problems because they can be used to impersonate a person in a video by replacing his or her face with another person's face. Recently, a new problem has emerged: synthesizing a person's voice, where deep learning models can reproduce any person's voice from just a few seconds of audio. With the emerging threat of impersonation attacks using deepfake audio and video, a new generation of deepfake detectors is needed that focuses on video and audio jointly. Developing a competent deepfake detector typically requires a large amount of good-quality data that captures real-world scenarios. Existing deepfake datasets contain either deepfake videos or deepfake audio, but not both, and they are racially biased as well. Hence, there is a crucial need for a good-quality dataset containing both video and audio deepfakes, which can be used to detect audio and video deepfakes simultaneously. To fill this gap, we propose a novel Audio-Video Deepfake dataset (FakeAVCeleb) that contains not only deepfake videos but also the corresponding synthesized, lip-synced fake audio. We generate this dataset using the most popular current deepfake generation methods. We selected real YouTube videos of celebrities from four racial backgrounds (Caucasian, Black, East Asian, and South Asian) to develop a more realistic multimodal dataset that addresses racial bias and further helps develop multimodal deepfake detectors. We performed several experiments using state-of-the-art detection methods to evaluate our deepfake dataset and demonstrate the challenges and usefulness of our multimodal Audio-Video deepfake dataset.


research
09/07/2021

Evaluation of an Audio-Video Multimodal Deepfake Dataset using Unimodal and Multimodal Detectors

Significant advancements made in the generation of deepfakes have caused...
research
02/24/2020

Audio-driven Talking Face Video Generation with Learning-based Personalized Head Pose

Real-world talking faces often accompany with natural head movement. How...
research
04/06/2022

Audio-Visual Person-of-Interest DeepFake Detection

Face manipulation technology is advancing very rapidly, and new methods ...
research
02/25/2023

Why Do Deepfake Detectors Fail?

Recent rapid advancements in deepfake technology have allowed the creati...
research
09/27/2019

Celeb-DF: A New Dataset for DeepFake Forensics

AI-synthesized face swapping videos, commonly known as the DeepFakes, ha...
research
01/08/2023

Deepfake CAPTCHA: A Method for Preventing Fake Calls

Deep learning technology has made it possible to generate realistic cont...
research
11/21/2020

Stochastic Talking Face Generation Using Latent Distribution Matching

The ability to envisage the visual of a talking face based just on heari...
