Towards Understanding of Deepfake Videos in the Wild

by Beomsang Cho, et al.

Deepfakes have become a growing concern in recent years, prompting researchers to develop benchmark datasets and detection algorithms to tackle the issue. However, existing datasets suffer from significant drawbacks that hamper their effectiveness. Notably, they fail to include the latest deepfake videos produced by state-of-the-art methods and shared across various platforms. This limitation makes it difficult to keep pace with the rapid evolution of the generative AI techniques used in real-world deepfake production. In this IRB-approved study, we aim to bridge this knowledge gap by providing an in-depth analysis of current real-world deepfakes. First, we present RWDF-23, the largest, most diverse, and most recent deepfake dataset collected from the wild to date. It consists of 2,000 deepfake videos collected from 4 platforms (Reddit, YouTube, TikTok, and Bilibili), covering 4 different languages and originating from 21 countries. By expanding the dataset's scope beyond that of previous research, we capture a broader range of real-world deepfake content, reflecting the ever-evolving landscape of online platforms. Second, we conduct a comprehensive analysis of various aspects of deepfakes, including creators, manipulation strategies, purposes, and real-world content production methods. This yields insights into the nuances and characteristics of deepfakes in different contexts. Lastly, in addition to the video content, we collect viewer comments and interactions, enabling us to explore how internet users engage with deepfake content. By considering this rich contextual information, we aim to provide a holistic understanding of the evolving deepfake phenomenon and its impact on online platforms.




