Misinformation Detection on YouTube Using Video Captions

07/02/2021
by Raj Jagtap, et al.

Millions of people use platforms such as YouTube, Facebook, Twitter, and other mass media. Because these platforms are so accessible, they are often used to establish a narrative, conduct propaganda, and disseminate misinformation. This work proposes an approach that uses state-of-the-art NLP techniques to extract features from video captions (subtitles). To evaluate our approach, we utilize a publicly accessible, labeled dataset for classifying videos as misinformation or not. The motivation for exploring video captions stems from our analysis of video metadata: attributes such as the number of views, likes, dislikes, and comments are ineffective because videos are hard to differentiate using this information. Using the caption dataset, the proposed models can classify videos into three classes (Misinformation, Debunking Misinformation, and Neutral) with an F1-score of 0.85 to 0.90. To emphasize the relevance of the misinformation class, we reformulate the classification problem as a two-class problem: Misinformation vs. others (Debunking Misinformation and Neutral). In our experiments, the proposed models can classify videos with an F1-score of 0.92 to 0.95 and an AUC ROC of 0.78 to 0.90.
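The two-class reformulation described above can be illustrated with a minimal sketch. It assumes labels are stored as plain strings matching the class names in the abstract; the function names `to_binary` and `f1_score` and the toy labels are hypothetical and are not taken from the paper. The abstract does not state which F1 averaging the authors report, so the sketch computes F1 for the Misinformation class only.

```python
from typing import List, Sequence

# Hypothetical label string, mirroring a class name used in the abstract.
MISINFORMATION = "Misinformation"


def to_binary(labels: Sequence[str]) -> List[int]:
    """Collapse three-class labels to 1 (Misinformation) or 0 (others)."""
    return [1 if label == MISINFORMATION else 0 for label in labels]


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 for the positive class (label 1): harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example (made-up labels, not data from the paper).
true_labels = ["Misinformation", "Debunking Misinformation", "Neutral",
               "Misinformation", "Neutral", "Misinformation"]
pred_labels = ["Misinformation", "Neutral", "Neutral",
               "Debunking Misinformation", "Misinformation", "Misinformation"]

score = f1_score(to_binary(true_labels), to_binary(pred_labels))
```

Note that confusing Debunking Misinformation with Neutral (the second video in the toy data) is no longer an error after the labels are collapsed, which is one reason two-class scores can be higher than three-class scores on the same predictions.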


research
01/26/2022

An Exploration of Captioning Practices and Challenges of Individual Content Creators on YouTube for People with Hearing Impairments

Deaf and Hard-of-Hearing (DHH) audiences have long complained about capt...
research
05/06/2023

HateMM: A Multi-Modal Dataset for Hate Video Classification

Hate speech has become one of the most significant issues in modern soci...
research
08/27/2018

Approach for Video Classification with Multi-label on YouTube-8M Dataset

Video traffic is increasing at a considerable rate due to the spread of ...
research
02/10/2022

The MeLa BitChute Dataset

In this paper we present a near-complete dataset of over 3M videos from ...
research
08/09/2019

Interactive Variance Attention based Online Spoiler Detection for Time-Sync Comments

Nowadays, time-sync comment (TSC), a new form of interactive comments, h...
research
01/26/2021

Efficient video integrity analysis through container characterization

Most video forensic techniques look for traces within the data stream th...
research
03/07/2023

At Your Fingertips: Extracting Piano Fingering Instructions from Videos

Piano fingering – knowing which finger to use to play each note in a mus...
