Is this Harmful? Learning to Predict Harmfulness Ratings from Video

06/15/2021
by   Johan Edstedt, et al.
0

Automatically identifying harmful content in video is an important task with a wide range of applications. However, due to the difficulty of collecting high-quality labels as well as demanding computational requirements, the task has not had a satisfying general approach. Typically, only small subsets of the problem are considered, such as identifying violent content. In cases where the general problem is tackled, rough approximations and simplifications are made to deal with the lack of labels and computational complexity. In this work, we identify and tackle the two main obstacles. First, we create a dataset of approximately 4000 video clips, annotated by professionals in the field. Secondly, we demonstrate that advances in video recognition enable training models on our dataset that consider the full context of the scene. We conduct an in-depth study on our modeling choices and find that we greatly benefit from combining the visual and audio modality and that pretraining on large-scale video recognition datasets and class balanced sampling further improves performance. We additionally perform a qualitative study that reveals the heavily multi-modal nature of our dataset. Our dataset will be made available upon publication.

READ FULL TEXT

page 1

page 11

research
04/02/2023

Video Pretraining Advances 3D Deep Learning on Chest CT Tasks

Pretraining on large natural image classification datasets such as Image...
research
09/12/2021

MovieCuts: A New Dataset and Benchmark for Cut Type Recognition

Understanding movies and their structural patterns is a crucial task to ...
research
10/13/2021

NoisyActions2M: A Multimedia Dataset for Video Understanding from Noisy Labels

Deep learning has shown remarkable progress in a wide range of problems....
research
09/15/2023

AV-MaskEnhancer: Enhancing Video Representations through Audio-Visual Masked Autoencoder

Learning high-quality video representation has shown significant applica...
research
12/27/2022

Audiovisual Database with 360 Video and Higher-Order Ambisonics Audio for Perception, Cognition, Behavior, and QoE Evaluation Research

Research into multi-modal perception, human cognition, behavior, and att...
research
06/13/2019

Grounding Object Detections With Transcriptions

A vast amount of audio-visual data is available on the Internet thanks t...

Please sign up or login with your details

Forgot password? Click here to reset