An Audio-Visual Dataset and Deep Learning Frameworks for Crowded Scene Classification

12/16/2021
by Lam Pham, et al.

This paper presents an audio-visual scene classification (SC) task in which input videos are classified into one of five real-life crowded scenes: 'Riot', 'Noise-Street', 'Firework-Event', 'Music-Event', and 'Sport-Atmosphere'. To this end, we first collect an audio-visual dataset (videos) of these five crowded contexts from YouTube (in-the-wild scenes). We then propose a wide range of deep learning frameworks, each of which uses either audio or visual input data independently. Finally, results obtained from the high-performing deep learning frameworks are fused to achieve the best accuracy score. Our experimental results indicate that audio and visual input factors independently contribute to the SC task's performance. Notably, an ensemble of deep learning frameworks exploring either audio or visual input data achieves the best accuracy of 95.7%.
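The abstract does not say how the individual results are fused, so the following is only a minimal sketch of one common late-fusion scheme: averaging the per-class probabilities produced by separate audio and visual models and taking the highest-scoring class. The function names (`mean_fusion`, `predict_scene`) and the probability values are hypothetical and are not taken from the paper.

```python
# Minimal late-fusion sketch. Mean fusion is an ASSUMPTION here;
# the abstract does not specify the fusion rule actually used.

# The five scene labels named in the abstract.
SCENES = ["Riot", "Noise-Street", "Firework-Event", "Music-Event", "Sport-Atmosphere"]


def mean_fusion(prob_vectors):
    """Average per-class probabilities from several models."""
    if not prob_vectors:
        raise ValueError("need at least one probability vector")
    n_classes = len(prob_vectors[0])
    if any(len(p) != n_classes for p in prob_vectors):
        raise ValueError("all probability vectors must have the same length")
    n_models = len(prob_vectors)
    return [sum(p[c] for p in prob_vectors) / n_models for c in range(n_classes)]


def predict_scene(prob_vectors, labels=SCENES):
    """Return the label with the highest fused probability, plus the fused scores."""
    fused = mean_fusion(prob_vectors)
    best = max(range(len(fused)), key=fused.__getitem__)
    return labels[best], fused


# Hypothetical model outputs for one video (illustrative values only).
audio_probs = [0.10, 0.15, 0.05, 0.60, 0.10]
visual_probs = [0.05, 0.10, 0.05, 0.30, 0.50]

label, fused = predict_scene([audio_probs, visual_probs])
print(label, fused)
```

In this made-up case the visual model alone would pick 'Sport-Atmosphere', but the audio model's stronger evidence shifts the fused decision to 'Music-Event', which illustrates how the two modalities can complement each other.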

research
06/12/2021

Deep Learning Frameworks Applied For Audio-Visual Scene Classification

In this paper, we present deep learning frameworks for audio-visual scen...
research
01/09/2022

An Ensemble of Deep Learning Frameworks Applied For Predicting Respiratory Anomalies

In this paper, we evaluate various deep learning frameworks for detectin...
research
06/13/2022

Low-complexity deep learning frameworks for acoustic scene classification

In this report, we present low-complexity deep learning frameworks for ...
research
05/28/2021

Audio-visual scene classification: analysis of DCASE 2021 Challenge submissions

This paper presents the details of the Audio-Visual Scene Classification...
research
04/30/2023

Deep Learning Based Multimodal with Two-phase Training Strategy for Daily Life Video Classification

In this paper, we present a deep learning based multimodal system for cl...
research
02/10/2022

Audio-Based Deep Learning Frameworks for Detecting COVID-19

This paper evaluates a wide range of audio-based deep learning framework...
research
12/21/2017

Wolf in Sheep's Clothing - The Downscaling Attack Against Deep Learning Applications

This paper considers security risks buried in the data processing pipeli...
