Beyond Equal-Length Snippets: How Long is Sufficient to Recognize an Audio Scene?

11/02/2018
by Oliver Y. Chén, et al.

Because audio scenes vary in their characteristics, some can naturally be recognized earlier, i.e. after a shorter duration, than others. In this work, rather than using equal-length snippets for all scene categories, as is common in the literature, we study how long a snippet must be before an audio scene can be reliably recognized. For modelling, in addition to two single-network systems, one based on a convolutional neural network and one on a recurrent neural network, we also investigate early fusion and late fusion of these two networks for audio scene classification. Moreover, as model fusion is prevalent in audio scene classifiers, we further study whether and when model fusion is really necessary for this task.
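
To make the terms in the abstract concrete, the following is a minimal sketch of what early fusion, late fusion, and a per-length evaluation could look like. It is not the authors' implementation: the abstract does not state the fusion rule, so the weighted average in `late_fusion`, the plain concatenation in `early_fusion`, and all function and parameter names here are assumptions chosen for illustration.

```python
from typing import Dict, List, Sequence


def late_fusion(p_cnn: Sequence[float], p_rnn: Sequence[float], w: float = 0.5) -> List[float]:
    """Combine two class-probability vectors after each network has made its
    own prediction. A weighted average is assumed; other rules are possible."""
    if len(p_cnn) != len(p_rnn):
        raise ValueError("both networks must score the same set of classes")
    return [w * a + (1.0 - w) * b for a, b in zip(p_cnn, p_rnn)]


def early_fusion(f_cnn: Sequence[float], f_rnn: Sequence[float]) -> List[float]:
    """Combine intermediate feature vectors before classification, so that a
    single classifier sees the joint representation. Concatenation is assumed."""
    return list(f_cnn) + list(f_rnn)


def argmax(scores: Sequence[float]) -> int:
    """Index of the highest-scoring class."""
    return max(range(len(scores)), key=lambda i: scores[i])


def accuracy_by_length(
    probs_by_length: Dict[float, List[Sequence[float]]],
    labels: Sequence[int],
) -> Dict[float, float]:
    """Accuracy for each snippet length (in seconds), given per-snippet class
    probabilities for the same labelled recordings at every length."""
    result = {}
    for length, probs in probs_by_length.items():
        correct = sum(argmax(p) == y for p, y in zip(probs, labels))
        result[length] = correct / len(labels)
    return result


if __name__ == "__main__":
    # Hypothetical scores for two recordings and two scene classes.
    fused = late_fusion([0.6, 0.4], [0.2, 0.8])
    print(argmax(fused))
    print(accuracy_by_length(
        {1.0: [[0.6, 0.4], [0.7, 0.3]], 2.0: [[0.9, 0.1], [0.2, 0.8]]},
        labels=[0, 1],
    ))
```

Comparing the per-length accuracy of a single network against that of the fused system is one way to examine, for each scene category, whether fusion helps at a given snippet duration.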
