Impact of Acoustic Event Tagging on Scene Classification in a Multi-Task Learning Framework

by Rahil Parikh, et al.

Acoustic events are sounds with well-defined spectro-temporal characteristics that can be associated with the physical objects generating them. Acoustic scenes are collections of such acoustic events in no specific temporal order. Given this natural linkage between events and scenes, a common belief is that the ability to classify events must help in the classification of scenes. This has led to several efforts to do well on both Acoustic Event Tagging (AET) and Acoustic Scene Classification (ASC) using a multi-task network. However, in these efforts, improvement in one task does not guarantee improvement in the other, suggesting a tension between ASC and AET. It is unclear whether improvements in AET translate to improvements in ASC. We explore this conundrum through an extensive empirical study and show that, under certain conditions, using AET as an auxiliary task in the multi-task network consistently improves ASC performance. Additionally, ASC performance improves further as the AET data-set size grows and is not sensitive to the choice of events or the number of events in the AET data-set. We conclude that this improvement in ASC performance comes from the regularization effect of using AET and not from the network's improved ability to discern between acoustic events.
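To make the idea of AET as an auxiliary task concrete, the following is a minimal sketch of a combined multi-task objective. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the loss functions or how the two tasks are weighted. The sketch assumes ASC is single-label (softmax cross-entropy), AET is multi-label (per-event sigmoid binary cross-entropy), and the two are combined with a hypothetical scalar weight `aet_weight`. All function and parameter names are invented for illustration.

```python
import math


def softmax_cross_entropy(logits, target_index):
    """Cross-entropy for single-label scene classification (assumed ASC loss)."""
    m = max(logits)
    # Log-sum-exp with the max subtracted for numerical stability.
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target_index]


def sigmoid_binary_cross_entropy(logits, targets):
    """Mean binary cross-entropy for multi-label event tagging (assumed AET loss)."""
    total = 0.0
    for z, y in zip(logits, targets):
        # Stable form of -[y*log(s) + (1-y)*log(1-s)] with s = sigmoid(z).
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return total / len(logits)


def multitask_loss(asc_logits, asc_target, aet_logits, aet_targets, aet_weight=0.5):
    """Primary ASC loss plus a weighted auxiliary AET loss.

    aet_weight is a hypothetical hyperparameter; the abstract gives no value.
    """
    asc = softmax_cross_entropy(asc_logits, asc_target)
    aet = sigmoid_binary_cross_entropy(aet_logits, aet_targets)
    return asc + aet_weight * aet
```

In this sketch, setting `aet_weight` to 0 recovers a plain single-task ASC objective, so the auxiliary term can be viewed as an additional penalty on the shared representation, which is one way to read the regularization effect the abstract describes.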




How Information on Acoustic Scenes and Sound Events Mutually Benefits Event Detection and Scene Classification Tasks

Acoustic scene classification (ASC) and sound event detection (SED) are ...

Cross-task pre-training for acoustic scene classification

Acoustic scene classification (ASC) and acoustic event detection (AED) are...

Relation-guided acoustic scene classification aided with event embeddings

In real life, acoustic scenes and audio events are naturally correlated....

An evaluation framework for event detection using a morphological model of acoustic scenes

This paper introduces a model of environmental acoustic scenes which ado...

Joint model-based recognition and localization of overlapped acoustic events using a set of distributed small microphone arrays

In the analysis of acoustic scenes, often the occurring sounds have to b...

City classification from multiple real-world sound scenes

The majority of sound scene analysis work focuses on one of two clearly ...

Rare Life Event Detection via Mobile Sensing Using Multi-Task Learning

Rare life events significantly impact mental health, and their detection...
