Multi-Scale Time-Frequency Attention for Rare Sound Event Detection

03/29/2019
by   Jingyang Zhang, et al.

Attention mechanisms have been widely applied to sound-related tasks. In this work, we propose a Multi-Scale Time-Frequency Attention (MTFA) module for sound event detection. By generating an attention heatmap, MTFA enables the model to focus on discriminative components of the spectrogram along both the time and frequency axes. In addition, gathering information at multiple scales helps the model adapt to the characteristics of different categories of target events. The proposed method is evaluated on Task 2 of the Detection and Classification of Acoustic Scenes and Events (DCASE) 2017 Challenge. To the best of our knowledge, our method outperforms all previous methods that do not use model ensembles on the development dataset, and it achieves state-of-the-art performance on the evaluation dataset, reducing the error rate from 0.13 to 0.09. This demonstrates the effectiveness of MTFA in retrieving discriminative representations for sound event detection.
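
The idea of a multi-scale attention heatmap over a spectrogram can be illustrated with a minimal NumPy sketch. This is not the MTFA architecture itself, whose layers are not described in the abstract: the function names, the average-pooling scales `(1, 2, 4)`, the uniform per-scale weights, and the sigmoid gating are all assumptions chosen for illustration. A trained model would learn its attention scores rather than derive them from pooled energy.

```python
import numpy as np

def avg_pool2d(x, s):
    """Average-pool a (freq, time) array over non-overlapping s x s blocks."""
    f, t = x.shape
    # Pad with edge values so both dimensions are divisible by s.
    x = np.pad(x, ((0, (-f) % s), (0, (-t) % s)), mode="edge")
    return x.reshape(x.shape[0] // s, s, x.shape[1] // s, s).mean(axis=(1, 3))

def upsample2d(x, s, shape):
    """Nearest-neighbour upsample by factor s, then crop to `shape`."""
    up = np.repeat(np.repeat(x, s, axis=0), s, axis=1)
    return up[:shape[0], :shape[1]]

def multi_scale_tf_attention(spec, scales=(1, 2, 4), weights=None):
    """Hypothetical sketch: gate a spectrogram with a multi-scale heatmap.

    Scores at each scale are pooled, normalised energies (an assumption),
    merged by a weighted sum and squashed to (0, 1) with a sigmoid.
    """
    if weights is None:
        weights = np.ones(len(scales)) / len(scales)
    z = (spec - spec.mean()) / (spec.std() + 1e-8)  # normalise the input
    score = np.zeros_like(z)
    for w, s in zip(weights, scales):
        score += w * upsample2d(avg_pool2d(z, s), s, z.shape)
    heatmap = 1.0 / (1.0 + np.exp(-score))  # one weight per time-frequency bin
    return heatmap * spec, heatmap

# Usage on a random stand-in for a log-mel spectrogram (40 bands, 101 frames).
rng = np.random.default_rng(0)
spec = rng.random((40, 101))
attended, heatmap = multi_scale_tf_attention(spec)
```

Because the heatmap has the same shape as the spectrogram, it weights every time-frequency bin individually, while the coarser scales let neighbouring bins influence each other's weights.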

research
10/29/2018

Learning How to Listen: A Temporal-Frequential Attention Model for Sound Event Detection

In this paper, we propose a temporal-frequential attention model for sou...
research
11/15/2019

Adaptive Multi-scale Detection of Acoustic Events

The goal of acoustic (or sound) events detection (AED or SED) is to pred...
research
11/25/2021

Polyphonic Sound Event Detection Using Capsule Neural Network on Multi-Type-Multi-Scale Time-Frequency Representation

The challenges of polyphonic sound event detection (PSED) stem from the ...
research
08/04/2019

Sound Event Detection in Multichannel Audio using Convolutional Time-Frequency-Channel Squeeze and Excitation

In this study, we introduce a convolutional time-frequency-channel "Sque...
research
06/21/2022

A Multi-grained based Attention Network for Semi-supervised Sound Event Detection

Sound event detection (SED) is an interesting but challenging task due t...
research
05/17/2021

Sound Event Detection with Adaptive Frequency Selection

In this work, we present HIDACT, a novel network architecture for adapti...
research
09/16/2019

Acoustic scene analysis with multi-head attention networks

Acoustic Scene Classification (ASC) is a challenging task, as a single s...
