Learning How to Listen: A Temporal-Frequential Attention Model for Sound Event Detection

by Yu-Han Shen, et al.
Tsinghua University
NetEase, Inc

In this paper, we propose a temporal-frequential attention model for sound event detection (SED). Our network learns how to listen with two attention models: a temporal attention model and a frequential attention model. The proposed system learns when to listen using the temporal attention model, and learns where to listen on the frequency axis using the frequential attention model. With these two models, we attempt to make our system pay more attention to the frames or segments and the frequency components that are important for sound event detection. Our proposed method is demonstrated on Task 2 of the Detection and Classification of Acoustic Scenes and Events (DCASE) 2017 Challenge and achieves competitive performance.
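The abstract does not specify how the two attention models are computed, so the following is only a rough sketch of the general idea, not the authors' architecture. It assumes a spectrogram-like input of shape (frames, frequency bins), uses a softmax over the frequency axis as a stand-in for "where to listen" and a softmax over the time axis as a stand-in for "when to listen". The function name `temporal_frequential_attention` and the parameters `w_freq` and `w_time` are hypothetical and would be learned in a real system; here they are random.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    shifted = x - np.max(x, axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=axis, keepdims=True)

def temporal_frequential_attention(spec, w_freq, w_time):
    """Illustrative attention pooling (assumed form, not from the paper).

    spec:   (T, F) array, T frames by F frequency bins
    w_freq: (F, F) hypothetical projection for frequency attention
    w_time: (F,)   hypothetical projection for temporal attention
    """
    # "Where to listen": for each frame, weights over frequency bins.
    freq_weights = softmax(spec @ w_freq, axis=1)      # (T, F), rows sum to 1
    attended = spec * freq_weights                     # re-weighted spectrogram

    # "When to listen": one score per frame, normalized over time.
    time_weights = softmax(attended @ w_time, axis=0)  # (T,), sums to 1

    # Attention-weighted average over frames gives a clip-level feature.
    pooled = time_weights @ attended                   # (F,)
    return pooled, freq_weights, time_weights

# Random stand-in data; a real system would use log-mel features.
rng = np.random.default_rng(0)
T, F = 50, 40
spec = rng.standard_normal((T, F))
w_freq = rng.standard_normal((F, F)) * 0.1
w_time = rng.standard_normal(F) * 0.1

pooled, freq_weights, time_weights = temporal_frequential_attention(
    spec, w_freq, w_time
)
```

In this sketch, frames with large `time_weights` dominate the pooled feature, and within each frame the bins with large `freq_weights` dominate, which mirrors the "when" and "where" intuition described above.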


Multi-Scale Time-Frequency Attention for Rare Sound Event Detection

Attention mechanism has been widely applied to various sound-related tas...

Sound Event Detection with Adaptive Frequency Selection

In this work, we present HIDACT, a novel network architecture for adapti...

Furnishing Sound Event Detection with Language Model Abilities

Recently, the ability of language models (LMs) has attracted increasing ...

Channel-Spatial-Based Few-Shot Bird Sound Event Detection

In this paper, we propose a model for bird sound event detection that fo...

Acoustic scene analysis with multi-head attention networks

Acoustic Scene Classification (ASC) is a challenging task, as a single s...

A simple model for detection of rare sound events

We propose a simple recurrent model for detecting rare sound events, whe...

Modelling of Sound Events with Hidden Imbalances Based on Clustering and Separate Sub-Dictionary Learning

This paper proposes an effective modelling of sound event spectra with a...