Data Augmentation and Squeeze-and-Excitation Network on Multiple Dimension for Sound Event Localization and Detection in Real Scenes

06/24/2022
by   Byeong-Yun Ko, et al.

The performance of sound event localization and detection (SELD) in real scenes is limited by the small size of SELD datasets, owing to the difficulty of obtaining enough realistic multi-channel audio recordings with accurate labels. We used two main strategies to address the problems arising from the small real SELD dataset. First, we applied various data augmentation methods across all data dimensions: channel, frequency, and time. We also propose an original data augmentation method, named Moderate Mixup, to simulate situations where a noise floor or interfering events exist. Second, we applied Squeeze-and-Excitation blocks on the channel and frequency dimensions to extract feature characteristics efficiently. On the STARSS22 test dataset, our trained models achieved the best ER, F1, LE, and LR of 0.53, 49.8 56.2
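To make the idea of applying Squeeze-and-Excitation along more than one dimension concrete, the following is a minimal NumPy sketch of a generic squeeze-and-excitation gate that can be pointed at either the channel or the frequency axis. It is not the authors' implementation: the function name `squeeze_excite`, the (channels, time, frequency) layout, the bias-free two-layer bottleneck, and the random weights are all assumptions made for illustration.

```python
import numpy as np

def squeeze_excite(x, axis, w1, w2):
    """Re-weight `x` along `axis` with a squeeze-and-excitation gate.

    x    : feature map, assumed layout (channels, time, frequency)
    axis : dimension to gate (0 = channel, 2 = frequency in this layout)
    w1   : (n, r) bottleneck weights, where n = x.shape[axis]
    w2   : (r, n) expansion weights
    Biases are omitted to keep the sketch short.
    """
    # Squeeze: average over every axis except the one being gated.
    other_axes = tuple(i for i in range(x.ndim) if i != axis)
    squeezed = x.mean(axis=other_axes)               # shape (n,)
    # Excite: bottleneck with ReLU, then a sigmoid gate per position.
    hidden = np.maximum(squeezed @ w1, 0.0)          # shape (r,)
    gate = 1.0 / (1.0 + np.exp(-(hidden @ w2)))      # shape (n,)
    # Broadcast the gate back over the untouched axes.
    shape = [1] * x.ndim
    shape[axis] = x.shape[axis]
    return x * gate.reshape(shape), gate

# Hypothetical sizes: 7 input channels, 50 frames, 64 frequency bins.
rng = np.random.default_rng(0)
features = rng.standard_normal((7, 50, 64))

# Channel-wise gate followed by a frequency-wise gate.
features, channel_gate = squeeze_excite(
    features, 0, rng.standard_normal((7, 2)), rng.standard_normal((2, 7)))
features, freq_gate = squeeze_excite(
    features, 2, rng.standard_normal((64, 8)), rng.standard_normal((8, 64)))
print(features.shape, channel_gate.shape, freq_gate.shape)
```

In a trained network the weights would be learned and the gates would sit inside the convolutional blocks; the sketch only shows that the same squeeze, excite, and rescale steps apply unchanged to whichever axis is selected.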


research
09/05/2022

Sound Event Localization and Detection for Real Spatial Sound Scenes: Event-Independent Network and Data Augmentation Chains

Sound event localization and detection (SELD) is a joint task of sound e...
research
07/08/2021

Heavily Augmented Sound Event Detection utilizing Weak Predictions

The performances of Sound Event Detection (SED) systems are greatly limi...
research
06/20/2023

Frequency Channel Attention for computationally efficient sound event detection

We explore on various attention methods on frequency and channel dimensi...
research
01/08/2021

A Four-Stage Data Augmentation Approach to ResNet-Conformer Based Acoustic Modeling for Sound Event Localization and Detection

In this paper, we propose a novel four-stage data augmentation approach ...
research
10/12/2021

Spatial mixup: Directional loudness modification as data augmentation for sound event localization and detection

Data augmentation methods have shown great importance in diverse supervi...
research
09/30/2021

Workflow Augmentation of Video Data for Event Recognition with Time-Sensitive Neural Networks

Supervised training of neural networks requires large, diverse and well ...
research
10/19/2020

BIRD: Big Impulse Response Dataset

This paper introduces BIRD, the Big Impulse Response Dataset. This open ...
