Ensemble of ACCDOA- and EINV2-based Systems with D3Nets and Impulse Response Simulation for Sound Event Localization and Detection

06/21/2021
by Kazuki Shimada, et al.

This report describes our systems submitted to the DCASE2021 challenge task 3: sound event localization and detection (SELD) with directional interference. Our previous system, based on the activity-coupled Cartesian direction of arrival (ACCDOA) representation, enables us to solve a SELD task with a single target. This ACCDOA-based system, combined with an efficient network architecture called RD3Net and data augmentation techniques, outperformed state-of-the-art SELD systems in terms of localization and location-dependent detection. Using the ACCDOA-based system as a base, we build model ensembles by averaging the outputs of several systems trained under different conditions, such as input features, training folds, and model architectures. We also use an event independent network v2 (EINV2)-based system to increase the diversity of the model ensembles. To help the models generalize, we further propose impulse response simulation (IRS), which generates simulated multi-channel signals by convolving simulated room impulse responses (RIRs) with source signals extracted from the original dataset. Our systems significantly improved over the baseline system on the development dataset.
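
The three ideas in the abstract (the ACCDOA target, output averaging across systems, and IRS by convolution) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, array shapes, and the 0.5 activity threshold are assumptions made for illustration, and the RIRs here are toy arrays rather than the output of a room simulator.

```python
import numpy as np

def accdoa_encode(activity, doa_unit):
    """Build an ACCDOA-style target: a DOA unit vector scaled by activity.

    activity: (T, C) array of 0/1 event activity per frame and class.
    doa_unit: (T, C, 3) array of Cartesian unit vectors.
    Returns a (T, C, 3) array; inactive classes map to the zero vector.
    """
    return activity[..., None] * doa_unit

def ensemble_decode(outputs, threshold=0.5):
    """Average the ACCDOA outputs of several systems, then decode.

    outputs: list of (T, C, 3) arrays, one per system (hypothetical layout).
    threshold: assumed cut-off on the vector length for declaring activity.
    Returns (active, doa): a (T, C) boolean mask and (T, C, 3) unit vectors.
    """
    mean = np.mean(np.stack(outputs, axis=0), axis=0)
    norm = np.linalg.norm(mean, axis=-1)
    active = norm > threshold
    # Guard against division by zero for inactive classes.
    doa = mean / np.maximum(norm, 1e-8)[..., None]
    return active, doa

def simulate_multichannel(source, rirs):
    """Convolve a mono source with one RIR per microphone channel.

    source: (N,) mono signal.
    rirs: (M, L) array, one impulse response per channel (toy values here).
    Returns an (M, N + L - 1) multi-channel signal.
    """
    return np.stack([np.convolve(source, h) for h in rirs])

# One frame, two classes: class 0 active along +x, class 1 inactive.
target = accdoa_encode(
    np.array([[1.0, 0.0]]),
    np.array([[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]]),
)
# Two hypothetical systems that predict the target with different confidence.
active, doa = ensemble_decode([0.9 * target, 0.7 * target])

# A three-sample source and two toy RIRs (a direct path and a one-sample delay).
simulated = simulate_multichannel(
    np.array([1.0, 2.0, 3.0]),
    np.array([[1.0, 0.0], [0.0, 1.0]]),
)
```

Because activity and direction share one vector, averaging the raw outputs combines both at once, which is why a single mean over systems suffices before thresholding the vector length.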

research
10/13/2021

Spatial Data Augmentation with Simulated Room Impulse Responses for Sound Event Localization and Detection

Recording and annotating real sound events for a sound event localizatio...
research
10/29/2020

ACCDOA: Activity-Coupled Cartesian Direction of Arrival Representation for Sound Event Localization and Detection

Neural-network (NN)-based methods show high performance in sound event l...
research
06/22/2020

Sound Event Localization and Detection Using Activity-Coupled Cartesian DOA Vector and RD3net

Our systems submitted to the DCASE2020 task 3: Sound Event Localization ...
research
09/06/2023

Leveraging Geometrical Acoustic Simulations of Spatial Room Impulse Responses for Improved Sound Event Detection and Localization

As deeper and more complex models are developed for the task of sound ev...
research
04/17/2023

Fast Random Approximation of Multi-channel Room Impulse Response

Modern neural-network-based speech processing systems are typically requ...
research
03/19/2022

A Track-Wise Ensemble Event Independent Network for Polyphonic Sound Event Localization and Detection

Polyphonic sound event localization and detection (SELD) aims at detecti...
research
10/18/2022

Optimizing Temporal Resolution Of Convolutional Recurrent Neural Networks For Sound Event Detection

In this technical report, the systems we submitted for subtask 4 of the ...
