L3DAS21 Challenge: Machine Learning for 3D Audio Signal Processing

04/12/2021
by Eric Guizzo, et al.

The L3DAS21 Challenge aims to encourage and foster collaborative research on machine learning for 3D audio signal processing, with a particular focus on 3D speech enhancement (SE) and 3D sound event localization and detection (SELD). Alongside the challenge, we release the L3DAS21 dataset, a 65-hour 3D audio corpus, accompanied by a Python API that facilitates data usage and the submission of results. Machine learning approaches to 3D audio tasks are usually based on single-perspective Ambisonics recordings or on arrays of single-capsule microphones. We propose, instead, a novel multichannel audio configuration based on multiple-source and multiple-perspective Ambisonics recordings, performed with an array of two first-order Ambisonics microphones. To the best of our knowledge, this is the first time that a dual-microphone Ambisonics configuration has been used for these tasks. We provide baseline models and results for both tasks, obtained with state-of-the-art architectures: FaSNet for SE and SELDNet for SELD. This report provides all the information needed to participate in the L3DAS21 Challenge, describing the L3DAS21 dataset, the challenge tasks, and the baseline models.
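
To make the dual-microphone configuration concrete, the following minimal sketch stacks two first-order Ambisonics recordings into a single multichannel signal. It relies only on the fact that a first-order Ambisonics signal has four channels, so two microphones yield eight. The function name `stack_dual_foa`, the nested-list layout, and the channel ordering are assumptions made for illustration; they are not taken from the L3DAS21 Python API, and the dataset may store the two microphones differently.

```python
# Illustrative sketch only: names and data layout are hypothetical,
# not part of the L3DAS21 Python API.

FOA_CHANNELS = 4  # a first-order Ambisonics signal has four channels


def stack_dual_foa(mic_a, mic_b):
    """Concatenate two first-order Ambisonics recordings channel-wise.

    Each input is a list of 4 channels; each channel is a list of samples.
    Returns a list of 8 channels: mic A's channels followed by mic B's.
    """
    for name, mic in (("mic_a", mic_a), ("mic_b", mic_b)):
        if len(mic) != FOA_CHANNELS:
            raise ValueError(f"{name} must have {FOA_CHANNELS} channels")
    num_samples = len(mic_a[0])
    if any(len(ch) != num_samples for ch in mic_a + mic_b):
        raise ValueError("all channels must have the same number of samples")
    # Copy each channel so the result does not alias the inputs.
    return [list(ch) for ch in mic_a + mic_b]


# Dummy signals: mic A is all zeros, mic B is all ones.
num_samples = 16
mic_a = [[0.0] * num_samples for _ in range(FOA_CHANNELS)]
mic_b = [[1.0] * num_samples for _ in range(FOA_CHANNELS)]

stacked = stack_dual_foa(mic_a, mic_b)
print(len(stacked), len(stacked[0]))  # channels, samples
```

A model consuming this input would then see eight channels per clip instead of the four provided by a single-perspective recording.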

research
02/21/2022

L3DAS22 Challenge: Learning 3D Audio Sources in a Real Office Environment

The L3DAS22 Challenge is aimed at encouraging the development of machine...
research
01/14/2020

HumBug Zooniverse: a crowd-sourced acoustic mosquito dataset

Mosquitoes are the only known vector of malaria, which leads to hundreds...
research
03/16/2021

DiCOVA Challenge: Dataset, task, and baseline system for COVID-19 diagnosis using acoustics

The DiCOVA challenge aims at accelerating research in diagnosing COVID-1...
research
12/16/2021

Towards Robust Real-time Audio-Visual Speech Enhancement

The human brain contextually exploits heterogeneous sensory information ...
research
08/21/2020

CITISEN: A Deep Learning-Based Speech Signal-Processing Mobile Application

In this paper, we present a deep learning-based speech signal-processing...
research
07/08/2018

Densely Connected CNNs for Bird Audio Detection

Detecting bird sounds in audio recordings automatically, if accurate eno...
research
08/18/2023

Spatial LibriSpeech: An Augmented Dataset for Spatial Audio Learning

We present Spatial LibriSpeech, a spatial audio dataset with over 650 ho...
