How to (virtually) train your sound source localizer

11/30/2022
by Prerak Srivastava, et al.

Learning-based methods have become ubiquitous in sound source localization (SSL). Existing systems rely on simulated training sets because sufficiently large, diverse, and annotated real datasets are lacking. Most room acoustic simulators used for this purpose rely on the image source method (ISM) because of its computational efficiency. This paper argues that carefully extending the ISM to incorporate more realistic surface, source, and microphone responses into training sets can significantly boost the real-world performance of SSL systems. It is shown that increasing the training-set realism of a state-of-the-art direction-of-arrival estimator yields consistent improvements across three different real test sets featuring human speakers in a variety of rooms and various microphone arrays. An ablation study further reveals that every added layer of realism contributes positively to these improvements.
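
For readers unfamiliar with the image source method, the following is a minimal sketch of its basic idea, not the simulator used in the paper. It assumes a rectangular ("shoebox") room, first-order reflections only, omnidirectional source and microphone, and a single frequency-independent reflection coefficient per wall. These are idealizations of the kind that more realistic surface, source, and microphone responses would replace. The function names, the default reflection coefficient of 0.8, and the example geometry are all hypothetical.

```python
import math

def first_order_images(source, room):
    """Return the six first-order image sources of `source` in a shoebox room.

    `room` is (Lx, Ly, Lz); walls are assumed to lie at 0 and L on each axis.
    """
    images = []
    for axis in range(3):
        for wall in (0.0, room[axis]):
            img = list(source)
            # Mirror the source across the wall plane on this axis.
            img[axis] = 2.0 * wall - source[axis]
            images.append(tuple(img))
    return images

def echo_delays_and_gains(source, mic, room, reflection=0.8, c=343.0):
    """Return (delay in s, amplitude) for the direct path and first-order echoes.

    Assumes one frequency-independent reflection coefficient for all walls
    and spherical spreading 1 / (4 * pi * d); both are simplifications.
    """
    paths = [(source, 1.0)]
    paths += [(img, reflection) for img in first_order_images(source, room)]
    result = []
    for position, coeff in paths:
        d = math.dist(position, mic)  # path length from (image) source to mic
        result.append((d / c, coeff / (4.0 * math.pi * d)))
    return result

if __name__ == "__main__":
    room = (4.0, 5.0, 3.0)      # hypothetical room dimensions in metres
    source = (1.0, 2.0, 1.5)    # hypothetical source position
    mic = (3.0, 3.0, 1.2)       # hypothetical microphone position
    for delay, gain in echo_delays_and_gains(source, mic, room):
        print(f"delay = {delay * 1e3:6.2f} ms, gain = {gain:.4f}")
```

Each returned pair corresponds to one impulse in a very coarse room impulse response. A full simulator would also include higher-order reflections and, as the abstract argues, frequency-dependent wall absorption and directional source and microphone responses.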


