Deep Residual Network for Sound Source Localization in the Time Domain

08/20/2018
by Dmitry Suvorov, et al.

This study presents a system for sound source localization in the time domain using a deep residual neural network. The network estimates direction from the data of a linear 8-channel microphone array with 3 cm spacing. We propose to use a deep residual network for sound source localization, treating localization as a classification task. The study describes the gathered dataset and the developed neural network architecture, and presents the training process and its results. The developed system was tested on the validation part of the dataset and on new data captured in real time. The classification accuracy for 30 ms sound frames is 99.2%. The proposed method of sound source localization was also tested inside a speech recognition pipeline, where it decreased the word error rate by 1.14% in comparison with a similar speech recognition pipeline using GCC-PHAT sound source localization.
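For readers unfamiliar with the GCC-PHAT baseline named in the abstract, the following is a minimal sketch of that classical technique for a single microphone pair. It is not the authors' implementation. The function names, the 16 kHz sampling rate, the speed of sound, and the choice of the outermost pair (7 × 3 cm = 21 cm, assuming uniform spacing) are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def gcc_phat(sig, ref, fs, max_tau=None, interp=16):
    """Estimate the delay of `sig` relative to `ref`, in seconds (GCC-PHAT)."""
    n = len(sig) + len(ref)  # zero-pad so the correlation does not wrap around
    spec_sig = np.fft.rfft(sig, n=n)
    spec_ref = np.fft.rfft(ref, n=n)
    cross = spec_sig * np.conj(spec_ref)
    cross /= np.abs(cross) + 1e-12  # PHAT weighting: keep phase, drop magnitude
    # A longer inverse FFT interpolates the correlation by a factor `interp`.
    cc = np.fft.irfft(cross, n=interp * n)
    max_shift = interp * n // 2
    if max_tau is not None:
        max_shift = min(int(interp * fs * max_tau), max_shift)
    # Reorder so that index `max_shift` corresponds to zero delay.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = int(np.argmax(np.abs(cc))) - max_shift
    return shift / float(interp * fs)


def delay_to_angle(tau, mic_distance):
    """Convert a pairwise delay to an arrival angle in degrees from broadside."""
    s = np.clip(tau * SPEED_OF_SOUND / mic_distance, -1.0, 1.0)
    return float(np.degrees(np.arcsin(s)))


if __name__ == "__main__":
    fs = 16000                    # assumed sampling rate (not stated in the abstract)
    mic_distance = 7 * 0.03       # outermost pair of an 8-microphone, 3 cm array
    rng = np.random.default_rng(0)
    ref = rng.standard_normal(int(0.030 * fs))        # one 30 ms frame
    sig = np.concatenate((np.zeros(5), ref[:-5]))     # same frame, 5 samples late
    tau = gcc_phat(sig, ref, fs, max_tau=mic_distance / SPEED_OF_SOUND)
    print(tau * fs, delay_to_angle(tau, mic_distance))
```

A positive delay here means the first signal lags the second. With only 3 cm between adjacent microphones the inter-channel delay is on the order of one sample at this assumed rate, which is why the sketch interpolates the correlation and uses the widest pair.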


