Scaling sparsemax based channel selection for speech recognition with ad-hoc microphone arrays

03/29/2021
by   Junqi Chen, et al.

Recently, speech recognition with ad-hoc microphone arrays has received much attention. Channel selection is known to be an important problem for ad-hoc microphone arrays; however, it remains largely unexplored in speech recognition, particularly for large-scale ad-hoc microphone arrays. To address this problem, we propose a Scaling Sparsemax algorithm for channel selection in speech recognition with large-scale ad-hoc microphone arrays. Specifically, we first replace the conventional Softmax operator in the stream attention mechanism of a multichannel end-to-end speech recognition system with Sparsemax, which performs channel selection by forcing the weights of noisy channels to zero. Because Sparsemax harshly drives the weights of many channels to zero, we propose Scaling Sparsemax, which penalizes channels more mildly by setting only the weights of very noisy channels to zero. Experimental results with ad-hoc microphone arrays of over 30 channels under the conformer speech recognition architecture show that the proposed Scaling Sparsemax yields a word error rate over 30% lower than Softmax on simulation data sets, and over 20% lower on semi-real data sets, in test scenarios with both matched and mismatched channel numbers.
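
The difference between Softmax and Sparsemax channel weighting can be illustrated with a minimal NumPy sketch. The `sparsemax` function follows the standard simplex-projection definition of Sparsemax. The `scaled_sparsemax` function is only a hypothetical stand-in for a milder variant: it divides the scores by an assumed `scale` factor before projection and should not be read as the paper's Scaling Sparsemax formulation. The channel scores are made-up values.

```python
import numpy as np

def softmax(z):
    """Dense weights: every channel receives a strictly positive weight."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

def sparsemax(z):
    """Euclidean projection of the scores onto the probability simplex.

    Low-scoring channels receive a weight of exactly zero.
    """
    z = np.asarray(z, dtype=float)
    z_sorted = np.sort(z)[::-1]           # scores in descending order
    cumsum = np.cumsum(z_sorted)
    k = np.arange(1, z.size + 1)
    support = 1 + k * z_sorted > cumsum   # which sorted entries stay non-zero
    k_max = k[support][-1]                # size of the support
    tau = (cumsum[k_max - 1] - 1) / k_max # threshold subtracted from all scores
    return np.maximum(z - tau, 0.0)

def scaled_sparsemax(z, scale):
    """Hypothetical milder variant (an assumption, not the paper's formula):
    shrinking the scores before projection keeps more channels non-zero."""
    return sparsemax(np.asarray(z, dtype=float) / scale)

# Made-up attention scores for four channels, from cleanest to noisiest.
scores = np.array([2.0, 1.5, 0.2, -1.0])

print(softmax(scores))                # all four weights are positive
print(sparsemax(scores))              # the two noisier channels are zeroed
print(scaled_sparsemax(scores, 4.0))  # only the noisiest channel is zeroed
```

With these scores, plain Sparsemax keeps two of the four channels, while the scaled version with `scale=4.0` keeps three. This mirrors the motivation above: zero out only the very noisy channels instead of most of them.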

research
07/01/2021

Attention-based multi-channel speaker verification with ad-hoc microphone arrays

Recently, ad-hoc microphone arrays have been widely studied. Unlike tradit...
research
04/06/2021

Learning to Rank Microphones for Distant Speech Recognition

Fully exploiting ad-hoc microphone networks for distant speech recogniti...
research
10/12/2021

Frame-level multi-channel speaker verification with large-scale ad-hoc microphone arrays

Ad-hoc microphone arrays have received attention, in which the number and...
research
10/19/2022

Deep Learning Based Two-dimensional Speaker Localization With Large Ad-hoc Microphone Arrays

Deep learning based speaker localization has shown its advantage in reve...
research
10/30/2019

End-to-end Microphone Permutation and Number Invariant Multi-channel Speech Separation

An important problem in ad-hoc microphone speech separation is how to gu...
research
10/16/2022

End-to-end Two-dimensional Sound Source Localization With Ad-hoc Microphone Arrays

Conventional sound source localization methods are mostly based on a sin...
research
01/24/2022

PickNet: Real-Time Channel Selection for Ad Hoc Microphone Arrays

This paper proposes PickNet, a neural network model for real-time channe...
