Spatial-temporal Graph Based Multi-channel Speaker Verification With Ad-hoc Microphone Arrays

07/03/2023
by Yijiang Chen, et al.

The performance of speaker verification degrades significantly in adverse acoustic environments with strong reverberation and noise. To address this issue, this paper proposes a spatial-temporal graph convolutional network (GCN) method for multi-channel speaker verification with ad-hoc microphone arrays. It includes a feature aggregation block and a channel selection block, both of which are built on graphs. The feature aggregation block fuses speaker features across time frames and channels with a spatial-temporal GCN. The graph-based channel selection block discards noisy channels that may contribute negatively to the system. The proposed method is flexible in incorporating various kinds of graphs and prior knowledge. We compared the proposed method with six representative methods in both real-world and simulated environments. Experimental results show that the proposed method achieves a relative equal error rate (EER) reduction of 15.39% over the strongest reference method on the simulated datasets and 17.70% on the real datasets. Moreover, its performance is robust across different signal-to-noise ratios and reverberation times.
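To make the idea of fusing features across time and channels on a graph more concrete, here is a minimal sketch of one generic graph-convolution step. It is not the authors' architecture. The graph topology (all channels linked within a frame, consecutive frames linked within a channel), the function names `st_edges` and `gcn_layer`, and the use of the standard normalized propagation rule ReLU(D^{-1/2}(A + I)D^{-1/2} X W) are all assumptions made for illustration. The channel selection block is not covered.

```python
import math


def st_edges(num_channels, num_frames):
    """Undirected edges of an assumed spatial-temporal graph.

    Node (c, t) is channel c at frame t, flattened to index c * num_frames + t.
    """
    def node(c, t):
        return c * num_frames + t

    edges = set()
    for t in range(num_frames):
        for c in range(num_channels):
            # Spatial edges: link every pair of channels at the same frame.
            for c2 in range(c + 1, num_channels):
                edges.add((node(c, t), node(c2, t)))
            # Temporal edge: link consecutive frames of the same channel.
            if t + 1 < num_frames:
                edges.add((node(c, t), node(c, t + 1)))
    return edges


def gcn_layer(features, edges, weight):
    """Compute ReLU(D^{-1/2} (A + I) D^{-1/2} X W) with plain Python lists."""
    n = len(features)
    neighbors = [{i} for i in range(n)]  # self-loops give A + I
    for i, j in edges:
        neighbors[i].add(j)
        neighbors[j].add(i)
    degree = [len(s) for s in neighbors]

    in_dim, out_dim = len(weight), len(weight[0])
    output = []
    for i in range(n):
        # Symmetrically normalized sum over the node's neighborhood.
        agg = [
            sum(features[j][d] / math.sqrt(degree[i] * degree[j])
                for j in neighbors[i])
            for d in range(in_dim)
        ]
        # Linear projection followed by ReLU.
        output.append([
            max(0.0, sum(agg[d] * weight[d][k] for d in range(in_dim)))
            for k in range(out_dim)
        ])
    return output
```

With 4 channels and 10 frames, for instance, `features` would hold 40 rows (one per channel-frame node), and `gcn_layer(features, st_edges(4, 10), weight)` would return 40 fused rows. In a trained system `weight` would be learned, and several such layers would typically be stacked before pooling into an utterance-level embedding.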

research
07/01/2021

Attention-based multi-channel speaker verification with ad-hoc microphone arrays

Recently, ad-hoc microphone array has been widely studied. Unlike tradit...
research
10/12/2021

Frame-level multi-channel speaker verification with large-scale ad-hoc microphone arrays

Ad-hoc microphone arrays have received attention, in which the number and...
research
10/19/2022

Deep Learning Based Two-dimensional Speaker Localization With Large Ad-hoc Microphone Arrays

Deep learning based speaker localization has shown its advantage in reve...
research
10/16/2022

End-to-end Two-dimensional Sound Source Localization With Ad-hoc Microphone Arrays

Conventional sound source localization methods are mostly based on a sin...
research
01/24/2022

PickNet: Real-Time Channel Selection for Ad Hoc Microphone Arrays

This paper proposes PickNet, a neural network model for real-time channe...
research
01/20/2020

A graph-based spatial temporal logic for knowledge representation and automated reasoning in cognitive robots

A new graph-based spatial temporal logic is proposed for knowledge repre...
research
10/06/2020

A Unified Deep Learning Framework for Short-Duration Speaker Verification in Adverse Environments

Speaker verification (SV) has recently attracted considerable research i...
