Deep Ad-hoc Beamforming Based on Speaker Extraction for Target-Dependent Speech Separation

by Ziye Yang, et al.

Research on ad-hoc microphone arrays with deep learning has recently drawn much attention, especially in speech enhancement and separation. Because an ad-hoc microphone array may cover such a large area that multiple speakers may be located far apart and talk independently, target-dependent speech separation, which aims to extract a target speaker from mixed speech, is important for extracting and tracing a specific speaker in the ad-hoc array. However, this technique has not been explored yet. In this paper, we propose deep ad-hoc beamforming based on speaker extraction, which is, to our knowledge, the first work on target-dependent speech separation based on ad-hoc microphone arrays and deep learning. The algorithm contains three components. First, we propose a supervised channel selection framework based on speaker extraction, where the estimated utterance-level SNRs of the target speech are used as the basis for channel selection. Second, we apply the selected channels to a deep learning based MVDR algorithm, where a single-channel speaker extraction algorithm is applied to each selected channel to estimate the mask of the target speech. We conducted an extensive experiment on a WSJ0-adhoc corpus. Experimental results demonstrate the effectiveness of the proposed method.
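
The two components named in the abstract (SNR-based channel selection, then mask-based MVDR on the selected channels) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names `select_channels` and `mask_based_mvdr`, the top-k selection rule, the averaging of per-channel masks, and the reference-channel MVDR formula are all assumptions made for illustration. In the paper, the per-channel SNR estimates and masks would come from a speaker extraction network; here they are plain input arrays.

```python
import numpy as np


def select_channels(est_snr_db, k):
    """Return the (sorted) indices of the k channels with the highest estimated SNR.

    Assumption: a simple top-k rule; the paper may use a different criterion.
    """
    order = np.argsort(est_snr_db)[::-1]
    return np.sort(order[:k])


def mask_based_mvdr(stft, masks, ref=0, eps=1e-6):
    """Mask-based MVDR beamformer (reference-channel formulation).

    stft:  complex array of shape (C, F, T), one STFT per selected channel.
    masks: real array of shape (C, F, T) with values in [0, 1], the
           estimated target-speech mask for each channel.
    Returns the enhanced STFT of shape (F, T).
    """
    C, F, T = stft.shape
    # Assumption: average the per-channel masks into one target mask.
    speech_mask = masks.mean(axis=0)
    noise_mask = 1.0 - speech_mask
    out = np.zeros((F, T), dtype=complex)
    for f in range(F):
        Y = stft[:, f, :]  # (C, T) observations at frequency bin f
        # Mask-weighted spatial covariance matrices of target and noise.
        phi_s = (speech_mask[f] * Y) @ Y.conj().T / (speech_mask[f].sum() + eps)
        phi_n = (noise_mask[f] * Y) @ Y.conj().T / (noise_mask[f].sum() + eps)
        phi_n = phi_n + eps * np.eye(C)  # diagonal loading for stability
        # w = (phi_n^{-1} phi_s / trace(phi_n^{-1} phi_s)) u_ref
        num = np.linalg.solve(phi_n, phi_s)
        w = num[:, ref] / (np.trace(num) + eps)
        out[f] = w.conj() @ Y  # beamformer output w^H y for every frame
    return out


# Usage with synthetic data (hypothetical SNR values and random masks).
rng = np.random.default_rng(0)
stft = rng.standard_normal((4, 5, 50)) + 1j * rng.standard_normal((4, 5, 50))
masks = rng.uniform(0.0, 1.0, size=(4, 5, 50))
est_snr_db = np.array([3.0, 10.0, -2.0, 7.0])

idx = select_channels(est_snr_db, k=2)
enhanced = mask_based_mvdr(stft[idx], masks[idx])
```

Selecting channels before beamforming keeps low-SNR microphones, which may be far from the target speaker, out of the covariance estimates.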


Frame-level multi-channel speaker verification with large-scale ad-hoc microphone arrays

Ad-hoc microphone arrays have received attention, in which the number and...

Deep Ad-hoc Beamforming

Deep learning based speech enhancement methods face two problems. First,...

Continuous Speech Separation with Ad Hoc Microphone Arrays

Speech separation has been shown effective for multi-talker speech recog...

Deep Learning Based Two-dimensional Speaker Localization With Large Ad-hoc Microphone Arrays

Deep learning based speaker localization has shown its advantage in reve...

End-to-end Microphone Permutation and Number Invariant Multi-channel Speech Separation

An important problem in ad-hoc microphone speech separation is how to gu...

End-to-end Two-dimensional Sound Source Localization With Ad-hoc Microphone Arrays

Conventional sound source localization methods are mostly based on a sin...

Implicit Filter-and-sum Network for Multi-channel Speech Separation

Various neural network architectures have been proposed in recent years ...