A Unified Deep Speaker Embedding Framework for Mixed-Bandwidth Speech Data

12/01/2020
by Weicheng Cai, et al.

This paper proposes a unified deep speaker embedding framework for modeling speech data with different sampling rates. Treating the narrowband spectrogram as a sub-image of the wideband spectrogram, we approach joint modeling of mixed-bandwidth data as an image classification problem. From this perspective, we develop several mixed-bandwidth joint training strategies for different training and test data scenarios. The proposed systems handle mixed-bandwidth speech data flexibly in a single speaker embedding model, without any additional downsampling, upsampling, bandwidth extension, or padding operations. We conduct extensive experiments on the VoxCeleb1 dataset and further validate the approach on the SITW and NIST SRE 2016 datasets.
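The sub-image view in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the 25 ms window, 10 ms hop, and the helper names `spectrogram` and `tone` are assumptions chosen for illustration. When the frame length is fixed in milliseconds, 8 kHz and 16 kHz audio share the same frequency resolution (40 Hz per bin here), so the narrowband spectrogram occupies the same rows as the lower half of the wideband one.

```python
import numpy as np

def spectrogram(signal, sample_rate, win_ms=25, hop_ms=10):
    """Magnitude spectrogram with shape (freq_bins, frames).

    Window and hop are given in milliseconds (assumed values), so the
    bin spacing in Hz is the same at every sampling rate.
    """
    win = sample_rate * win_ms // 1000
    hop = sample_rate * hop_ms // 1000
    window = np.hanning(win)
    n_frames = 1 + (len(signal) - win) // hop
    frames = np.stack(
        [signal[i * hop:i * hop + win] * window for i in range(n_frames)]
    )
    # Normalise by the window sum so magnitudes are comparable across rates.
    return np.abs(np.fft.rfft(frames, axis=1)).T / window.sum()

def tone(freq_hz, sample_rate, seconds=1.0):
    """Hypothetical test signal: a pure sine sampled at the given rate."""
    t = np.arange(int(sample_rate * seconds)) / sample_rate
    return np.sin(2 * np.pi * freq_hz * t)

wide = spectrogram(tone(1000, 16000), 16000)   # bins cover 0 to 8 kHz
narrow = spectrogram(tone(1000, 8000), 8000)   # bins cover 0 to 4 kHz

# The narrowband image has the same shape as the low-frequency
# sub-image of the wideband one.
wide_low = wide[:narrow.shape[0]]
```

With these settings the wideband spectrogram has 201 frequency bins and the narrowband one has 101, and the 1 kHz tone lands in the same row of both. A model that accepts variable-height inputs could therefore consume either image directly, which is the property the abstract relies on.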

09/05/2019

Bandwidth Embeddings for Mixed-bandwidth Speech Recognition

In this paper, we tackle the problem of handling narrowband and wideband...
02/24/2022

On the relevance of bandwidth extension for speaker identification

In this paper we discuss the relevance of bandwidth extension for speake...
03/30/2022

Joint domain adaptation and speech bandwidth extension using time-domain GANs for speaker verification

Speech systems developed for a particular choice of acoustic domain and ...
07/19/2019

DNN-based Speaker Embedding Using Subjective Inter-speaker Similarity for Multi-speaker Modeling in Speech Synthesis

This paper proposes novel algorithms for speaker embedding using subject...
01/14/2020

Supervised Speaker Embedding De-Mixing in Two-Speaker Environment

In this work, a speaker embedding de-mixing approach is proposed. Instea...
04/05/2022

On the Relevance of Bandwidth Extension for Speaker Verification

In this paper, we consider the effect of a bandwidth extension of narrow...
05/09/2022

Bandwidth-Scalable Fully Mask-Based Deep FCRN Acoustic Echo Cancellation and Postfiltering

Although today's speech communication systems support various bandwidths...