A Principle Solution for Enroll-Test Mismatch in Speaker Recognition
Mismatch between enrollment and test conditions causes serious performance degradation in speaker recognition systems. This paper presents a statistics decomposition (SD) approach to this problem. The approach is based on the normalized likelihood (NL) scoring framework and is theoretically optimal if the statistics of both the enrollment and test conditions are accurate. A comprehensive experimental study was conducted on three datasets, each with a different type of mismatch: (1) physical channel mismatch, (2) speaking style mismatch, and (3) near-far recording mismatch. The results demonstrate that the proposed SD approach is highly effective and outperforms the ad-hoc multi-condition training approach, which is commonly adopted but not optimal in theory.
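The abstract does not give the scoring formulas, so the following is only a rough sketch of what normalized likelihood scoring can look like in the matched case. It assumes a simple isotropic linear Gaussian model (speaker means drawn from N(0, between_var * I), embeddings drawn from N(mean, within_var * I)); the function names, parameters, and toy vectors are all hypothetical and are not taken from the paper. The SD approach itself, which per the abstract relies on separate statistics for the enrollment and test conditions, is not shown.

```python
import math


def gaussian_logpdf(x, mean, var):
    """Log density of an isotropic Gaussian N(mean, var * I) evaluated at x."""
    d = len(x)
    sq_dist = sum((xi - mi) ** 2 for xi, mi in zip(x, mean))
    return -0.5 * (d * math.log(2 * math.pi * var) + sq_dist / var)


def nl_score(enroll, test, between_var, within_var):
    """Return log p(test | enroll) - log p(test) under the assumed model.

    enroll: list of enrollment embeddings (lists of floats) from one speaker
    test: a single test embedding
    between_var, within_var: assumed between- and within-speaker variances
    """
    n = len(enroll)
    d = len(test)
    enroll_mean = [sum(v[i] for v in enroll) / n for i in range(d)]

    # Posterior over the speaker mean given the enrollment embeddings.
    denom = n * between_var + within_var
    post_mean = [n * between_var / denom * m for m in enroll_mean]
    post_var = between_var * within_var / denom

    # Predictive likelihood of the test embedding for the enrolled speaker.
    log_target = gaussian_logpdf(test, post_mean, within_var + post_var)
    # Marginal likelihood of the test embedding over all speakers.
    log_background = gaussian_logpdf(test, [0.0] * d, between_var + within_var)
    return log_target - log_background


# Toy usage with made-up two-dimensional embeddings.
enroll = [[1.0, 0.9], [1.1, 1.0]]
same_speaker = nl_score(enroll, [1.0, 1.0], between_var=1.0, within_var=0.1)
other_speaker = nl_score(enroll, [-1.0, -1.0], between_var=1.0, within_var=0.1)
```

In this toy setting a test embedding close to the enrollment mean receives a positive score and a distant one a negative score. Under enrollment-test mismatch, a single shared set of variances like this would no longer describe both conditions, which is the situation the abstract says SD is designed to handle.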