Improving fairness in speaker verification via Group-adapted Fusion Network

02/23/2022
by Hua Shen, et al.

Modern speaker verification models use deep neural networks to encode utterance audio into discriminative embedding vectors. During training, these networks are typically optimized to differentiate arbitrary speakers. This objective biases the learning of fine voice characteristics towards dominant demographic groups, which can lead to an unfair performance disparity across groups. The effect is especially pronounced for underrepresented demographic groups whose members share similar voice characteristics. In this work, we investigate the fairness of speaker verification models on controlled datasets with imbalanced gender distributions, providing direct evidence that model performance suffers for underrepresented groups. To mitigate this disparity, we propose the group-adapted fusion network (GFN), a modular architecture based on group embedding adaptation and score fusion. We show that our method alleviates model unfairness by improving speaker verification both overall and for individual groups. Given imbalanced group representation in training, our proposed method achieves overall equal error rate (EER) reduction of 9.6 [...] 29.0 [...] 20.0 [...] applicable to other types of training data skew in speaker recognition systems.
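The abstract reports fairness in terms of overall and per-group equal error rate (EER). As a rough illustration of how such numbers can be computed, the sketch below sweeps decision thresholds over verification scores and reports the EER for each group. It is not the authors' evaluation code: the function names, the toy trials, and the use of the max-minus-min gap as a disparity measure are all assumptions made for this example.

```python
from collections import defaultdict


def equal_error_rate(scores, labels):
    """Approximate EER for one set of trials.

    scores: similarity scores, higher means "same speaker".
    labels: 1 for target (same-speaker) trials, 0 for impostor trials.
    Sweeps thresholds over the observed scores and returns the mean of
    FAR and FRR at the threshold where the two rates are closest.
    """
    targets = [s for s, y in zip(scores, labels) if y == 1]
    impostors = [s for s, y in zip(scores, labels) if y == 0]
    if not targets or not impostors:
        raise ValueError("need at least one target and one impostor trial")

    best_gap, best_eer = None, None
    for threshold in sorted(set(scores)):
        # False accept: impostor trial scored at or above the threshold.
        far = sum(s >= threshold for s in impostors) / len(impostors)
        # False reject: target trial scored below the threshold.
        frr = sum(s < threshold for s in targets) / len(targets)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer


def eer_by_group(scores, labels, groups):
    """Compute the EER separately for each group label."""
    buckets = defaultdict(lambda: ([], []))
    for s, y, g in zip(scores, labels, groups):
        buckets[g][0].append(s)
        buckets[g][1].append(y)
    return {g: equal_error_rate(sc, lb) for g, (sc, lb) in buckets.items()}


# Hypothetical toy trials: group "A" is well separated, group "B" is not.
scores = [0.9, 0.8, 0.1, 0.2, 0.7, 0.3, 0.5, 0.2]
labels = [1, 1, 0, 0, 1, 1, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

per_group = eer_by_group(scores, labels, groups)
# Assumed disparity measure for this sketch: gap between worst and best group.
disparity = max(per_group.values()) - min(per_group.values())
print(per_group, disparity)
```

On real data, the threshold sweep is usually replaced by interpolation on the ROC curve, and the disparity measure should follow whatever definition the evaluation protocol specifies.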

research
04/29/2021

Improving Fairness in Speaker Recognition

The human voice conveys unique characteristics of an individual, making ...
research
07/15/2022

Adversarial Reweighting for Speaker Verification Fairness

We address performance fairness for speaker verification using the adver...
research
08/05/2023

Elucidate Gender Fairness in Singing Voice Transcription

It is widely known that males and females typically possess different so...
research
04/27/2022

Study on the Fairness of Speaker Verification Systems on Underrepresented Accents in English

Speaker verification (SV) systems are currently being used to make sensi...
research
07/26/2021

SVEva Fair: A Framework for Evaluating Fairness in Speaker Verification

Despite the success of deep neural networks (DNNs) in enabling on-device...
research
06/11/2020

Adaptive Sampling to Reduce Disparate Performance

Existing methods for reducing disparate performance of a classifier acro...
research
07/08/2022

Graph-based Multi-View Fusion and Local Adaptation: Mitigating Within-Household Confusability for Speaker Identification

Speaker identification (SID) in the household scenario (e.g., for smart ...
