Margin-Mixup: A Method for Robust Speaker Verification in Multi-Speaker Audio

04/07/2023
by   Jenthe Thienpondt, et al.
0

This paper is concerned with the task of speaker verification on audio with multiple overlapping speakers. Most speaker verification systems are designed with the assumption of a single speaker being present in a given audio segment. However, in a real-world setting this assumption does not always hold. In this paper, we demonstrate that current speaker verification systems are not robust against audio with noticeable speaker overlap. To alleviate this issue, we propose margin-mixup, a simple training strategy that can easily be adopted by existing speaker verification pipelines to make the resulting speaker embeddings robust against multi-speaker audio. In contrast to other methods, margin-mixup requires no alterations to regular speaker verification architectures, while attaining better results. On our multi-speaker test set based on VoxCeleb1, the proposed margin-mixup strategy improves the EER on average with 44.4 baseline systems.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
02/07/2019

Target Speaker Extraction for Overlapped Multi-Talker Speaker Verification

The performance of speaker verification degrades significantly when the ...
research
06/24/2019

Who said that?: Audio-visual speaker diarisation of real-world meetings

The goal of this work is to determine 'who spoke when' in real-world mee...
research
07/13/2020

DNN Speaker Tracking with Embeddings

In multi-speaker applications is common to have pre-computed models from...
research
09/13/2023

Weakly-Supervised Multi-Task Learning for Audio-Visual Speaker Verification

In this paper, we present a methodology for achieving robust multimodal ...
research
03/06/2020

Lightweight Speaker Verification for Online Identification of New Speakers with Short Segments

Verifying if two audio segments belong to the same speaker has been rece...
research
08/14/2023

VoxBlink: A Large Scale Speaker Verification Dataset on Camera

In this paper, we introduce a large-scale and high-quality audio-visual ...
research
07/13/2022

Cross-Age Speaker Verification: Learning Age-Invariant Speaker Embeddings

Automatic speaker verification has achieved remarkable progress in recen...

Please sign up or login with your details

Forgot password? Click here to reset