Multi-query multi-head attention pooling and Inter-topK penalty for speaker verification

10/11/2021
by Miao Zhao, et al.

This paper describes the multi-query multi-head attention (MQMHA) pooling and inter-topK penalty methods, which were first proposed in our system description submitted to the VoxCeleb Speaker Recognition Challenge (VoxSRC) 2021. Most multi-head attention pooling mechanisms either attend to the whole feature through multiple heads or attend to several split parts of the whole feature. Our proposed MQMHA combines these two mechanisms and gains more diversified information. Margin-based softmax loss functions are commonly adopted to obtain discriminative speaker representations. To further enhance inter-class discriminability, we propose a method that adds an extra inter-topK penalty on the speakers most easily confused with the target speaker. By adopting both MQMHA and the inter-topK penalty, we achieved state-of-the-art performance on all of the public VoxCeleb test sets.
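The abstract does not give the exact formulation, so the following NumPy sketch is only one plausible reading of the two ideas. It assumes that MQMHA splits each frame-level feature into H parts (heads) and applies Q learnable queries to every part, and that the inter-topK penalty adds a small constant to the k largest non-target cosine logits on top of an additive-margin softmax. The function names, the mean-only pooling, and the values of `margin`, `penalty`, `k`, and `scale` are all illustrative assumptions, not taken from the paper.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def mqmha_pool(feats, queries):
    """Hypothetical multi-query multi-head attention pooling.

    feats:   (T, D) frame-level features.
    queries: (Q, H, D // H) one query vector per (query, head) pair.
    Returns a (Q * D,) utterance-level vector of attention-weighted means.
    """
    T, D = feats.shape
    Q, H, d = queries.shape
    assert H * d == D, "feature dimension must split evenly into heads"
    parts = feats.reshape(T, H, d)                       # split feature into H parts
    scores = np.einsum("thd,qhd->qht", parts, queries)   # (Q, H, T) attention scores
    weights = softmax(scores, axis=-1)                   # normalize over time
    pooled = np.einsum("qht,thd->qhd", weights, parts)   # (Q, H, d) weighted means
    return pooled.reshape(-1)


def inter_topk_logits(cos, labels, margin=0.2, penalty=0.06, k=5, scale=32.0):
    """Hypothetical additive-margin logits with an inter-topK penalty.

    cos:    (B, C) cosine similarities between embeddings and class centers.
    labels: (B,) integer target classes. Requires k < C.
    """
    B, C = cos.shape
    rows = np.arange(B)
    masked = cos.copy()
    masked[rows, labels] = -np.inf                       # exclude the target class
    topk = np.argpartition(-masked, k - 1, axis=1)[:, :k]  # k most confusable classes
    out = cos.copy()
    out[rows[:, None], topk] += penalty                  # make confusable classes harder
    out[rows, labels] -= margin                          # usual target margin
    return scale * out
```

The scaled logits would then be passed to an ordinary cross-entropy loss. With all-zero queries the attention weights are uniform, so each query reduces to plain temporal average pooling, which is a convenient sanity check for the pooling function.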


research
07/26/2020

Double Multi-Head Attention for Speaker Verification

Most state-of-the-art Deep Learning systems for speaker verification are...
research
08/21/2018

Exploring a Unified Attention-Based Pooling Framework for Speaker Verification

The pooling layer is an essential component in the neural network based ...
research
07/14/2021

Serialized Multi-Layer Multi-Head Attention for Neural Speaker Embedding

This paper proposes a serialized multi-layer multi-head attention for ne...
research
02/14/2021

Query-by-Example Keyword Spotting system using Multi-head Attention and Softtriple Loss

This paper proposes a neural network architecture for tackling the query...
research
10/12/2022

THUEE system description for NIST 2020 SRE CTS challenge

This paper presents the system description of the THUEE team for the NIS...
research
11/26/2019

Low Rank Factorization for Compact Multi-Head Self-Attention

Effective representation learning from text has been an active area of r...
research
05/03/2022

Efficient dynamic filter for robust and low computational feature extraction

Unseen noise signal which is not considered in a model training process ...
