Parameter-Free Attentive Scoring for Speaker Verification

03/10/2022
by Jason Pelecanos, et al.

This paper presents a novel study of parameter-free attentive scoring for speaker verification. Parameter-free scoring provides the flexibility to compare speaker representations without the need for an accompanying parametric scoring model. Inspired by the attention component in Transformer neural networks, we propose a variant of the scaled dot product attention mechanism to compare enrollment and test segment representations. In addition, this work explores the effect on performance of (i) different types of normalization, (ii) independent versus tied query/key estimation, (iii) varying the number of key-value pairs, and (iv) pooling multiple enrollment utterance statistics. Experimental results averaged over 4 tasks show that a simple parameter-free attentive scoring mechanism can improve the average EER by 10% over the best cosine similarity baseline.
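To make the idea concrete, the following is a minimal sketch of scoring with plain scaled dot product attention and no trainable scoring parameters. It is not the variant proposed in the paper, whose details are not given in this abstract. The function name `attentive_score`, the array shapes, the softmax normalization, and the final averaged cosine comparison of values are all assumptions chosen for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - np.max(x, axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=axis, keepdims=True)


def attentive_score(test_q, test_v, enroll_k, enroll_v):
    """Hypothetical parameter-free attentive score (illustrative only).

    test_q:   (Nq, D)  queries from the test segment
    test_v:   (Nq, Dv) values from the test segment
    enroll_k: (Nk, D)  keys from the enrollment segment
    enroll_v: (Nk, Dv) values from the enrollment segment
    """
    d = test_q.shape[-1]
    # Standard scaled dot product attention weights: one row per test query.
    weights = softmax(test_q @ enroll_k.T / np.sqrt(d), axis=-1)  # (Nq, Nk)
    # Enrollment values summarized from the point of view of each test query.
    attended = weights @ enroll_v  # (Nq, Dv)
    # Assumption: compare test values to attended enrollment values with
    # cosine similarity, then average over the test queries.
    num = np.sum(test_v * attended, axis=-1)
    den = np.linalg.norm(test_v, axis=-1) * np.linalg.norm(attended, axis=-1)
    return float(np.mean(num / (den + 1e-12)))


rng = np.random.default_rng(0)
score = attentive_score(
    rng.normal(size=(4, 8)),   # test queries
    rng.normal(size=(4, 16)),  # test values
    rng.normal(size=(6, 8)),   # enrollment keys
    rng.normal(size=(6, 16)),  # enrollment values
)
print(score)
```

Nothing in this function is learned: the score depends only on the representations passed in, which is the sense of "parameter-free" illustrated here. Under these assumptions, changing the number of key-value pairs only changes `Nk`, and statistics from several enrollment utterances could be pooled by stacking their keys and values along the first axis.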
