
Siamese Capsule Network for End-to-End Speaker Recognition In The Wild

by Amirhossein Hajavi, et al.

We propose an end-to-end deep model for speaker verification in the wild. Our model uses thin-ResNet to extract speaker embeddings from utterances, and a Siamese capsule network with dynamic routing as the Back-end to calculate a similarity score between the embeddings. We conduct a series of experiments comparing our model with state-of-the-art solutions, showing that it outperforms all the other models while using substantially less training data. We also perform additional experiments to study the impact of different speaker embeddings on the Siamese capsule network. We show that the best performance is achieved by taking embeddings directly from the feature aggregation module of the Front-end and passing them to higher capsules using dynamic routing.
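The Back-end idea described above can be illustrated with a minimal NumPy sketch of routing-by-agreement applied to two embeddings through shared weights. This is a generic illustration, not the authors' implementation: the function names (`squash`, `dynamic_routing`, `siamese_score`), the tensor shapes, the three routing iterations, and the cosine similarity used as the final score are all assumptions, since the abstract does not specify them.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-9):
    """Capsule non-linearity: keeps the direction of s, maps its norm into [0, 1)."""
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def dynamic_routing(u_hat, num_iters=3):
    """Routing-by-agreement over prediction vectors.

    u_hat: array of shape (num_in, num_out, dim_out), the prediction of each
           lower capsule for each higher capsule.
    Returns the higher-capsule outputs, shape (num_out, dim_out).
    """
    num_in, num_out, _ = u_hat.shape
    b = np.zeros((num_in, num_out))  # routing logits, start uniform
    for _ in range(num_iters):
        # Coupling coefficients: softmax over the higher capsules.
        e = np.exp(b - b.max(axis=1, keepdims=True))
        c = e / e.sum(axis=1, keepdims=True)
        # Weighted sum of predictions, then squash.
        s = np.einsum("ij,ijd->jd", c, u_hat)
        v = squash(s)
        # Raise the logits of predictions that agree with the output.
        b = b + np.einsum("ijd,jd->ij", u_hat, v)
    return v

def siamese_score(emb_a, emb_b, W, num_iters=3, eps=1e-9):
    """Hypothetical Siamese scoring: both inputs share the same weights W.

    emb_a, emb_b: (num_in, dim_in) embeddings viewed as lower capsules.
    W:            (num_in, num_out, dim_out, dim_in) shared transform.
    Returns a cosine similarity in [-1, 1] (an assumed scoring rule).
    """
    outputs = []
    for emb in (emb_a, emb_b):
        u_hat = np.einsum("iokd,id->iok", W, emb)  # prediction vectors
        outputs.append(dynamic_routing(u_hat, num_iters).ravel())
    a, b = outputs
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))
```

Because both branches use the same `W`, passing the same embedding twice yields a score of approximately 1, which is a quick sanity check on the shared-weight design.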



