One-Shot Speaker Identification for a Service Robot using a CNN-based Generic Verifier

by Ivette Vélez, et al.

In service robotics, there is interest in identifying the user by voice alone. However, in application scenarios where a service robot acts as a waiter or a store clerk, new users are expected to enter the environment frequently. Typically, speaker identification models need to be retrained when this occurs, which can take an impractical amount of time. In this paper, a new approach to speaker identification through verification is developed using a Siamese Convolutional Neural Network (SCNN) architecture, which learns to verify generically whether two audio signals come from the same speaker. Given an external database of recorded audio from the users, identification is carried out by verifying the speech input against each of its entries. When new users are encountered, their recorded audio only needs to be added to the external database for them to be identified, without retraining. The system was evaluated on four aspects: the performance of the verifier, the performance of the system as a classifier using clean audio, its speed, and its accuracy in real-life settings. Its performance, in conjunction with its one-shot-learning capabilities, makes the proposed system a viable alternative for speaker identification in service robots.
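The identification-through-verification idea described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `cosine_verifier` is a hypothetical stand-in for the trained SCNN, the recordings are assumed to be already reduced to small feature vectors, and the decision rule (take the best-scoring user, reject below a `threshold`) is an assumption made for the example.

```python
import math
from typing import Callable, Dict, List, Optional, Sequence

Vector = Sequence[float]


def cosine_verifier(a: Vector, b: Vector) -> float:
    """Hypothetical stand-in for the trained SCNN verifier.

    Returns a same-speaker score in [-1, 1]; higher means more similar.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def identify(
    sample: Vector,
    database: Dict[str, List[Vector]],
    verifier: Callable[[Vector, Vector], float] = cosine_verifier,
    threshold: float = 0.5,  # assumed rejection threshold, not from the paper
) -> Optional[str]:
    """Verify the input against every database entry and return the best match.

    Returns None when no user scores above the threshold.
    """
    best_user, best_score = None, threshold
    for user, references in database.items():
        if not references:
            continue
        # Assumed aggregation: a user's score is their best-matching reference.
        score = max(verifier(sample, ref) for ref in references)
        if score > best_score:
            best_user, best_score = user, score
    return best_user


# External database of enrolled users (toy feature vectors).
database: Dict[str, List[Vector]] = {
    "alice": [[1.0, 0.0, 0.1]],
    "bob": [[0.0, 1.0, 0.1]],
}

known = identify([0.9, 0.1, 0.1], database)
unknown_before = identify([0.0, 0.0, 1.0], database)

# Enrolling a new user only adds an entry; the verifier is left untouched.
database["carol"] = [[0.0, 0.1, 1.0]]
unknown_after = identify([0.0, 0.0, 1.0], database)
```

In this sketch, enrolling "carol" is a single dictionary update, which mirrors the abstract's point that new users are added to the external database without retraining the verifier.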




Few-Shot Speaker Identification Using Depthwise Separable Convolutional Network with Channel Attention

Although few-shot learning has attracted much attention from the fields ...

CASA-Based Speaker Identification Using Cascaded GMM-CNN Classifier in Noisy and Emotional Talking Conditions

This work aims at intensifying text-independent speaker identification p...

Speaker Identification in the Shouted Environment Using Suprasegmental Hidden Markov Models

In this paper, Suprasegmental Hidden Markov Models (SPHMMs) have been us...

Catch You and I Can: Revealing Source Voiceprint Against Voice Conversion

Voice conversion (VC) techniques can be abused by malicious parties to t...

Lightweight Speaker Verification for Online Identification of New Speakers with Short Segments

Verifying if two audio segments belong to the same speaker has been rece...

A Study of Few-Shot Audio Classification

Advances in deep learning have resulted in state-of-the-art performance ...

Towards an Efficient Voice Identification Using Wav2Vec2.0 and HuBERT Based on the Quran Reciters Dataset

Current authentication and trusted systems depend on classical and biome...
