Few-Shot Speaker Identification Using Depthwise Separable Convolutional Network with Channel Attention

04/24/2022
by Yanxiong Li, et al.

Although few-shot learning has attracted much attention in image and audio classification, little work has addressed few-shot speaker identification. In few-shot learning, overfitting is a major problem, mainly because of the mismatch between training and testing conditions. In this paper, we propose a few-shot speaker identification method that alleviates the overfitting problem. In the proposed method, a depthwise separable convolutional network with channel attention is trained with a prototypical loss function. Experimental datasets are extracted from three public speech corpora: Aishell-2, VoxCeleb1 and TORGO. Experimental results show that the proposed method outperforms state-of-the-art methods for few-shot speaker identification in terms of accuracy and F-score.
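
The abstract names a prototypical loss but gives no formula. As a rough illustration only, the sketch below follows the standard prototypical-network formulation: each speaker's prototype is the mean of its support embeddings, and each query is scored by a softmax over negative squared Euclidean distances to the prototypes. The distance metric, the function names, the toy 2-D embeddings and the speaker labels are all assumptions made for this example and are not taken from the paper. In practice the embeddings would come from the trained network.

```python
import math

def class_prototypes(support, support_labels):
    """Mean embedding per speaker, computed from the support set."""
    sums, counts = {}, {}
    for vec, lab in zip(support, support_labels):
        if lab not in sums:
            sums[lab] = [0.0] * len(vec)
            counts[lab] = 0
        sums[lab] = [s + v for s, v in zip(sums[lab], vec)]
        counts[lab] += 1
    return {lab: [s / counts[lab] for s in sums[lab]] for lab in sums}

def squared_distance(a, b):
    """Squared Euclidean distance (assumed metric, not confirmed by the abstract)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def prototypical_loss(support, support_labels, query, query_labels):
    """Return (mean cross-entropy loss, accuracy) over the query set."""
    protos = class_prototypes(support, support_labels)
    classes = sorted(protos)
    total_loss, correct = 0.0, 0
    for vec, lab in zip(query, query_labels):
        # Logits are negative distances: a closer prototype scores higher.
        logits = [-squared_distance(vec, protos[c]) for c in classes]
        # Numerically stable log-sum-exp for the softmax normaliser.
        m = max(logits)
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        total_loss += log_z - logits[classes.index(lab)]
        predicted = classes[logits.index(max(logits))]
        correct += int(predicted == lab)
    n = len(query)
    return total_loss / n, correct / n

# Hypothetical 2-way 2-shot episode with made-up 2-D embeddings.
support = [[0.0, 0.0], [0.2, 0.0], [1.0, 1.0], [1.0, 0.8]]
support_labels = ["spk_a", "spk_a", "spk_b", "spk_b"]
query = [[0.1, 0.1], [0.9, 1.0]]
query_labels = ["spk_a", "spk_b"]

loss, acc = prototypical_loss(support, support_labels, query, query_labels)
print(f"loss={loss:.4f} accuracy={acc:.2f}")
```

Because the classifier has no learned weights beyond the embedding network, the number of speakers can change from episode to episode. That property is one reason prototype-based losses are commonly used in few-shot settings.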

research
10/12/2022

Adversarial Speaker-Consistency Learning Using Untranscribed Speech Data for Zero-Shot Multi-Speaker Text-to-Speech

Several recently proposed text-to-speech (TTS) models achieved to genera...
research
09/11/2018

One-Shot Speaker Identification for a Service Robot using a CNN-based Generic Verifier

In service robotics, there is an interest to identify the user by voice ...
research
12/02/2020

A Study of Few-Shot Audio Classification

Advances in deep learning have resulted in state-of-the-art performance ...
research
06/01/2023

Speaker verification using attentive multi-scale convolutional recurrent network

In this paper, we propose a speaker verification method by an Attentive ...
research
02/16/2021

Semi Supervised Learning For Few-shot Audio Classification By Episodic Triplet Mining

Few-shot learning aims to generalize unseen classes that appear during t...
research
05/12/2020

AdaDurIAN: Few-shot Adaptation for Neural Text-to-Speech with DurIAN

This paper investigates how to leverage a DurIAN-based average model to ...
research
03/09/2023

Improving Few-Shot Learning for Talking Face System with TTS Data Augmentation

Audio-driven talking face has attracted broad interest from academia and...