AutoSpeech: Neural Architecture Search for Speaker Recognition

05/07/2020
by Shaojin Ding, et al.

Speaker recognition systems based on Convolutional Neural Networks (CNNs) are often built with off-the-shelf backbones such as VGG-Net or ResNet. However, these backbones were originally proposed for image classification and therefore may not be a natural fit for speaker recognition. Because manually exploring the design space is prohibitively complex, we propose the first neural architecture search approach for speaker recognition tasks, named AutoSpeech. Our algorithm first identifies the optimal combination of operations in a neural cell and then derives a CNN model by stacking that cell multiple times. The final speaker recognition model is obtained by training the derived CNN model with the standard training scheme. To evaluate the proposed approach, we conduct experiments on both speaker identification and speaker verification tasks using the VoxCeleb1 dataset. Results demonstrate that the CNN architectures derived by the proposed approach significantly outperform current speaker recognition systems based on VGG-M, ResNet-18, and ResNet-34 backbones, while having lower model complexity.
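
The two-stage idea in the abstract (choose one operation per position in a cell, then stack the cell) can be illustrated with a minimal sketch. Everything here is an assumption for illustration only: the candidate operation names, the per-edge scores, and the helper names `derive_cell` and `stack_cells` are hypothetical and are not taken from the paper, which may define its search space and selection rule differently.

```python
from dataclasses import dataclass

# Hypothetical candidate operations; the abstract does not list the search space.
CANDIDATE_OPS = ["conv_3x3", "conv_5x5", "max_pool_3x3", "skip_connect"]


@dataclass(frozen=True)
class Cell:
    # One chosen operation name per edge, in edge order.
    ops: tuple


def derive_cell(edge_scores):
    """Pick the highest-scoring candidate operation on each edge.

    edge_scores holds one list of scores per edge, with one score per
    candidate operation. The scores stand in for whatever the search
    procedure produces (an assumption, not the paper's exact rule).
    """
    chosen = []
    for scores in edge_scores:
        if len(scores) != len(CANDIDATE_OPS):
            raise ValueError("one score per candidate operation is required")
        best = max(range(len(scores)), key=lambda i: scores[i])
        chosen.append(CANDIDATE_OPS[best])
    return Cell(ops=tuple(chosen))


def stack_cells(cell, num_cells):
    """Describe a network as the same cell repeated num_cells times."""
    return [cell] * num_cells


# Made-up scores for a three-edge cell.
scores = [
    [0.1, 0.6, 0.2, 0.1],
    [0.5, 0.2, 0.2, 0.1],
    [0.1, 0.1, 0.1, 0.7],
]
cell = derive_cell(scores)
network = stack_cells(cell, num_cells=8)
```

The sketch only records which operation is selected and how often the cell is repeated; a real model would map each operation name to a trainable layer and train the stacked network afterwards.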


research
08/13/2020

Evolutionary Algorithm Enhanced Neural Architecture Search for Text-Independent Speaker Verification

State-of-the-art speaker verification models are based on deep learning ...
research
03/25/2021

EfficientTDNN: Efficient Architecture Search for Speaker Recognition in the Wild

Speaker recognition refers to audio biometrics that utilizes acoustic ch...
research
05/03/2021

Heart-Darts: Classification of Heartbeats Using Differentiable Architecture Search

Arrhythmia is a cardiovascular disease that manifests irregular heartbea...
research
04/08/2022

Reliable Visualization for Deep Speaker Recognition

In spite of the impressive success of convolutional neural networks (CNN...
research
10/26/2019

Sum-Product Networks for Robust Automatic Speaker Recognition

The performance of a speaker recognition system degrades considerably in...
research
08/15/2021

CONet: Channel Optimization for Convolutional Neural Networks

Neural Architecture Search (NAS) has shifted network design from using h...
research
09/11/2020

Optimizing Convolutional Neural Network Architecture via Information Field

CNN architecture design has attracted tremendous attention of improving ...
