Towards an Efficient Voice Identification Using Wav2Vec2.0 and HuBERT Based on the Quran Reciters Dataset

11/11/2021
by Aly Moustafa, et al.

Current authentication and trust systems rely on classical and biometric methods to recognize or authorize users. Such methods include audio speech recognition, and eye and finger signatures. Recent tools use deep learning and transformers to achieve better results. In this paper, we develop a deep learning model for Arabic speaker identification using the Wav2Vec2.0 and HuBERT audio representation learning tools. The end-to-end Wav2Vec2.0 paradigm learns contextualized speech representations by randomly masking a set of feature vectors and then applying a transformer neural network. We employ an MLP classifier that can differentiate between invariant labeled classes. We present several experimental results that confirm the high accuracy of the proposed model. The experiments show that an arbitrary wave signal for a certain speaker can be identified with 98 97.1
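The two mechanisms named in the abstract, random masking of feature vectors and an MLP classifier over the learned representations, can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The function names, the `mask_prob` and `span` values, the mean pooling step, and the single hidden layer are all assumptions made for illustration, and the transformer encoder itself is omitted.

```python
import math
import random


def mask_spans(num_frames, mask_prob=0.065, span=10, seed=0):
    """Pick frames to mask: each frame starts a masked span with probability mask_prob.

    mask_prob and span are illustrative values, not taken from the abstract.
    """
    rng = random.Random(seed)
    masked = set()
    for start in range(num_frames):
        if rng.random() < mask_prob:
            # Spans may overlap and are clipped at the end of the sequence.
            masked.update(range(start, min(start + span, num_frames)))
    return sorted(masked)


def mean_pool(frames):
    """Average frame-level vectors into one utterance-level vector (assumed pooling)."""
    n = len(frames)
    return [sum(col) / n for col in zip(*frames)]


def mlp_predict(x, w1, b1, w2, b2):
    """One hidden ReLU layer followed by a softmax over speaker classes."""
    hidden = [max(0.0, sum(w * v for w, v in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    logits = [sum(w * h for w, h in zip(row, hidden)) + b
              for row, b in zip(w2, b2)]
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    # Hypothetical toy input: 3 frames of 2-dimensional features, 2 speakers.
    frames = [[1.0, 2.0], [3.0, 4.0], [2.0, 3.0]]
    identity = [[1.0, 0.0], [0.0, 1.0]]
    pooled = mean_pool(frames)
    print(mask_spans(50, mask_prob=0.1, span=5, seed=1))
    print(mlp_predict(pooled, identity, [0.0, 0.0], identity, [0.0, 0.0]))
```

In a real pipeline the frame vectors would come from a pretrained Wav2Vec2.0 or HuBERT encoder and the MLP weights would be trained on labeled reciter audio. Masking applies during self-supervised pretraining, not at identification time.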

research
05/29/2022

Speaker Identification using Speech Recognition

The audio data is increasing day by day throughout the globe with the in...
research
08/27/2022

Target Speaker Voice Activity Detection with Transformers and Its Integration with End-to-End Neural Diarization

This paper describes a speaker diarization model based on target speaker...
research
10/23/2022

Speaker Identification from emotional and noisy speech data using learned voice segregation and Speech VGG

Speech signals are subjected to more acoustic interference and emotional...
research
10/09/2021

Arabic Speech Emotion Recognition Employing Wav2vec2.0 and HuBERT Based on BAVED Dataset

Recently, there have been tremendous research outcomes in the fields of ...
research
01/09/2022

Emotional Speaker Identification using a Novel Capsule Nets Model

Speaker recognition systems are widely used in various applications to i...
research
09/11/2018

One-Shot Speaker Identification for a Service Robot using a CNN-based Generic Verifier

In service robotics, there is an interest to identify the user by voice ...
research
04/06/2023

DSVAE: Interpretable Disentangled Representation for Synthetic Speech Detection

Tools to generate high quality synthetic speech signal that is perceptua...
