Persian Vowel recognition with MFCC and ANN on PCVC speech dataset

12/17/2018
by   Saber Malekzadeh, et al.

This paper proposes a new method for recognizing consonant-vowel phoneme combinations on a new Persian speech dataset, PCVC (Persian Consonant-Vowel Combination). The PCVC dataset contains 20 sets of audio samples from 10 speakers, covering combinations of the 23 consonant and 6 vowel phonemes of the Persian language. Each sample combines one consonant and one vowel: the consonant phoneme is pronounced first, followed immediately by the vowel phoneme. Each sound sample is a 2-second audio frame containing, on average, 0.5 seconds of speech; the rest is silence. The proposed method computes MFCCs (Mel-Frequency Cepstral Coefficients) on every partitioned sound sample. The MFCC vector of each training sample is then fed to a multilayer perceptron feed-forward ANN (Artificial Neural Network) for training. Finally, the test samples are evaluated on the trained ANN model for phoneme recognition. After training and testing, the results are reported for vowel recognition, and the average recognition percentage for vowel phonemes is computed.
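The feature-extraction step described above can be sketched in NumPy. This is a minimal illustration and not the authors' implementation: the 16 kHz sample rate, the frame and hop sizes, the 26-filter mel bank, the 13 coefficients, the energy threshold used to drop silence, and the averaging of frames into one vector per sample are all assumptions, and every function name is hypothetical.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    """Triangular filters spaced evenly on the mel scale."""
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):      # rising edge
            fb[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):     # falling edge
            fb[i - 1, k] = (right - k) / max(right - center, 1)
    return fb

def frame_signal(x, frame_len, hop):
    """Slice a 1-D signal into overlapping frames (one frame per row)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def trim_silence(x, frame_len=400, hop=160, rel_threshold=0.1):
    """Keep the span between the first and last frame whose energy exceeds
    rel_threshold times the peak frame energy (assumed heuristic)."""
    energy = (frame_signal(x, frame_len, hop) ** 2).mean(axis=1)
    active = np.where(energy > rel_threshold * energy.max())[0]
    return x[active[0] * hop : active[-1] * hop + frame_len]

def mfcc(x, sr, n_mfcc=13, n_filters=26, frame_len=400, hop=160, n_fft=512):
    """Return an (n_frames, n_mfcc) array of MFCCs."""
    frames = frame_signal(x, frame_len, hop) * np.hamming(frame_len)
    power = np.abs(np.fft.rfft(frames, n_fft, axis=1)) ** 2 / n_fft
    log_mel = np.log(power @ mel_filterbank(n_filters, n_fft, sr).T + 1e-10)
    # Unnormalized DCT-II basis, keeping the first n_mfcc coefficients.
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.outer(np.arange(n_mfcc), 2 * n + 1) / (2 * n_filters))
    return log_mel @ basis.T

# Synthetic stand-in for one 2-second sample: 0.5 s of tone, the rest silence.
sr = 16000
signal = np.zeros(2 * sr)
t = np.arange(sr // 2) / sr
signal[12000:12000 + sr // 2] = np.sin(2 * np.pi * 440.0 * t)

speech = trim_silence(signal)
features = mfcc(speech, sr)
sample_vector = features.mean(axis=0)  # one fixed-length vector per sample
```

A vector such as `sample_vector` is the kind of fixed-length input a multilayer perceptron expects. Averaging over frames is only one way to obtain it; concatenating a fixed number of frames would preserve the consonant-to-vowel ordering that averaging discards.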


research
06/19/2013

Non-Correlated Character Recognition using Artificial Neural Network

This paper investigates a method of Handwritten English Character Recogn...
research
06/13/2020

GIPFA: Generating IPA Pronunciation from Audio

Transcribing spoken audio samples into International Phonetic Alphabet (...
research
02/02/2023

Simple method for detecting sleep episodes in rats ECoG using machine learning

In this paper we propose a new method for the automatic recognition of t...
research
12/14/2018

On Stacked Denoising Autoencoder based Pre-training of ANN for Isolated Handwritten Bengali Numerals Dataset Recognition

This work attempts to find the most optimal parameter setting of a deep ...
research
04/03/2013

Estimating Phoneme Class Conditional Probabilities from Raw Speech Signal using Convolutional Neural Networks

In hybrid hidden Markov model/artificial neural networks (HMM/ANN) autom...
research
05/11/2020

Exploring TTS without T Using Biologically/Psychologically Motivated Neural Network Modules (ZeroSpeech 2020)

In this study, we reported our exploration of Text-To-Speech without Tex...
