Pronunciation recognition of English phonemes /ə/, /æ/, /ɑː/ and /ʌ/ using Formants and Mel Frequency Cepstral Coefficients

02/23/2017
by Keith Y. Patarroyo, et al.

The Vocal Joystick Vowel Corpus, from the University of Washington, was used to study monophthongs pronounced by native English speakers. The objective of this study was to measure quantitatively the extent to which speech recognition methods can distinguish between similar-sounding vowels. In particular, the phonemes /ə/, /æ/, /ɑː/ and /ʌ/ were analysed. A total of 748 sound files from the corpus were processed with Linear Predictive Coding (LPC) to compute their formants, and with the Mel Frequency Cepstral Coefficients (MFCC) algorithm to compute their cepstral coefficients. A Decision Tree Classifier was used to build a predictive model that learnt the patterns of the first two formants measured in the data set, as well as the patterns of the 13 cepstral coefficients. Using formants, an accuracy of 70% was achieved for the four phonemes. Using MFCCs, an accuracy of 52% was achieved, rising to 71% when /ə/ was excluded. These results show that the studied algorithms are far from matching the ability of human hearing to distinguish subtle differences between sounds.
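To illustrate the formant step, the following is a minimal sketch of LPC-based formant estimation, not the authors' implementation. It assumes the autocorrelation method with a Levinson-Durbin recursion and reads formants off the peaks of the LPC spectral envelope; the abstract does not say how formants were extracted from the LPC coefficients, so peak-picking is an assumption here (root-finding on the LPC polynomial is a common alternative). The function names, the synthetic two-resonance test signal, the 8 kHz sampling rate and the model order of 4 are all hypothetical choices made for illustration.

```python
import cmath
import math


def autocorrelation(x, max_lag):
    """Biased autocorrelation r[0..max_lag] of the sequence x."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def lpc_coefficients(x, order):
    """Levinson-Durbin recursion; returns a[0..order] with a[0] == 1,
    the coefficients of the prediction-error filter A(z)."""
    r = autocorrelation(x, order)
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)
    return a


def lpc_envelope(a, freq_hz, fs):
    """Magnitude of the all-pole model 1/A(z) at one frequency."""
    w = 2.0 * math.pi * freq_hz / fs
    denom = sum(c * cmath.exp(-1j * w * k) for k, c in enumerate(a))
    return 1.0 / abs(denom)


def estimate_formants(x, fs, order=4, step_hz=5.0):
    """Return frequencies (Hz) of local maxima of the LPC envelope,
    in ascending order. The first two approximate F1 and F2."""
    a = lpc_coefficients(x, order)
    n_points = int((fs / 2.0) / step_hz) + 1
    freqs = [i * step_hz for i in range(n_points)]
    env = [lpc_envelope(a, f, fs) for f in freqs]
    return [freqs[i] for i in range(1, len(env) - 1)
            if env[i - 1] < env[i] >= env[i + 1]]


def synth_two_resonances(f1, f2, fs, n, r=0.97):
    """Hypothetical test signal: impulse response of two cascaded
    two-pole resonators centred at f1 and f2 (pole radius r)."""
    def section(f):
        theta = 2.0 * math.pi * f / fs
        return [1.0, -2.0 * r * math.cos(theta), r * r]

    s1, s2 = section(f1), section(f2)
    a = [0.0] * 5                           # product of the two sections
    for i, u in enumerate(s1):
        for j, v in enumerate(s2):
            a[i + j] += u * v
    y = []
    for t in range(n):
        acc = 1.0 if t == 0 else 0.0        # unit impulse excitation
        for k in range(1, 5):
            if t - k >= 0:
                acc -= a[k] * y[t - k]
        y.append(acc)
    return y


if __name__ == "__main__":
    fs = 8000
    signal = synth_two_resonances(700.0, 1200.0, fs, 800)
    print(estimate_formants(signal, fs, order=4)[:2])
```

On real recordings one would typically pre-emphasise and window each frame and use a higher model order than 4; the resulting (F1, F2) pairs are the kind of two-dimensional feature a decision tree classifier could then be trained on.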

