Speaker Independent Continuous Speech to Text Converter for Mobile Application

07/19/2013
by R. Sandanalakshmi, et al.

An efficient speech-to-text converter for mobile applications is presented in this work. The primary aim is to design a system that gives optimum performance in terms of complexity, accuracy, delay, and memory requirements in a mobile environment. The converter consists of two stages: front-end analysis and pattern recognition. Front-end analysis involves preprocessing and feature extraction. Traditional voice activity detection (VAD) algorithms that track only energy cannot reliably identify speech in the input, because the unwanted part of the signal also carries some energy and appears to be speech. The proposed system uses a VAD that separately measures the energy of the high-frequency part, in the form of the zero-crossing rate, to differentiate noise from speech. Mel Frequency Cepstral Coefficients (MFCC) are used for feature extraction, and a Generalized Regression Neural Network (GRNN) is used as the recognizer. MFCC provides a low word error rate and better feature extraction, and the neural network improves accuracy. Thus, a small database containing all possible syllable pronunciations of the user is sufficient to give recognition accuracy close to 100% in real-time speaker-independent applications such as mobile phones and PDAs.
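The VAD idea described in the abstract, combining frame energy with the zero-crossing rate, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, frame sizes, thresholds, and the decision rule (high energy together with a low zero-crossing rate counts as speech) are all assumptions chosen for illustration, and the abstract does not specify any of them.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames, one frame per row."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def short_time_energy(frames):
    """Mean squared amplitude of each frame."""
    return np.mean(frames ** 2, axis=1)

def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs whose sign differs, per frame."""
    signs = np.sign(frames)
    signs[signs == 0] = 1  # treat exact zeros as positive
    return np.mean(signs[:, 1:] != signs[:, :-1], axis=1)

def simple_vad(x, frame_len=256, hop=128, energy_thresh=0.01, zcr_thresh=0.3):
    """Return a boolean mask with one entry per frame (True = speech).

    Assumed rule: a frame is speech when its energy is above
    energy_thresh AND its zero-crossing rate is below zcr_thresh.
    Both thresholds are hypothetical and would need tuning on real data.
    """
    frames = frame_signal(np.asarray(x, dtype=float), frame_len, hop)
    energy = short_time_energy(frames)
    zcr = zero_crossing_rate(frames)
    return (energy > energy_thresh) & (zcr < zcr_thresh)

# Synthetic input: silence, a 200 Hz tone (speech-like), then a
# maximally high-frequency alternating signal (noise-like).
fs = 8000
n = np.arange(1024)
silence = np.zeros(1024)
tone = 0.5 * np.sin(2 * np.pi * 200 * n / fs)
hf_noise = 0.5 * np.where(n % 2 == 0, 1.0, -1.0)
mask = simple_vad(np.concatenate([silence, tone, hf_noise]))
```

With these assumed thresholds, an energy-only detector would accept the high-frequency segment, since its energy exceeds that of the tone. The zero-crossing test is what rejects it. Note that this simple rule can also reject unvoiced speech sounds, which have a high zero-crossing rate.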

