Multilingual and Multimode Phone Recognition System for Indian Languages

08/23/2019
by Kumud Tripathi, et al.
The aim of this paper is to develop a flexible framework that automatically recognizes the phonetic units in a speech utterance of any language spoken in any mode. In this study, we consider two modes of speech, conversation and read, in four Indian languages: Telugu, Kannada, Odia, and Bengali. The proposed approach consists of two stages: (1) automatic speech mode classification (SMC) and (2) automatic phonetic recognition using a mode-specific multilingual phone recognition system (MPRS). Vocal tract and excitation source features are considered for the SMC task, and the SMC systems are developed using a multilayer perceptron (MLP). Vocal tract, excitation source, and tandem features are then used to build the deep neural network (DNN)-based MPRSs. The performance of the proposed approach is compared with that of mode-dependent MPRSs. Experimental results show that the proposed approach, which combines SMC and MPRS into a single system, outperforms the baseline mode-dependent MPRSs.
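
The two-stage flow described in the abstract can be illustrated with a minimal Python sketch: a mode classifier first labels the utterance, and the label selects which mode-specific recognizer decodes it. This is only an illustration under stated assumptions, not the authors' implementation. The names `recognize_phones`, `toy_classifier`, and `toy_recognizers` are hypothetical, and the toy stand-ins below take the place of the MLP-based SMC and the DNN-based MPRSs, whose details the abstract does not give.

```python
from typing import Callable, Dict, List, Sequence

# One feature vector per frame (hypothetical representation).
Features = Sequence[Sequence[float]]


def recognize_phones(
    features: Features,
    classify_mode: Callable[[Features], str],
    recognizers: Dict[str, Callable[[Features], List[str]]],
) -> List[str]:
    """Stage 1: predict the speech mode. Stage 2: decode with that mode's recognizer."""
    mode = classify_mode(features)
    if mode not in recognizers:
        raise KeyError(f"no recognizer registered for mode {mode!r}")
    return recognizers[mode](features)


def toy_classifier(features: Features) -> str:
    # Placeholder rule for illustration only; the paper uses an MLP here.
    mean_first = sum(frame[0] for frame in features) / len(features)
    return "conversation" if mean_first > 0.5 else "read"


# Placeholder recognizers; the paper uses DNN-based multilingual recognizers.
toy_recognizers = {
    "read": lambda feats: ["read-phone"] * len(feats),
    "conversation": lambda feats: ["conv-phone"] * len(feats),
}

labels = recognize_phones([[0.9], [0.8]], toy_classifier, toy_recognizers)
```

Passing the classifier and the recognizers in as arguments keeps the routing logic independent of how either stage is trained, so either stage can be swapped without touching the other.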

research
10/22/2020

A multilingual approach to joint Speech and Accent Recognition with DNN-HMM framework

Human can perform multi-task recognition from speech. For instance, huma...
research
11/13/2017

Phonemic and Graphemic Multilingual CTC Based Speech Recognition

Training automatic speech recognition (ASR) systems requires large amoun...
research
09/06/2017

Spoken English Intelligibility Remediation with PocketSphinx Alignment and Feature Extraction Improves Substantially over the State of the Art

Automatic speech recognition is used to assess spoken English learner pr...
research
05/19/2022

Automatic Spoken Language Identification using a Time-Delay Neural Network

Closed-set spoken language identification is the task of recognizing the...
research
07/11/2021

Multilingual and crosslingual speech recognition using phonological-vector based phone embeddings

The use of phonological features (PFs) potentially allows language-speci...
research
07/24/2021

Differentiable Allophone Graphs for Language-Universal Speech Recognition

Building language-universal speech recognition systems entails producing...
research
07/06/2018

Tone Recognition Using Lifters and CTC

In this paper, we present a new method for recognizing tones in continuo...