Subword Dictionary Learning and Segmentation Techniques for Automatic Speech Recognition in Tamil and Kannada

07/27/2022
by Madhavaraj A, et al.

We present automatic speech recognition (ASR) systems for Tamil and Kannada based on subword modeling, which effectively handles the unlimited vocabulary arising from the highly agglutinative nature of these languages. We explore byte pair encoding (BPE), a proposed variant of this algorithm named extended-BPE, and the Morfessor tool to segment each word into subwords. We incorporate maximum likelihood (ML) and Viterbi estimation techniques within the weighted finite state transducer (WFST) framework in these algorithms to learn the subword dictionary from a large text corpus. Using the learnt subword dictionary, the words in the training transcriptions are segmented into subwords, and we train deep neural network ASR systems that recognize the subword sequence for any given test speech utterance. The output subword sequence is then post-processed using deterministic rules to obtain the final word sequence, so that the actual number of words that can be recognized is much larger. For Tamil ASR, we use 152 hours of data for training and 65 hours for testing, whereas for Kannada ASR, we use 275 hours for training and 72 hours for testing. Upon experimenting with different combinations of segmentation and estimation techniques, we find that the word error rate (WER) reduces drastically compared to the baseline word-level ASR, achieving a maximum absolute WER reduction of 6.24
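
To make the BPE step mentioned above concrete, here is a minimal sketch of standard BPE merge learning and segmentation. It is not the extended-BPE variant, nor the ML/Viterbi WFST estimation described in the abstract, whose details are not given here. The function names (`merge_pair`, `learn_bpe`, `segment`), the toy Latin-script corpus, and the tie-breaking rule are all assumptions made for illustration. Words are split by Unicode code point, which for Tamil or Kannada text would separate vowel signs from their base consonants; a real system would likely need a different initial symbol inventory.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(word_counts, num_merges):
    """Learn an ordered list of merges from a {word: frequency} mapping (plain BPE)."""
    # Start with every word split into single code points (an assumption).
    vocab = {tuple(word): count for word, count in word_counts.items()}
    merges = []
    for _ in range(num_merges):
        pair_counts = Counter()
        for symbols, count in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += count
        if not pair_counts:
            break  # every word is already a single symbol
        # Pick the most frequent pair; ties are broken lexicographically
        # (an arbitrary choice, made only so the result is deterministic).
        best = max(pair_counts, key=lambda p: (pair_counts[p], p))
        merges.append(best)
        vocab = {merge_pair(s, best): c for s, c in vocab.items()}
    return merges


def segment(word, merges):
    """Segment a word into subwords by replaying the learned merges in order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


# Toy corpus, hypothetical and not taken from the paper's data.
toy_counts = {"abab": 5, "abc": 2}
toy_merges = learn_bpe(toy_counts, num_merges=2)
```

With this toy corpus, the first learned merge joins `a` and `b`, and `segment("abc", toy_merges)` returns `["ab", "c"]`. Because segmentation replays merges on any input string, words absent from the training text can still be split into known subwords, which is the property that lets a subword recognizer cover a vocabulary much larger than its dictionary.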

research
07/27/2022

Knowledge-driven Subword Grammar Modeling for Automatic Speech Recognition in Tamil and Kannada

In this paper, we present specially designed automatic speech recognitio...
research
07/05/2021

Instant One-Shot Word-Learning for Context-Specific Neural Sequence-to-Sequence Speech Recognition

Neural sequence-to-sequence systems deliver state-of-the-art performance...
research
08/20/2023

Indonesian Automatic Speech Recognition with XLSR-53

This study focuses on the development of Indonesian Automatic Speech Rec...
research
03/11/2023

Transcription free filler word detection with Neural semi-CRFs

Non-linguistic filler words, such as "uh" or "um", are prevalent in spon...
research
03/29/2022

Short-Term Word-Learning in a Dynamically Changing Environment

Neural sequence-to-sequence automatic speech recognition (ASR) systems a...
research
07/09/2018

Foreign English Accent Adjustment by Learning Phonetic Patterns

State-of-the-art automatic speech recognition (ASR) systems struggle wit...
research
11/08/2015

Towards Structured Deep Neural Network for Automatic Speech Recognition

In this paper we propose the Structured Deep Neural Network (structured ...
