An Automatic Speech Recognition System for Bengali Language based on Wav2Vec2 and Transfer Learning

09/16/2022
by Tushar Talukder Showrav, et al.

Automatic speech recognition (ASR) is the automated decoding and transcription of spoken language. A typical ASR system extracts features from audio recordings or streams and runs one or more algorithms to map those features to the corresponding text. A great deal of research has been done in speech signal processing in recent years. Given adequate resources, both conventional ASR and emerging end-to-end (E2E) speech recognition have produced promising results. However, for low-resource languages like Bengali, the state of ASR lags behind, even though the language is spoken by over 500 million people all over the world. Despite its popularity, few diverse open-source datasets are available, which makes it difficult to conduct research on Bengali speech recognition systems. This paper is part of the competition named `BUET CSE Fest DL Sprint'. Its purpose is to improve speech recognition performance for the Bengali language by applying E2E speech recognition within a transfer learning framework. The proposed method effectively models the Bengali language and achieves a `Levenshtein Mean Distance' score of 3.819 on a test dataset of 7747 samples, using only 1000 samples of the training dataset for training.
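To make the reported metric concrete, the following is a minimal sketch of how a mean Levenshtein distance could be computed over a set of transcriptions. It assumes the score is the average character-level edit distance between each predicted transcript and its reference; the abstract does not spell out the exact definition used by the competition, and the function names `levenshtein` and `mean_levenshtein` are illustrative only.

```python
from typing import Sequence


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (insert, delete, substitute = 1 each).

    Assumption: distance is counted over Unicode code points, so a Bengali
    base character and its combining vowel sign count as separate symbols.
    """
    # Keep the shorter string in the inner loop to limit memory use.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,  # deletion from a
                    current[j - 1] + 1,  # insertion into a
                    previous[j - 1] + substitution_cost,  # substitution or match
                )
            )
        previous = current
    return previous[-1]


def mean_levenshtein(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Average edit distance over paired predictions and references."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not predictions:
        raise ValueError("at least one sample is required")
    total = sum(levenshtein(p, r) for p, r in zip(predictions, references))
    return total / len(predictions)
```

Under this reading, a lower score is better, and a score of 3.819 would mean that a predicted transcript differs from its reference by just under four character edits on average.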

research
10/11/2022

Automatic Speech Recognition of Low-Resource Languages Based on Chukchi

The following paper presents a project focused on the research and creat...
research
03/31/2022

Effectiveness of text to speech pseudo labels for forced alignment and cross lingual pretrained models for low resource speech recognition

In the recent years end to end (E2E) automatic speech recognition (ASR) ...
research
11/21/2019

Cantonese Automatic Speech Recognition Using Transfer Learning from Mandarin

We propose a system to develop a basic automatic speech recognizer(ASR) ...
research
05/05/2021

Accent Recognition with Hybrid Phonetic Features

The performance of voice-controlled systems is usually influenced by acc...
research
04/27/2023

Deep Transfer Learning for Automatic Speech Recognition: Towards Better Generalization

Automatic speech recognition (ASR) has recently become an important chal...
research
02/06/2021

A bandit approach to curriculum generation for automatic speech recognition

The Automated Speech Recognition (ASR) task has been a challenging domai...
research
02/02/2023

Improving Rare Words Recognition through Homophone Extension and Unified Writing for Low-resource Cantonese Speech Recognition

Homophone characters are common in tonal syllable-based languages, such ...
