Incorporating L2 Phonemes Using Articulatory Features for Robust Speech Recognition

06/05/2023
by Jisung Wang, et al.

The limited availability of non-native speech datasets makes it difficult for automatic speech recognition (ASR) systems to narrow the performance gap between native and non-native speakers. To address this, this study focuses on efficiently incorporating L2 phonemes, which in this work are Korean phonemes, through articulatory feature analysis. This enables accurate modeling of pronunciation variants and allows both native Korean and English speech datasets to be used. We train the acoustic model end-to-end with the lattice-free maximum mutual information (LF-MMI) objective so that it aligns and predicts one of multiple pronunciation candidates. Experimental results show that the proposed method improves ASR accuracy for Korean L2 speech when trained solely on L1 speech data. Furthermore, fine-tuning on L2 speech improves recognition accuracy for both L1 and L2 speech without performance trade-offs.
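
The abstract does not spell out how articulatory features are used to pick L2 phonemes, so the following is only a minimal sketch of one plausible reading: for each English phoneme, collect the Korean phonemes that share the most articulatory features with it, then expand a word into multiple pronunciation candidates. The feature sets, the phoneme labels (such as `ph_K`), and the function names `l2_candidates` and `pronunciation_variants` are all hypothetical and are not taken from the paper.

```python
from itertools import product

# Toy articulatory feature sets (major place, manner, voicing).
# Hypothetical, heavily simplified inventories for illustration only.
ENGLISH_FEATURES = {
    "f": {"labial", "fricative", "voiceless"},
    "n": {"coronal", "nasal", "voiced"},
}
KOREAN_FEATURES = {
    "ph_K": {"labial", "stop", "voiceless"},
    "s_K": {"coronal", "fricative", "voiceless"},
    "n_K": {"coronal", "nasal", "voiced"},
    "m_K": {"labial", "nasal", "voiced"},
}


def l2_candidates(phoneme):
    """Return the Korean phonemes sharing the most features with `phoneme`."""
    feats = ENGLISH_FEATURES.get(phoneme)
    if feats is None:
        # No feature entry (e.g. a vowel in this toy table): no L2 variant.
        return []
    overlap = {k: len(feats & kf) for k, kf in KOREAN_FEATURES.items()}
    best = max(overlap.values())
    return sorted(k for k, n in overlap.items() if n == best)


def pronunciation_variants(phonemes):
    """Enumerate every candidate pronunciation: canonical phoneme or an L2 match."""
    options = [[p] + l2_candidates(p) for p in phonemes]
    return [list(variant) for variant in product(*options)]


if __name__ == "__main__":
    for variant in pronunciation_variants(["f", "ae", "n"]):
        print(" ".join(variant))
```

In a setup like the one the abstract describes, a list of candidates such as this would be handed to the acoustic model, which is trained to align to and predict one of them. That training step (LF-MMI) is not shown here.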


