An Approach to Mispronunciation Detection and Diagnosis with Acoustic, Phonetic and Linguistic (APL) Embeddings

10/14/2021
by   Wenxuan Ye, et al.
0

Many mispronunciation detection and diagnosis (MD D) research approaches try to exploit both the acoustic and linguistic features as input. Yet the improvement of the performance is limited, partially due to the shortage of large amount annotated training data at the phoneme level. Phonetic embeddings, extracted from ASR models trained with huge amount of word level annotations, can serve as a good representation of the content of input speech, in a noise-robust and speaker-independent manner. These embeddings, when used as implicit phonetic supplementary information, can alleviate the data shortage of explicit phoneme annotations. We propose to utilize Acoustic, Phonetic and Linguistic (APL) embedding features jointly for building a more powerful MD&D system. Experimental results obtained on the L2-ARCTIC database show the proposed approach outperforms the baseline by 9.93 detection accuracy, diagnosis error rate and the F-measure, respectively.

READ FULL TEXT
research
02/15/2021

Leveraging Acoustic and Linguistic Embeddings from Pretrained speech and language Models for Intent Classification

Intent classification is a task in spoken language understanding. An int...
research
02/12/2021

Content-Aware Speaker Embeddings for Speaker Diarisation

Recent speaker diarisation systems often convert variable length speech ...
research
11/05/2020

Semi-supervised Learning for Singing Synthesis Timbre

We propose a semi-supervised singing synthesizer, which is able to learn...
research
04/08/2022

Transducer-based language embedding for spoken language identification

The acoustic and linguistic features are important cues for the spoken l...
research
02/21/2023

Leveraging phone-level linguistic-acoustic similarity for utterance-level pronunciation scoring

Recent studies on pronunciation scoring have explored the effect of intr...
research
04/13/2021

On the Impact of Knowledge-based Linguistic Annotations in the Quality of Scientific Embeddings

In essence, embedding algorithms work by optimizing the distance between...

Please sign up or login with your details

Forgot password? Click here to reset