Context-aware Goodness of Pronunciation for Computer-Assisted Pronunciation Training

08/19/2020
by Jiatong Shi, et al.

Mispronunciation detection is an essential component of Computer-Assisted Pronunciation Training (CAPT) systems. State-of-the-art mispronunciation detection models use Deep Neural Networks (DNNs) for acoustic modeling and a Goodness of Pronunciation (GOP) based algorithm for pronunciation scoring. However, GOP-based scoring models have two major limitations: (i) they depend on forced alignment, which splits the speech into phonetic segments and scores each segment independently, neglecting the transitions between phonemes within a segment; (ii) they focus only on individual phonetic segments and therefore fail to consider context effects across phonemes (such as liaison, omission, and incomplete plosive sounds). In this work, we propose the Context-aware Goodness of Pronunciation (CaGOP) scoring model. In particular, two factors, the transition factor and the duration factor, are injected into CaGOP scoring. The transition factor identifies the transitions between phonemes and uses them to weight the frame-wise GOP. Moreover, a self-attention based phonetic duration model is proposed to introduce the duration factor into the scoring model. The proposed scoring model significantly outperforms baselines, achieving 20% and 12% relative improvement over the baseline GOP model on phoneme-level and sentence-level mispronunciation detection, respectively.
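
To make the idea of weighting the frame-wise GOP concrete, the following is a minimal sketch and not the paper's formulation. It assumes a common DNN-based GOP definition (the average log posterior of the canonical phone over the frames of a forced-aligned segment) and takes the per-frame weights as given. The function names, the dictionary-based posterior format, and the example weights are all hypothetical; the abstract does not specify how CaGOP derives its transition weights.

```python
import math
from typing import Dict, List, Sequence

Frame = Dict[str, float]  # hypothetical format: phone label -> posterior probability


def frame_gop(posteriors: Sequence[Frame], phone: str) -> List[float]:
    """Per-frame log posterior of the canonical phone (floored to avoid log(0))."""
    return [math.log(max(frame.get(phone, 0.0), 1e-10)) for frame in posteriors]


def segment_gop(posteriors: Sequence[Frame], phone: str) -> float:
    """Baseline segment score: unweighted mean of the frame-wise scores."""
    scores = frame_gop(posteriors, phone)
    if not scores:
        raise ValueError("segment has no frames")
    return sum(scores) / len(scores)


def weighted_segment_gop(
    posteriors: Sequence[Frame], phone: str, weights: Sequence[float]
) -> float:
    """Weighted mean of the frame-wise scores.

    The weights stand in for a transition factor (an assumption for
    illustration): frames judged to lie in a phoneme transition get a
    smaller weight.
    """
    scores = frame_gop(posteriors, phone)
    if len(weights) != len(scores):
        raise ValueError("need exactly one weight per frame")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * s for w, s in zip(weights, scores)) / total


if __name__ == "__main__":
    # Toy segment aligned to the phone "ae"; the first frame is ambiguous,
    # as it might be near a phoneme boundary.
    segment = [
        {"ae": 0.5, "eh": 0.5},
        {"ae": 0.9, "eh": 0.1},
        {"ae": 0.9, "eh": 0.1},
    ]
    print(segment_gop(segment, "ae"))
    print(weighted_segment_gop(segment, "ae", [0.0, 1.0, 1.0]))
```

In this toy segment, down-weighting the ambiguous first frame raises the segment score relative to the unweighted mean, which illustrates why treating transition frames differently can change a mispronunciation decision made by thresholding the score.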
