Goodness of Pronunciation Pipelines for OOV Problem

09/08/2022
by Ankit Grover, et al.

In this report, we propose pipelines for Goodness of Pronunciation (GoP) computation that solve the out-of-vocabulary (OOV) problem at test time using vocabulary/lexicon expansion techniques. The pipelines use different components of an ASR system to quantify accent and automatically express it as scores. We use the posteriors of an ASR model trained on native English speech, along with phone-level boundaries, to obtain phone-level pronunciation scores. We use this as a baseline pipeline and implement methods to remove UNK and SPN phonemes from the GoP output by building three pipelines: Online, Offline, and Hybrid. Each pipeline returns the scores and can also prevent unknown words from appearing in the final output. The Online method operates per utterance, the Offline method pre-incorporates a set of OOV words for a given dataset, and the Hybrid method combines the two ideas, expanding the lexicon in advance while also working per utterance. We further provide utilities such as phoneme-to-posterior mappings, the GoP scores of each utterance as a vector, and the word boundaries used in the GoP pipeline, for use in future research.
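To make the scoring step concrete, the following is a minimal sketch of one common posterior-based GoP formulation, not necessarily the exact one used in the report. It assumes frame-level phone posteriors and phone-level boundaries are already available. The function name `gop_scores`, the `(phone_id, start, end)` segment layout, and the `eps` smoothing constant are illustrative assumptions. For each segment, the score is the average log posterior of the canonical phone minus the best average log posterior of any phone, so it is 0 when the canonical phone is the most likely one and negative otherwise.

```python
import math


def gop_scores(posteriors, segments, eps=1e-10):
    """Compute a posterior-based GoP score for each phone segment.

    posteriors: list of frames; each frame is a list of phone probabilities.
    segments:   list of (phone_id, start_frame, end_frame), end exclusive.
                (Hypothetical layout chosen for this sketch.)
    eps:        small constant to avoid log(0) (an assumption of this sketch).

    Returns a list of (phone_id, gop) tuples, one per segment.
    """
    scores = []
    for phone_id, start, end in segments:
        frames = posteriors[start:end]
        if not frames:
            raise ValueError(f"empty segment for phone {phone_id}: [{start}, {end})")
        n_phones = len(frames[0])
        # Average log posterior of every phone over the frames of this segment.
        avg_logp = [
            sum(math.log(frame[q] + eps) for frame in frames) / len(frames)
            for q in range(n_phones)
        ]
        # Canonical phone versus the best competing phone; the result is <= 0.
        gop = avg_logp[phone_id] - max(avg_logp)
        scores.append((phone_id, gop))
    return scores


# Toy example: 3 phone classes, 4 frames, 2 aligned segments.
toy_posteriors = [
    [0.8, 0.1, 0.1],
    [0.8, 0.1, 0.1],
    [0.6, 0.3, 0.1],
    [0.6, 0.3, 0.1],
]
toy_segments = [(0, 0, 2), (1, 2, 4)]
toy_result = gop_scores(toy_posteriors, toy_segments)
```

In the toy example, the first segment scores 0 because its canonical phone has the highest posterior, while the second scores about log(0.5) because a competing phone is twice as likely. The per-utterance GoP vector mentioned in the abstract would correspond to the list of scores returned here.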


