Foreign English Accent Adjustment by Learning Phonetic Patterns

07/09/2018
by   Fedor Kitashov, et al.
0

State-of-the-art automatic speech recognition (ASR) systems struggle with the lack of data for rare accents. For sufficiently large datasets, neural engines tend to outshine statistical models in most natural language processing problems. However, a speech accent remains a challenge for both approaches. Phonologists manually create general rules describing a speaker's accent, but their results remain underutilized. In this paper, we propose a model that automatically retrieves phonological generalizations from a small dataset. This method leverages the difference in pronunciation between a particular dialect and General American English (GAE) and creates new accented samples of words. The proposed model is able to learn all generalizations that previously were manually obtained by phonologists. We use this statistical method to generate a million phonological variations of words from the CMU Pronouncing Dictionary and train a sequence-to-sequence RNN to recognize accented words with 59 accuracy.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
07/05/2021

Instant One-Shot Word-Learning for Context-Specific Neural Sequence-to-Sequence Speech Recognition

Neural sequence-to-sequence systems deliver state-of-the-art performance...
research
02/27/2020

SkinAugment: Auto-Encoding Speaker Conversions for Automatic Speech Translation

We propose autoencoding speaker conversion for training data augmentatio...
research
09/13/2019

A Comparative Study on Transformer vs RNN in Speech Applications

Sequence-to-sequence models have been widely used in end-to-end speech p...
research
07/27/2022

Subword Dictionary Learning and Segmentation Techniques for Automatic Speech Recognition in Tamil and Kannada

We present automatic speech recognition (ASR) systems for Tamil and Kann...
research
04/03/2021

On-the-Fly Aligned Data Augmentation for Sequence-to-Sequence ASR

We propose an on-the-fly data augmentation method for automatic speech r...
research
10/12/2022

Can we use Common Voice to train a Multi-Speaker TTS system?

Training of multi-speaker text-to-speech (TTS) systems relies on curated...
research
06/12/2017

Acoustic data-driven lexicon learning based on a greedy pronunciation selection framework

Speech recognition systems for irregularly-spelled languages like Englis...

Please sign up or login with your details

Forgot password? Click here to reset