Self-imitating Feedback Generation Using GAN for Computer-Assisted Pronunciation Training

04/20/2019
by   Seung Hee Yang, et al.

Self-imitating feedback is an effective and learner-friendly method for non-native learners in Computer-Assisted Pronunciation Training. Acoustic characteristics of native utterances are extracted and transplanted onto the learner's own speech input, and the result is given back to the learner as corrective feedback. Previous work focused on speech conversion using prosodic transplantation techniques based on the PSOLA algorithm. Motivated by the visual differences found in spectrograms of native and non-native speech, we investigate applying a GAN to generate self-imitating feedback, exploiting the mapping that the generator learns through adversarial training. Because this mapping is highly under-constrained, we also adopt a cycle consistency loss to encourage the output to preserve the global structure that native and non-native utterances share. Trained on 97,200 spectrogram images of short utterances produced by native and non-native speakers of Korean, the generator successfully transforms a non-native spectrogram input into a spectrogram with the properties of self-imitating feedback. Furthermore, the transformed spectrogram shows segmental corrections that cannot be obtained by prosodic transplantation. A perceptual test comparing the self-imitating and correcting abilities of our method with the baseline PSOLA method shows that the generative approach with cycle consistency loss is promising.
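The cycle consistency idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact form of the loss, so the code below assumes the common formulation: the mean absolute (L1) reconstruction error after mapping a spectrogram to the other domain and back, summed over both directions. The names `cycle_consistency_loss`, `l1_distance`, and `scale`, and the toy "generators" that merely rescale values, are hypothetical stand-ins for illustration, not the authors' implementation.

```python
from typing import Callable, List

# A spectrogram is modelled as a 2-D list: rows = frequency bins, columns = time frames.
Spectrogram = List[List[float]]
Generator = Callable[[Spectrogram], Spectrogram]


def l1_distance(a: Spectrogram, b: Spectrogram) -> float:
    """Mean absolute difference between two equally sized spectrograms."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(a, b):
        for va, vb in zip(row_a, row_b):
            total += abs(va - vb)
            count += 1
    return total / count


def cycle_consistency_loss(
    x_nonnative: Spectrogram,
    y_native: Spectrogram,
    g: Generator,  # assumed mapping: non-native -> native
    f: Generator,  # assumed mapping: native -> non-native
) -> float:
    """L1 reconstruction error of both round trips (assumed formulation)."""
    forward = l1_distance(f(g(x_nonnative)), x_nonnative)  # x -> G(x) -> F(G(x))
    backward = l1_distance(g(f(y_native)), y_native)       # y -> F(y) -> G(F(y))
    return forward + backward


def scale(factor: float) -> Generator:
    """Toy stand-in for a trained generator: multiplies every bin by `factor`."""
    def apply(spec: Spectrogram) -> Spectrogram:
        return [[v * factor for v in row] for row in spec]
    return apply


x = [[1.0, 2.0], [3.0, 4.0]]  # toy non-native spectrogram
y = [[2.0, 2.0], [2.0, 2.0]]  # toy native spectrogram

# When f exactly inverts g, both round trips reproduce their input.
loss_inverse = cycle_consistency_loss(x, y, scale(2.0), scale(0.5))

# When f does not undo g, the round trips drift and the loss grows.
loss_mismatch = cycle_consistency_loss(x, y, scale(2.0), scale(1.0))

print(loss_inverse, loss_mismatch)
```

In a full training setup this term would be added to the adversarial losses of the two generators, which is what discourages a generator from producing a native-looking spectrogram unrelated to the learner's input.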

research
12/08/2022

DDSupport: Language Learning Support System that Displays Differences and Distances from Model Speech

When beginners learn to speak a non-native language, it is difficult for...
research
01/10/2020

Improving Dysarthric Speech Intelligibility Using Cycle-consistent Adversarial Training

Dysarthria is a motor speech impairment affecting millions of people. Dy...
research
10/07/2021

Applying Phonological Features in Multilingual Text-To-Speech

This study investigates whether phonological features can be applied in ...
research
09/09/2021

'1e0a': A Computational Approach to Rhythm Training

We present a computational assessment system that promotes the learning ...
research
09/12/2020

Corrective feedback, emphatic speech synthesis, visual-speech exaggeration, pronunciation learning

To provide more discriminative feedback for the second language (L2) lea...
research
11/25/2020

Neural Representations for Modeling Variation in English Speech

Variation in speech is often represented and investigated using phonetic...
research
07/02/2022

Computer-assisted Pronunciation Training – Speech synthesis is almost all you need

The research community has long studied computer-assisted pronunciation ...
