A New Approach to Accent Recognition and Conversion for Mandarin Chinese

by Lin Ai, et al.

Two new approaches are presented and explored, one to accent classification and one to accent conversion. The first topic is Chinese accent classification (recognition). The second topic is the use of encoder-decoder models for end-to-end Chinese accent conversion, where the classifier from the first topic is used to train the encoder-decoder accent converter. Experiments on accent recognition are performed using different features and models. The features are MFCCs and spectrograms; the classifier models are a TDNN and a 1D-CNN. On the MAGICDATA dataset with 5 classes of accents, the TDNN classifier trained on MFCC features achieved a test accuracy of 54%, while a classifier trained on spectrograms achieved a test accuracy of 62%. A prototype of an end-to-end accent converter model is also presented. The converter model comprises an encoder and a decoder. The encoder converts an accented input into an accent-neutral form. The decoder converts an accent-neutral form into an accented form, with the target accent specified by the input accent label. The converter prototype preserves the tone but loses detail in the output audio. The encoder-decoder structure nonetheless shows potential as an effective accent converter. A proposal for future improvements is also presented to address the loss of detail in the decoder output.
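The data flow of the converter described in the abstract can be illustrated with a small sketch. This is not the authors' model: the weights are random placeholders rather than trained parameters, the layers are single linear maps, and the names `AccentConverterSketch`, `FEATURE_DIM`, and `LATENT_DIM` are hypothetical. Only the count of 5 accent classes comes from the abstract. The sketch shows the interface alone: the encoder sees no accent label, and the decoder receives the accent-neutral form together with a one-hot target accent label. How the classifier is used during training is not shown.

```python
import random

NUM_ACCENTS = 5   # the abstract reports 5 accent classes
FEATURE_DIM = 8   # hypothetical size of one input feature frame
LATENT_DIM = 4    # hypothetical size of the accent-neutral form


def make_matrix(rows, cols, rng):
    """Random placeholder weights; a real model would learn these."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def linear(matrix, vector):
    """Matrix-vector product: one output value per matrix row."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def one_hot(index, size):
    """Encode an accent id as a one-hot label vector."""
    if not 0 <= index < size:
        raise ValueError("accent id out of range")
    return [1.0 if i == index else 0.0 for i in range(size)]


class AccentConverterSketch:
    """Untrained illustration of the encoder-decoder interface."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        # Encoder maps a feature frame to the accent-neutral form.
        self.enc_w = make_matrix(LATENT_DIM, FEATURE_DIM, rng)
        # Decoder input is the accent-neutral form plus the accent label.
        self.dec_w = make_matrix(FEATURE_DIM, LATENT_DIM + NUM_ACCENTS, rng)

    def encode(self, frames):
        """Accented frames -> accent-neutral frames (no label involved)."""
        return [linear(self.enc_w, frame) for frame in frames]

    def decode(self, latents, accent_id):
        """Accent-neutral frames + target accent label -> accented frames."""
        label = one_hot(accent_id, NUM_ACCENTS)
        return [linear(self.dec_w, z + label) for z in latents]

    def convert(self, frames, target_accent):
        return self.decode(self.encode(frames), target_accent)


if __name__ == "__main__":
    rng = random.Random(1)
    frames = [[rng.random() for _ in range(FEATURE_DIM)] for _ in range(3)]
    converted = AccentConverterSketch().convert(frames, target_accent=2)
    print(len(converted), len(converted[0]))  # 3 8
```

Because the label enters only at the decoder, the same encoded frames can be decoded into any of the five accents by changing `target_accent`.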


