Transformer based Grapheme-to-Phoneme Conversion

04/14/2020
by   Sevinj Yolchuyeva, et al.

The attention mechanism is one of the most successful techniques in deep-learning-based Natural Language Processing (NLP). The transformer network architecture is based entirely on attention mechanisms and outperforms sequence-to-sequence models in neural machine translation without using recurrent or convolutional layers. Grapheme-to-phoneme (G2P) conversion is the task of converting letters (a grapheme sequence) to their pronunciation (a phoneme sequence). It plays a significant role in text-to-speech (TTS) and automatic speech recognition (ASR) systems. In this paper, we investigate the application of the transformer architecture to G2P conversion and compare its performance with recurrent and convolutional neural network based approaches. Phoneme and word error rates are evaluated on the CMUDict dataset for US English and on the NetTalk dataset. The results show that transformer-based G2P outperforms the convolutional approach in terms of word error rate, and our results significantly exceed those of previous recurrent approaches (without attention) in both word and phoneme error rates on both datasets. Furthermore, the proposed model is much smaller than the previous approaches.
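To make the G2P task concrete, here is a minimal, hypothetical sketch (not the paper's transformer model): a dictionary baseline that maps a grapheme sequence to a CMUDict-style ARPAbet phoneme sequence. The toy lexicon entries are assumptions for illustration; such a lookup fails on out-of-vocabulary words, which is what motivates learned sequence-to-sequence G2P models like the one studied in this paper.

```python
# Minimal, hypothetical sketch of the G2P task (not the paper's model):
# a dictionary baseline maps a grapheme sequence (letters) to a phoneme
# sequence (ARPAbet symbols in the style of CMUDict). The entries below
# are illustrative; digits on vowels mark lexical stress.
TOY_LEXICON = {
    "cat": ["K", "AE1", "T"],
    "phoneme": ["F", "OW1", "N", "IY0", "M"],
}

def g2p_lookup(word):
    """Return the phoneme sequence for a known word, or None for an
    out-of-vocabulary word -- the case that requires a learned
    sequence-to-sequence model instead of a fixed dictionary."""
    return TOY_LEXICON.get(word.lower())
```

A neural G2P model replaces the lookup with a trained encoder-decoder so that unseen words (names, neologisms) still receive plausible pronunciations.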


Related research:
- 04/28/2018: Syllable-Based Sequence-to-Sequence Speech Recognition with the Transformer in Mandarin Chinese
- 04/20/2020: WHALETRANS: E2E WHisper to nAturaL spEech conversion using modified TRANSformer network
- 09/24/2022: A Deep Investigation of RNN and Self-attention for the Cyrillic-Traditional Mongolian Bidirectional Conversion
- 02/22/2019: Fast Multi-language LSTM-based Online Handwriting Recognition
- 08/02/2022: Multi-Module G2P Converter for Persian Focusing on Relations between Words
- 03/02/2023: LiteG2P: A fast, light and high accuracy model for grapheme-to-phoneme conversion
- 04/06/2019: Token-Level Ensemble Distillation for Grapheme-to-Phoneme Conversion
