A Unified Transformer-based Framework for Duplex Text Normalization

08/23/2021
by Tuan Manh Lai, et al.

Text normalization (TN) and inverse text normalization (ITN) are essential preprocessing and postprocessing steps for text-to-speech synthesis and automatic speech recognition, respectively. Many methods have been proposed for either TN or ITN, ranging from weighted finite-state transducers to neural networks. Despite their impressive performance, these methods address only one of the two tasks. As a result, a complete spoken dialog system requires two separate models, one for TN and one for ITN. This heterogeneity increases the technical complexity of the system, which in turn increases the cost of maintenance in a production setting. Motivated by this observation, we propose a unified framework for building a single neural duplex system that can handle both TN and ITN. Combined with a simple but effective data augmentation method, our systems achieve state-of-the-art results on the Google TN dataset for English and Russian. They can also reach over 95% sentence-level accuracy on an internal English TN dataset without any additional fine-tuning. We also create a cleaned dataset from the Spoken Wikipedia Corpora for German and report the performance of our systems on it. Overall, experimental results demonstrate that the proposed duplex text normalization framework is highly effective and applicable to a range of domains and languages.

research
04/11/2021

NeMo Inverse Text Normalization: From Development To Production

Inverse text normalization (ITN) converts spoken-domain automatic speech...
research
10/26/2022

Four-in-One: A Joint Approach to Inverse Text Normalization, Punctuation, Capitalization, and Disfluency for Automatic Speech Recognition

Features such as punctuation, capitalization, and formatting of entities...
research
01/20/2023

Language Agnostic Data-Driven Inverse Text Normalization

With the emergence of automatic speech recognition (ASR) models, convert...
research
02/12/2021

Neural Inverse Text Normalization

While there have been several contributions exploring state of the art t...
research
03/31/2022

indic-punct: An automatic punctuation restoration and inverse text normalization framework for Indic languages

Automatic Speech Recognition (ASR) generates text which is most of the t...
research
04/15/2021

Proteno: Text Normalization with Limited Data for Fast Deployment in Text to Speech Systems

Developing Text Normalization (TN) systems for Text-to-Speech (TTS) on n...
research
06/27/2023

Automatic Annotation of Direct Speech in Written French Narratives

The automatic annotation of direct speech (AADS) in written text has bee...
