PerSpeechNorm: A Persian Toolkit for Speech Processing Normalization

11/01/2021
by Romina Oji, et al.

In general, speech processing models consist of a language model along with an acoustic model. Regardless of the language model's complexity and variants, three critical pre-processing steps are needed: cleaning, normalization, and tokenization. Among these steps, normalization is essential for format unification in purely textual applications. For language models embedded in speech processing modules, however, normalization is not limited to format unification: it must also convert each readable symbol, number, etc., to the way it is pronounced. To the best of our knowledge, there are no Persian normalization toolkits for language models embedded in speech processing modules. In this paper, we therefore propose an open-source normalization toolkit for text processing in speech applications. Briefly, we consider different kinds of readable Persian text, such as symbols (common currencies, #, @, URL, etc.), numbers (date, time, phone number, national code, etc.), and so on. Comparison with other available Persian textual normalization tools indicates the superiority of the proposed method in speech processing. In addition, comparing the performance of one of the proposed functions (sentence separation) with that of other common natural language libraries, such as HAZM and Parsivar, indicates the proper performance of the proposed method. Its evaluation on some Persian Wikipedia data further confirms this.
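To make the idea of converting readable symbols and numbers into their spoken form concrete, here is a minimal Python sketch. It is not the toolkit's API: the function names (`number_to_words`, `verbalize`), the lookup tables, and the restriction to numbers from 0 to 99 and to a few symbols are all assumptions made for illustration only.

```python
import re

# Hypothetical lookup tables for illustration; the toolkit's own tables
# are not given in the abstract.
ONES = ["صفر", "یک", "دو", "سه", "چهار", "پنج", "شش", "هفت", "هشت", "نه"]
TEENS = ["ده", "یازده", "دوازده", "سیزده", "چهارده",
         "پانزده", "شانزده", "هفده", "هجده", "نوزده"]
TENS = ["", "", "بیست", "سی", "چهل", "پنجاه", "شصت", "هفتاد", "هشتاد", "نود"]

# Assumed spoken forms for a few symbols (percent signs and the dollar sign).
SYMBOLS = {"%": "درصد", "٪": "درصد", "$": "دلار"}

# Map Persian and Arabic-Indic digits to ASCII digits so one regex handles all.
DIGIT_MAP = str.maketrans("۰۱۲۳۴۵۶۷۸۹٠١٢٣٤٥٦٧٨٩", "01234567890123456789")


def number_to_words(n: int) -> str:
    """Spell out an integer in the range 0..99 in Persian words."""
    if not 0 <= n <= 99:
        raise ValueError("this sketch only covers 0..99")
    if n < 10:
        return ONES[n]
    if n < 20:
        return TEENS[n - 10]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]} و {ONES[ones]}"


def verbalize(text: str) -> str:
    """Replace one- and two-digit numbers and known symbols with spoken forms."""
    text = text.translate(DIGIT_MAP)
    # Only standalone runs of one or two digits are spelled out;
    # longer numbers are left untouched in this sketch.
    text = re.sub(r"(?<!\d)\d{1,2}(?!\d)",
                  lambda m: number_to_words(int(m.group())), text)
    for symbol, spoken in SYMBOLS.items():
        text = text.replace(symbol, f" {spoken} ")
    # Collapse the extra whitespace introduced by the replacements.
    return " ".join(text.split())
```

A full normalizer for speech would also need to handle dates, times, phone numbers, national codes, and URLs, each of which is read aloud differently; this sketch covers none of those cases.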

research
03/27/2015

Normalization of Non-Standard Words in Croatian Texts

This paper presents text normalization which is an integral part of any ...
research
06/22/2023

AudioPaLM: A Large Language Model That Can Speak and Listen

We introduce AudioPaLM, a large language model for speech understanding ...
research
05/22/2023

Textually Pretrained Speech Language Models

Speech language models (SpeechLMs) process and generate acoustic data on...
research
09/14/2019

NeMo: a toolkit for building AI applications using Neural Modules

NeMo (Neural Modules) is a Python framework-agnostic toolkit for creatin...
research
06/13/2019

Telephonetic: Making Neural Language Models Robust to ASR and Semantic Noise

Speech processing systems rely on robust feature extraction to handle ph...
research
08/17/2021

Modulating Language Models with Emotions

Generating context-aware language that embodies diverse emotions is an i...
research
06/25/2020

Normalizing Text using Language Modelling based on Phonetics and String Similarity

Social media networks and chatting platforms often use an informal versi...
