LIP: Lightweight Intelligent Preprocessor for meaningful text-to-speech

by Harshvardhan Anand, et al.

Existing Text-to-Speech (TTS) systems must read content ranging from emails, which may contain Personally Identifiable Information (PII), to text messages, which can contain streaks of emojis and punctuation. 92% of the world's online population uses emoji, with more than 10 billion emojis sent every day. Without a preprocessor, messages are read as-is, including punctuation and infographics such as emoticons. The problem worsens with continuous sequences of punctuation or emojis, which are common in real-world communication such as messaging and Social Networking Site (SNS) interactions. In this work, we introduce a lightweight intelligent preprocessor (LIP) that enhances the readability of a message before it is passed downstream to existing TTS systems. The preprocessor comprises multiple sub-modules, including contraction expansion, swear-word censoring, and PII masking. With a memory footprint of only 3.55 MB and an inference time of 4 ms for text of up to 50 characters, our solution is suitable for real-time deployment. As this work is the first of its kind, we benchmark it with an open, independent survey, whose results show a 76.5% preference for the LIP-enabled TTS engine over standard TTS.
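To make the sub-modules named in the abstract concrete, the following is a minimal rule-based sketch in Python. It is not the authors' implementation: the abstract does not describe how LIP works internally. The lookup tables, the regular expressions, the placeholder strings such as "[email]", and all function names are hypothetical choices made for illustration only.

```python
import re

# Hypothetical lookup tables; a real system would need far larger ones.
CONTRACTIONS = {"can't": "cannot", "won't": "will not", "i'm": "I am", "it's": "it is"}
SWEAR_WORDS = ["damn", "hell"]

# Illustrative patterns only; they do not cover every email or phone format.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE_RE = re.compile(r"\+?\d[\d\s-]{7,}\d")
CONTRACTION_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in CONTRACTIONS) + r")\b", re.IGNORECASE
)
SWEAR_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(w) for w in SWEAR_WORDS) + r")\b", re.IGNORECASE
)
# A run of the same non-word, non-space character (punctuation or a
# single-code-point emoji). Multi-code-point emoji are not handled.
REPEAT_RE = re.compile(r"([^\w\s])\1+")


def mask_pii(text):
    """Replace email addresses and phone-like digit runs with placeholders."""
    text = EMAIL_RE.sub("[email]", text)
    return PHONE_RE.sub("[phone number]", text)


def expand_contractions(text):
    """Expand known contractions using the lookup table."""
    return CONTRACTION_RE.sub(lambda m: CONTRACTIONS[m.group(0).lower()], text)


def censor(text):
    """Replace listed swear words with a neutral placeholder."""
    return SWEAR_RE.sub("[censored]", text)


def collapse_repeats(text):
    """Collapse a streak of one repeated symbol into a single occurrence."""
    return REPEAT_RE.sub(r"\1", text)


def preprocess(text):
    """Run all steps; PII is masked first so later steps see placeholders."""
    for step in (mask_pii, expand_contractions, censor, collapse_repeats):
        text = step(text)
    return text


print(preprocess("I can't wait!!! 😂😂😂 mail me at a.b@example.com"))
```

The output of `preprocess` is the text that would then be handed to a downstream TTS engine. Fixed rules like these will miss many real-world cases; they only illustrate the kind of transformation each sub-module performs.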




STRIDE : Scene Text Recognition In-Device

Optical Character Recognition (OCR) systems have been widely used in var...

Modeling Time to Open of Emails with a Latent State for User Engagement Level

Email messages have been an important mode of communication, not only fo...

Towards an Effective Organization-Wide Bulk Email System

Bulk email is widely used in organizations to communicate messages to em...

VoiceMoji: A Novel On-Device Pipeline for Seamless Emoji Insertion in Dictation

Most of the speech recognition systems recover only words in the speech ...

Real-Time Text Detection and Recognition

In recent years, Convolutional Neural Network (CNN) is quite a popular topic,...

"Short is the Road that Leads from Fear to Hate": Fear Speech in Indian WhatsApp Groups

WhatsApp is the most popular messaging app in the world. Due to its popu...

A speech-based driver assisting module for Intelligent Transport System

Aim of this research is to transform images of roadside traffic panels t...
