VoiceMoji: A Novel On-Device Pipeline for Seamless Emoji Insertion in Dictation

12/22/2021
by Sumit Kumar, et al.

Most speech recognition systems recover only the words in speech and fail to capture emotion. Users must manually add emoji(s) to the text to convey tone and make communication fun. Although much work has been done on adding punctuation to transcribed speech, the area of emotion addition remains untouched. In this paper, we propose a novel on-device pipeline to enrich the voice input experience. Given a blob of transcribed text, the pipeline intelligently processes it and identifies structure where emoji insertion makes sense. It then performs semantic text analysis to predict an emoji for each sub-part, for which we propose a novel architecture, the Attention-based Char Aware (ACA) LSTM, which also handles Out-Of-Vocabulary (OOV) words. All these tasks are executed completely on-device and hence can aid on-device dictation systems. To the best of our knowledge, this is the first work that shows how to add emoji(s) to transcribed text. We demonstrate that our components achieve comparable results to previous neural approaches for punctuation addition and emoji prediction with 80 model has a very small memory footprint of a mere 4MB to suit on-device deployment.
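
The two-stage flow described in the abstract (find sub-parts of the transcribed text, then predict an emoji for each sub-part) can be illustrated with a small toy sketch. This is not the authors' implementation: the cue-word splitter and the keyword lexicon below are hypothetical stand-ins for the learned boundary model and the ACA LSTM, and all names (`split_subparts`, `predict_emoji`, `insert_emojis`, `TOY_BOUNDARY_WORDS`, `TOY_EMOJI_LEXICON`) are assumptions made for illustration only.

```python
from typing import Dict, List

# Toy keyword table standing in for the learned emoji predictor (assumption;
# the paper uses the ACA LSTM for this stage).
TOY_EMOJI_LEXICON: Dict[str, str] = {
    "happy": "😀",
    "birthday": "🎂",
    "sad": "😢",
}

# Toy cue words that open a new sub-part (assumption; the paper learns where
# emoji insertion makes sense instead of using a fixed word list).
TOY_BOUNDARY_WORDS = {"and", "but", "so"}


def split_subparts(text: str) -> List[str]:
    """Split unpunctuated text into sub-parts, breaking before each cue word."""
    parts: List[List[str]] = [[]]
    for word in text.split():
        # Start a new sub-part at a cue word, unless the current one is empty.
        if word.lower() in TOY_BOUNDARY_WORDS and parts[-1]:
            parts.append([])
        parts[-1].append(word)
    return [" ".join(p) for p in parts if p]


def predict_emoji(subpart: str) -> str:
    """Return the emoji for the first lexicon hit, or '' if nothing matches."""
    for word in subpart.lower().split():
        if word in TOY_EMOJI_LEXICON:
            return TOY_EMOJI_LEXICON[word]
    return ""


def insert_emojis(text: str) -> str:
    """Run both stages and append each predicted emoji to its sub-part."""
    out: List[str] = []
    for part in split_subparts(text):
        emoji = predict_emoji(part)
        out.append(f"{part} {emoji}" if emoji else part)
    return " ".join(out)


if __name__ == "__main__":
    print(insert_emojis("happy birthday to you but i am sad i missed it"))
```

Keeping the two stages as separate functions mirrors the pipeline structure in the abstract, so either toy stage could be swapped for a trained model without changing the other.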


