Neural Networks for Text Correction and Completion in Keyboard Decoding

09/19/2017
by Shaona Ghosh, et al.

Despite the ubiquity of mobile and wearable text messaging applications, and the enormous success of deep learning Recurrent Neural Networks (RNNs) and Convolutional Neural Networks (CNNs) for natural language understanding, the problem of keyboard text decoding has not been tackled sufficiently. In particular, keyboard decoders must operate on devices with constrained memory and processor resources, which makes deploying industrial-scale deep neural network (DNN) models challenging. This paper proposes a sequence-to-sequence neural attention network system for automatic text correction and completion. Given an erroneous sequence, our model encodes character-level hidden representations and then decodes the revised sequence, thus enabling auto-correction and completion. We achieve this with a combination of a character-level CNN and gated recurrent unit (GRU) encoder and a word-level GRU attention decoder. Unlike traditional language models that learn from billions of words, our corpus size is only 12 million words, orders of magnitude smaller. The memory footprint of our learnt model for inference and prediction is likewise an order of magnitude smaller than that of conventional language-model-based text decoders. We report baseline performance for neural keyboard decoders in this limited-data domain. Our models achieve a word-level accuracy of 90% and a character error rate (CER) of 2.4% on the Twitter typo dataset. We also present a novel dataset of noisy-to-corrected mappings, built by inducing the noise distribution estimated from the Twitter data over the OpenSubtitles 2009 dataset; on this dataset our model predicts with a word-level accuracy of 98% and a sequence accuracy of 68.9%. In our user study, our model achieved an average CER of 2.6%, against 1.6% for the state-of-the-art non-neural touch-screen keyboard decoder.
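The architecture sketched in the abstract maps naturally onto a compact implementation. Below is a minimal PyTorch sketch of a character-level CNN + GRU encoder paired with a word-level GRU attention decoder; all class names, layer sizes, and the dot-product attention variant are illustrative assumptions, not the paper's actual configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CharCNNGRUEncoder(nn.Module):
    """Character-level CNN + GRU encoder (sizes are assumptions)."""
    def __init__(self, n_chars, char_dim=32, conv_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(n_chars, char_dim)
        # 1-D convolution over the character sequence extracts local
        # n-gram features that are robust to typos.
        self.conv = nn.Conv1d(char_dim, conv_dim, kernel_size=3, padding=1)
        self.gru = nn.GRU(conv_dim, hidden_dim, batch_first=True)

    def forward(self, chars):                        # chars: (B, T)
        x = self.embed(chars).transpose(1, 2)        # (B, char_dim, T)
        x = F.relu(self.conv(x)).transpose(1, 2)     # (B, T, conv_dim)
        outputs, state = self.gru(x)                 # (B, T, H), (1, B, H)
        return outputs, state

class AttnGRUDecoder(nn.Module):
    """Word-level GRU decoder with dot-product attention (an assumption)."""
    def __init__(self, n_words, word_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(n_words, word_dim)
        self.gru = nn.GRU(word_dim + hidden_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, n_words)

    def forward(self, prev_word, state, enc_outputs):
        # Attend over the encoder's per-character states.
        query = state[-1].unsqueeze(1)                           # (B, 1, H)
        scores = torch.bmm(query, enc_outputs.transpose(1, 2))   # (B, 1, T)
        context = torch.bmm(F.softmax(scores, dim=-1), enc_outputs)
        emb = self.embed(prev_word).unsqueeze(1)                 # (B, 1, E)
        output, state = self.gru(torch.cat([emb, context], dim=-1), state)
        return self.out(output.squeeze(1)), state                # (B, V)

# One decoding step on toy inputs; token id 0 stands in for <sos>.
enc = CharCNNGRUEncoder(n_chars=100)
dec = AttnGRUDecoder(n_words=5000)
enc_out, state = enc(torch.randint(0, 100, (2, 20)))
logits, state = dec(torch.zeros(2, dtype=torch.long), state, enc_out)
```

A full decoder would loop this step over output positions, feeding back the argmax (or beam-search) word each time; teacher forcing against the corrected reference sequence is the usual training regime.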
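For reference, the character error rate (CER) quoted above is conventionally the Levenshtein edit distance between the decoded text and the reference, normalized by the reference length. The sketch below is a generic implementation of that metric, not the paper's evaluation code.

```python
def cer(hypothesis: str, reference: str) -> float:
    """Character error rate: edit distance / reference length."""
    m, n = len(hypothesis), len(reference)
    # prev[j] holds the edit distance between hypothesis[:i-1] and reference[:j].
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        curr = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if hypothesis[i - 1] == reference[j - 1] else 1
            curr[j] = min(prev[j] + 1,         # deletion
                          curr[j - 1] + 1,     # insertion
                          prev[j - 1] + cost)  # substitution
        prev = curr
    return prev[n] / max(n, 1)

print(cer("helo wrld", "hello world"))  # 2 edits / 11 chars ~= 0.18
```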
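The noisy-to-corrected dataset construction can be illustrated in the same spirit. The toy sketch below perturbs clean sentences (e.g., from OpenSubtitles) with character-level substitutions, deletions, and insertions; the neighbor map and edit probabilities are made-up placeholders standing in for the noise distribution the paper estimates from Twitter typos.

```python
import random

# Toy QWERTY neighbor map; a real one would cover the full keyboard.
NEARBY_KEYS = {"a": "qwsz", "s": "awedxz", "e": "wsdr", "o": "iklp"}

def induce_noise(text, p_sub=0.02, p_del=0.01, p_ins=0.01, seed=None):
    """Inject typo-like character noise into clean text (placeholder rates)."""
    rng = random.Random(seed)
    out = []
    for ch in text:
        r = rng.random()
        if r < p_del:
            continue                                   # dropped character
        if r < p_del + p_sub and ch in NEARBY_KEYS:
            out.append(rng.choice(NEARBY_KEYS[ch]))    # fat-finger substitution
            continue
        out.append(ch)
        if rng.random() < p_ins:
            out.append(ch)                             # accidental repeat
    return "".join(out)

print(induce_noise("see you at seven", seed=1))
```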


