Real-time Neural-based Input Method

10/19/2018
by Jiali Yao, et al.

The input method is an essential service on every mobile and desktop device that provides text suggestions. It converts sequential keyboard inputs to characters in the target language, which is indispensable for Japanese and Chinese users. Because the target devices have tight resource constraints and limited network bandwidth, applying neural models to input methods has not been well explored. In this work, we apply an LSTM-based language model to an input method and evaluate its performance on both prediction and conversion tasks with the Japanese BCCWJ corpus. We identify the bottleneck as the slow softmax computation during conversion. To solve this issue, we propose an incremental softmax approximation approach, which computes the softmax over a selected subset of the vocabulary and fixes the stale probabilities when the vocabulary is updated in later steps. We refer to this method as incremental selective softmax. The results show a speedup of two orders of magnitude for the softmax computation when converting Japanese input sequences with a large vocabulary, reaching real-time speed on a commodity CPU. We also explore the potential for model compression and achieve a 92% reduction in model size without losing accuracy.
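
As a rough illustration of the idea of computing a softmax over a selected subset of the vocabulary and then correcting earlier probabilities when that subset grows, here is a minimal Python sketch. It is not the authors' implementation: the class name `IncrementalSubsetSoftmax`, the `score_fn` callback (standing in for the model's output layer), and the choice to keep a running normalizer are all assumptions made for this example.

```python
import math


class IncrementalSubsetSoftmax:
    """Softmax restricted to a growing subset of the vocabulary (illustrative)."""

    def __init__(self, score_fn):
        # score_fn is a hypothetical callback: word id -> unnormalized logit.
        self.score_fn = score_fn
        self.logits = {}        # word id -> logit, for selected words only
        self.normalizer = 0.0   # sum of exp(logit) over the selected words

    def add_words(self, word_ids):
        """Score only the newly selected words and extend the normalizer."""
        for word_id in word_ids:
            if word_id not in self.logits:
                logit = self.score_fn(word_id)
                self.logits[word_id] = logit
                # No max-subtraction here, so this assumes moderate logits.
                self.normalizer += math.exp(logit)

    def prob(self, word_id):
        """Probability of a selected word under the current subset."""
        return math.exp(self.logits[word_id]) / self.normalizer


# Usage: probabilities computed before the subset grew become stale;
# re-reading them after add_words() uses the updated normalizer.
toy_logits = {0: 1.0, 1: 2.0, 2: 0.5}
softmax = IncrementalSubsetSoftmax(lambda w: toy_logits[w])
softmax.add_words([0, 1])
stale_p0 = softmax.prob(0)
softmax.add_words([2])
fresh_p0 = softmax.prob(0)
```

Because only the normalizer changes when words are added, logits that were already computed are reused and only the new words are scored, which is where the savings over a full-vocabulary softmax would come from in this sketch.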

research
11/11/2018

Neural-based Pinyin-to-Character Conversion with Adaptive Vocabulary

Pinyin-to-character (P2C) conversion is the core component of pinyin-bas...
research
06/18/2018

GroupReduce: Block-Wise Low-Rank Approximation for Neural Language Model Shrinking

Model compression is essential for serving large deep neural nets on dev...
research
10/29/2018

Learning to Screen for Fast Softmax Inference on Large Vocabulary Neural Networks

Neural language models have been widely used in various NLP tasks, inclu...
research
02/28/2019

Efficient Contextual Representation Learning Without Softmax Layer

Contextual representation models have achieved great success in improvin...
research
09/14/2016

Efficient softmax approximation for GPUs

We propose an approximate strategy to efficiently train neural network b...
research
01/24/2019

Glass+Skin: An Empirical Evaluation of the Added Value of Finger Identification to Basic Single-Touch Interaction on Touch Screens

The usability of small devices such as smartphones or interactive watche...
research
02/21/2019

Breaking the Softmax Bottleneck via Learnable Monotonic Pointwise Non-linearities

The softmax function on top of a final linear layer is the de facto meth...
