No Padding Please: Efficient Neural Handwriting Recognition

Neural handwriting recognition (NHR) is the recognition of handwritten text with deep learning models, such as multi-dimensional long short-term memory (MDLSTM) recurrent neural networks. Models with MDLSTM layers have achieved state-of-the-art results on handwritten text recognition tasks. While multi-directional MDLSTM layers have an unbeaten ability to capture the complete context in all directions, this strength limits the possibilities for parallelization and therefore comes at a high computational cost. In this work we develop methods to create efficient MDLSTM-based models for NHR, in particular a method aimed at eliminating the computational waste that results from padding. The proposed method, called example-packing, replaces wasteful stacking of padded examples with efficient tiling in a 2-dimensional grid. For word-based NHR this yields a speed-up by a factor of 6.6 over an already efficient baseline of minimal padding for each batch separately. For line-based NHR the savings are more modest, but still significant. In addition to example-packing, we propose: 1) a technique to optimize parallelization for dynamic graph definition frameworks including PyTorch, using convolutions with grouping, and 2) a method for parallelization across GPUs for variable-length example batches. All our techniques are thoroughly tested on our own PyTorch re-implementation of MDLSTM-based NHR models. A thorough evaluation on the IAM dataset shows that our models perform similarly to earlier implementations of state-of-the-art models. Our efficient NHR model and some of the reusable techniques discussed with it offer ways to realize relatively efficient models for the omnipresent scenario of variable-length inputs in deep learning.
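To make the padding argument concrete, the following minimal sketch compares the number of pixels processed when examples are padded to the batch maximum and stacked, against the number processed when the same examples are tiled into rows of a 2-dimensional grid. It is not the packing algorithm from the paper: the greedy first-fit strategy, the one-pixel separator `gap`, and the names `padded_area`, `pack_rows`, and `packed_area` are all assumptions made for illustration.

```python
from typing import List, Tuple

Size = Tuple[int, int]  # (height, width) of one example image, in pixels


def padded_area(sizes: List[Size]) -> int:
    """Pixels processed when every example is padded to the batch maximum."""
    max_h = max(h for h, _ in sizes)
    max_w = max(w for _, w in sizes)
    return len(sizes) * max_h * max_w


def pack_rows(sizes: List[Size], row_width: int, gap: int = 1) -> List[List[Size]]:
    """Greedy first-fit tiling of examples into rows of at most row_width pixels.

    Illustrative heuristic only (an assumption, not the paper's method):
    examples are visited from tallest to shortest, and each one goes into the
    first row that still has room for it plus a separator of `gap` pixels.
    """
    rows: List[List[Size]] = []
    used: List[int] = []  # width already occupied in each row
    for size in sorted(sizes, key=lambda s: s[0], reverse=True):
        _, w = size
        if w > row_width:
            raise ValueError(f"example of width {w} does not fit in {row_width}")
        for i in range(len(rows)):
            if used[i] + gap + w <= row_width:
                rows[i].append(size)
                used[i] += gap + w
                break
        else:
            # No existing row has room: open a new row for this example.
            rows.append([size])
            used.append(w)
    return rows


def packed_area(rows: List[List[Size]], row_width: int, gap: int = 1) -> int:
    """Pixels in the packed grid: rows are stacked vertically, each row is as
    tall as its tallest example, and `gap` pixels separate adjacent rows."""
    heights = [max(h for h, _ in row) for row in rows]
    return row_width * (sum(heights) + gap * (len(rows) - 1))


if __name__ == "__main__":
    # Hypothetical word-image sizes with strongly varying widths.
    sizes = [(32, 200), (30, 40), (28, 60), (32, 90), (20, 30), (25, 50)]
    rows = pack_rows(sizes, row_width=200)
    ratio = padded_area(sizes) / packed_area(rows, row_width=200)
    print(f"{len(rows)} rows, padded/packed pixel ratio: {ratio:.2f}")
```

The ratio printed here only counts pixels; actual speed-ups depend on the model and hardware. The sketch does suggest why word-level inputs, whose widths vary strongly, stand to gain more from packing than line-level inputs of more uniform size.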


