Field typing for improved recognition on heterogeneous handwritten forms

09/23/2019
by Ciprian Tomoiaga, et al.

Offline handwriting recognition has progressed steadily over the past decades. However, existing methods are typically benchmarked on free-form text datasets that are biased towards good-quality images and handwriting, and towards homogeneous content. In this paper, we show that state-of-the-art algorithms employing long short-term memory (LSTM) layers do not readily generalize to real-world structured documents, such as forms, because the content of these documents is highly heterogeneous, often out-of-vocabulary, and inherently ambiguous. To address this, we propose to leverage the content type within an LSTM-based architecture. Furthermore, we introduce a procedure to generate synthetic data to train this architecture without requiring expensive manual annotations. We demonstrate the effectiveness of our approach at transcribing text on a challenging, real-world dataset of European Accident Statements.
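The abstract does not say how the content type enters the LSTM-based architecture. As a rough illustration only, one common way to condition a sequence model on a categorical label is to append a one-hot encoding of the label to every per-timestep feature vector before it reaches the recurrent layers. The sketch below assumes that approach; the field type names, the function names `one_hot` and `condition_on_field_type`, and the feature values are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

# Hypothetical field types for a form (illustrative, not from the paper).
FIELD_TYPES = ("name", "date", "phone", "license_plate", "free_text")


def one_hot(field_type: str, vocabulary: Sequence[str] = FIELD_TYPES) -> List[float]:
    """Return a one-hot vector marking the position of field_type."""
    if field_type not in vocabulary:
        raise ValueError(f"unknown field type: {field_type!r}")
    return [1.0 if t == field_type else 0.0 for t in vocabulary]


def condition_on_field_type(
    frames: Sequence[Sequence[float]], field_type: str
) -> List[List[float]]:
    """Append the field-type one-hot vector to every per-timestep frame.

    The result is what a recurrent layer would consume if the field type
    were injected by simple concatenation (an assumption of this sketch).
    """
    type_vector = one_hot(field_type)
    return [list(frame) + type_vector for frame in frames]


# Three timesteps of 2-dimensional image features (made-up values).
frames = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
conditioned = condition_on_field_type(frames, "date")
print(len(conditioned), len(conditioned[0]))
```

With this kind of conditioning, the same visual features can be decoded differently depending on the field: an ambiguous stroke is more plausibly a digit in a "date" field than in a "name" field. Whether the paper uses concatenation or another mechanism cannot be determined from the abstract alone.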


