DAN: a Segmentation-free Document Attention Network for Handwritten Document Recognition

03/23/2022
by Denis Coquenet, et al.

Unconstrained handwritten document recognition is a challenging computer vision task. It is traditionally handled by a two-step approach combining line segmentation followed by text line recognition. For the first time, we propose an end-to-end segmentation-free architecture for the task of handwritten document recognition: the Document Attention Network. In addition to text recognition, the model is trained to label text parts using begin and end tags in an XML-like fashion. This model is made up of an FCN encoder for feature extraction and a stack of transformer decoder layers for a recurrent token-by-token prediction process. It takes whole text documents as input and sequentially outputs characters, as well as logical layout tokens. Contrary to the existing segmentation-based approaches, the model is trained without using any segmentation label. We achieve competitive results on the READ dataset at page level, as well as double-page level with a CER of 3.53 respectively. We also provide results for the RIMES dataset at page level, reaching 4.54. We provide all source code and pre-trained model weights at https://github.com/FactoDeepLearning/DAN.
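To make the output format described in the abstract more concrete, here is a minimal Python sketch of token-by-token decoding followed by grouping of characters under XML-like begin and end tags. It is an illustration only and is not taken from the DAN repository: the token names (`<sos>`, `<eos>`, `<page>`, `<body>`), the `next_token` callable, and both helper functions are assumptions. A scripted stand-in replaces the real encoder-decoder model.

```python
from typing import Callable, List, Tuple

# Hypothetical special tokens; the actual vocabulary is defined by the model.
START, END = "<sos>", "<eos>"


def greedy_decode(next_token: Callable[[List[str]], str],
                  max_len: int = 1000) -> List[str]:
    """Token-by-token prediction: each step sees every token emitted so far."""
    tokens = [START]
    for _ in range(max_len):
        tok = next_token(tokens)
        if tok == END:
            break
        tokens.append(tok)
    return tokens[1:]  # drop the start token


def split_by_layout(tokens: List[str]) -> List[Tuple[str, str]]:
    """Group characters under the currently open tags, e.g. ('page/body', 'Hi')."""
    stack: List[str] = []   # currently open layout tags
    buf: List[str] = []     # characters collected since the last tag
    out: List[Tuple[str, str]] = []

    def flush() -> None:
        if buf:
            out.append(("/".join(stack), "".join(buf)))
            buf.clear()

    for tok in tokens:
        if tok.startswith("</") and tok.endswith(">"):
            flush()
            name = tok[2:-1]
            if not stack or stack[-1] != name:
                raise ValueError(f"unbalanced end tag: {tok}")
            stack.pop()
        elif tok.startswith("<") and tok.endswith(">") and len(tok) > 2:
            flush()
            stack.append(tok[1:-1])
        else:
            buf.append(tok)  # ordinary character token
    flush()
    return out


# Scripted stand-in for a trained model: returns the next token of a fixed sequence.
scripted = ["<page>", "<body>", "H", "i", "</body>", "</page>", END]
predicted = greedy_decode(lambda toks: scripted[len(toks) - 1])
regions = split_by_layout(predicted)
```

In a real system, `next_token` would wrap a forward pass of the decoder over the encoder features and return the most likely next token; the grouping step stays the same regardless of how the tokens are produced.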


research
01/25/2023

Faster DAN: Multi-target Queries with Document Positional Encoding for End-to-end Handwritten Document Recognition

Recent advances in handwritten text recognition enabled to recognize who...
research
12/07/2020

End-to-end Handwritten Paragraph Text Recognition Using a Vertical Attention Network

Unconstrained handwritten text recognition remains challenging for compu...
research
09/30/2022

Towards End-to-end Handwritten Document Recognition

Handwritten text recognition has been widely studied in the last decades...
research
03/11/2021

Full Page Handwriting Recognition via Image to Sequence Extraction

We present a Neural Network based Handwritten Text Recognition (HTR) mod...
research
09/22/2020

Whole page recognition of historical handwriting

Historical handwritten documents guard an important part of human knowle...
research
03/24/2023

MSdocTr-Lite: A Lite Transformer for Full Page Multi-script Handwriting Recognition

The Transformer has quickly become the dominant architecture for various...
research
02/17/2021

SPAN: a Simple Predict Align Network for Handwritten Paragraph Recognition

Unconstrained handwriting recognition is an essential task in document a...
