Faster DAN: Multi-target Queries with Document Positional Encoding for End-to-end Handwritten Document Recognition

01/25/2023
by Denis Coquenet, et al.

Recent advances in handwritten text recognition have made it possible to recognize whole documents end to end: the Document Attention Network (DAN) recognizes characters one after the other through an attention-based prediction process until it reaches the end of the document. However, this autoregressive process prevents inference from benefiting from parallelization. In this paper, we propose Faster DAN, a two-step strategy to speed up the recognition process at prediction time: the model first predicts the first character of each text line in the document, and then completes all the text lines in parallel through multi-target queries and a specific document positional encoding scheme. Faster DAN reaches competitive results compared to standard DAN, while being at least 4 times faster on whole single-page and double-page images of the RIMES 2009, READ 2016 and MAURDOR datasets. Source code and trained model weights are available at https://github.com/FactoDeepLearning/FasterDAN.
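
The two-step strategy described above can be illustrated with a toy sketch. This is not the Faster DAN implementation: the function names `sequential_steps` and `two_pass_decode` are hypothetical, the list of ground-truth lines stands in for the network's predictions, and the multi-target queries and document positional encoding are not modeled. The sketch only counts decoding iterations, which is a rough proxy for inference time and not a measurement of it.

```python
from typing import List, Tuple


def sequential_steps(lines: List[str]) -> int:
    """Iterations for character-by-character decoding of the whole document.

    Assumption for this sketch: one iteration per character, plus one
    end-of-line prediction per text line.
    """
    return sum(len(line) + 1 for line in lines)


def two_pass_decode(lines: List[str]) -> Tuple[List[str], int]:
    """Toy two-pass decoding; `lines` acts as an oracle replacing the model.

    Returns the decoded lines and the number of decoding iterations.
    """
    decoded: List[str] = []
    steps = 0

    # Pass 1: one iteration per text line, predicting its first character.
    for line in lines:
        decoded.append(line[:1])
        steps += 1

    # Pass 2: each iteration extends every unfinished line by one token,
    # so the cost is driven by the longest line, not by the total length.
    done = [False] * len(lines)
    while not all(done):
        steps += 1
        for i, line in enumerate(lines):
            if done[i]:
                continue
            pos = len(decoded[i])
            if pos < len(line):
                decoded[i] += line[pos]
            else:
                done[i] = True  # the oracle signals end of line

    return decoded, steps
```

In this toy setting, a document with lines `"hello"`, `"to"` and `"world!"` needs 16 iterations sequentially and 9 with the two-pass scheme. The gap grows with the number of lines, since the second pass depends only on the longest line.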


