You Actually Look Twice At it (YALTAi): using an object detection approach instead of region segmentation within the Kraken engine

07/19/2022
by Thibault Clérice, et al.

Layout analysis (the identification of zones and their classification) is, along with line segmentation, the first step in Optical Character Recognition and similar tasks. The ability to distinguish the main body of text from marginal text or running titles makes the difference between extracting the full text of a digitized book and producing noisy output. We show that most segmenters focus on pixel classification and that polygonization of this output has not been used as a target in the latest competitions on historical documents (ICDAR 2017 and onwards), despite being the focus in the early 2010s. We propose to shift, for efficiency, the task from pixel classification-based polygonization to object detection using isothetic rectangles. We compare the output of Kraken and YOLOv5 in terms of segmentation and show that the latter substantially outperforms the former on small datasets (1110 samples and below). We release two datasets for training and evaluation on historical documents, as well as a new package, YALTAi, which injects YOLOv5 into the segmentation pipeline of Kraken 4.1.
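
To illustrate the shift from polygons to isothetic (axis-aligned) rectangles, here is a minimal sketch of how a zone polygon could be reduced to its bounding rectangle and written as a YOLO-style label line (class index, then centre x, centre y, width, and height, all normalized by the page size). The function names, the sample polygon, and the page dimensions are hypothetical and chosen for illustration only; this is not the YALTAi implementation.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]


def polygon_to_isothetic_rect(polygon: List[Point]) -> Rect:
    """Return the smallest axis-aligned rectangle enclosing the polygon,
    as (x_min, y_min, x_max, y_max) in pixel coordinates."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def rect_to_yolo_label(rect: Rect, class_id: int,
                       page_width: int, page_height: int) -> str:
    """Format a rectangle as a YOLO-style label line:
    'class x_center y_center width height', normalized to [0, 1]."""
    x_min, y_min, x_max, y_max = rect
    x_center = (x_min + x_max) / 2 / page_width
    y_center = (y_min + y_max) / 2 / page_height
    width = (x_max - x_min) / page_width
    height = (y_max - y_min) / page_height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Hypothetical zone polygon (e.g. a main text block) on a 1000 x 2000 px page.
zone = [(100, 200), (300, 180), (320, 400), (90, 420)]
rect = polygon_to_isothetic_rect(zone)
label = rect_to_yolo_label(rect, class_id=0, page_width=1000, page_height=2000)
print(label)
```

The rectangle discards the exact outline of the zone, which is the trade-off the abstract describes: a coarser shape in exchange for a simpler detection target.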
