Multiple Document Datasets Pre-training Improves Text Line Detection With Deep Neural Networks

12/28/2020
by Mélodie Boillet, et al.

In this paper, we introduce a fully convolutional network for the document layout analysis task. While state-of-the-art methods rely on models pre-trained on natural scene images, our method, Doc-UFCN, relies on a U-shaped model trained from scratch to detect objects in historical documents. We treat line segmentation, and more generally layout analysis, as a pixel-wise classification task, so our model outputs a pixel-level labeling of the input images. We show that Doc-UFCN outperforms state-of-the-art methods on various datasets, and demonstrate that components pre-trained on natural scene images are not required to reach good results. In addition, we show that pre-training on multiple document datasets can improve performance. We evaluate the models with various metrics to provide a fair and complete comparison between methods.
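As a rough illustration of what framing line detection as pixel-wise classification means, the sketch below turns per-class score maps into a label map by taking the highest-scoring class at each pixel, then scores the result with intersection-over-union (IoU). This is not the Doc-UFCN implementation: the function names `pixel_labels` and `class_iou`, the two-class setup, the toy values, and the choice of IoU as a metric are all assumptions made for the example, since the abstract does not name its metrics. Plain Python lists are used so the snippet has no dependencies.

```python
def pixel_labels(scores):
    """Return an H x W label map from scores[class][row][col].

    Each pixel is assigned the class with the highest score
    (hypothetical helper, not part of Doc-UFCN).
    """
    num_classes = len(scores)
    height, width = len(scores[0]), len(scores[0][0])
    return [
        [max(range(num_classes), key=lambda c: scores[c][y][x])
         for x in range(width)]
        for y in range(height)
    ]


def class_iou(pred, target, cls):
    """Intersection-over-union of one class between two label maps."""
    inter = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += (p == cls) and (t == cls)
            union += (p == cls) or (t == cls)
    # IoU is undefined when the class is absent from both maps.
    return inter / union if union else float("nan")


# Toy 2 x 3 image with two assumed classes: 0 = background, 1 = text line.
scores = [
    [[0.9, 0.2, 0.8], [0.1, 0.3, 0.7]],  # background scores
    [[0.1, 0.8, 0.2], [0.9, 0.7, 0.3]],  # text-line scores
]
target = [[0, 1, 1], [1, 1, 0]]  # made-up ground truth

pred = pixel_labels(scores)
iou = class_iou(pred, target, cls=1)
print(pred)  # [[0, 1, 0], [1, 1, 0]]
print(iou)   # 0.75
```

In the toy case the prediction misses one of the four ground-truth text-line pixels, so the intersection is 3 and the union is 4. In a real pipeline the score maps would come from the network's final layer rather than hand-written values.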


