Unsupervised Deep Learning for Handwritten Page Segmentation

01/19/2021
by Ahmad Droby, et al.

Segmenting handwritten document images into regions with homogeneous patterns is an important pre-processing step for many document image analysis tasks. Hand-labeling data to train a deep learning model for layout analysis requires significant human effort. In this paper, we present an unsupervised deep learning method for page segmentation, which removes the need for annotated images. A Siamese neural network is trained to differentiate between patches using measurable properties such as the number of foreground pixels and the average component height and width. The network is trained on the assumption that spatially nearby patches are similar. The network's learned features are then used for page segmentation, where patches are classified as main text or side text based on the extracted features. We tested the method on a dataset of handwritten document images with complex layouts. Our experiments show that the proposed unsupervised method is as effective as typical supervised methods.
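
The abstract names the patch properties but does not say how they are computed. As a rough illustration only, the sketch below measures them on a binarized patch. It assumes the patch is a grid of 0/1 values with 1 marking ink, and that "components" are 4-connected foreground regions measured by their bounding boxes. The function name `patch_properties` and these choices are assumptions for illustration, not the authors' implementation.

```python
from collections import deque


def patch_properties(patch):
    """Return (foreground_pixels, avg_component_height, avg_component_width).

    `patch` is a list of rows of 0/1 values; 1 marks a foreground (ink) pixel.
    Components are 4-connected regions (an assumption of this sketch).
    """
    rows = len(patch)
    cols = len(patch[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    foreground = 0
    heights, widths = [], []

    for r in range(rows):
        for c in range(cols):
            if patch[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first search over one connected component,
            # tracking its bounding box as pixels are visited.
            seen[r][c] = True
            queue = deque([(r, c)])
            top = bottom = r
            left = right = c
            while queue:
                y, x = queue.popleft()
                foreground += 1
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and patch[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            heights.append(bottom - top + 1)
            widths.append(right - left + 1)

    count = len(heights)
    avg_height = sum(heights) / count if count else 0.0
    avg_width = sum(widths) / count if count else 0.0
    return foreground, avg_height, avg_width


# Toy patch with two components: an L-shaped blob and a vertical stroke.
example_patch = [
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
]
properties = patch_properties(example_patch)
```

Two patches whose property tuples differ noticeably could then be treated as a dissimilar pair when training the Siamese network, while spatially adjacent patches serve as similar pairs. How the comparison is thresholded is not stated in the abstract and is left open here.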
