Unsupervised learning of text line segmentation by differentiating coarse patterns

05/19/2021
by Berat Kurar Barakat, et al.

Although recent advances in text line segmentation have come mainly from supervised deep learning, unsupervised deep learning solutions are beginning to gain popularity. In this paper, we present an unsupervised deep learning method that embeds document image patches into a compact Euclidean space where distances correspond to the similarity of coarse text line patterns. Once this space has been produced, text line segmentation can be implemented easily by applying standard techniques to the embedded feature vectors. To train the model, we extract random pairs of document image patches under the assumption that neighbouring patches contain a similar coarse trend of text lines, whereas if one of them is rotated, they contain different coarse trends of text lines. Doing well on this task requires the model to learn to recognize the text lines and their salient parts. The benefit of our approach is that it requires zero manual labelling effort. We evaluate the method qualitatively and quantitatively on several variants of text line segmentation datasets to demonstrate its effectiveness.
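The pair-extraction idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `sample_pair`, the patch size, the choice of the right-hand neighbour, the 90-degree rotation angle, and the 50/50 label split are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np


def sample_pair(page, patch_size=64, rng=None):
    """Sample two neighbouring patches from a grayscale page image.

    Returns (patch_a, patch_b, label) where label 0 means "similar coarse
    text line trend" and label 1 means "different" (patch_b was rotated).
    All parameter choices here are illustrative assumptions.
    """
    rng = np.random.default_rng() if rng is None else rng
    height, width = page.shape
    # Two side-by-side patches must fit inside the page.
    if height < patch_size or width < 2 * patch_size:
        raise ValueError("page too small for two neighbouring patches")

    # Random top-left corner of the first patch (upper bound is exclusive).
    top = int(rng.integers(0, height - patch_size + 1))
    left = int(rng.integers(0, width - 2 * patch_size + 1))

    patch_a = page[top:top + patch_size, left:left + patch_size]
    # Assumption: the neighbour is the patch immediately to the right.
    patch_b = page[top:top + patch_size,
                   left + patch_size:left + 2 * patch_size]

    # Assumption: half of the pairs are made "different" by a 90-degree turn.
    label = int(rng.integers(0, 2))
    if label == 1:
        patch_b = np.rot90(patch_b)

    return patch_a.copy(), patch_b.copy(), label
```

Pairs produced this way could feed a siamese-style network trained to separate the two labels, with no manual annotation involved; the network architecture and loss are outside the scope of this sketch.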
