Text line extraction using fully convolutional network and energy minimization

01/18/2021
by Berat Kurar Barakat, et al.

Text lines are important components of handwritten document images, and downstream applications can analyze them more easily than whole pages. Despite recent progress in text line detection, text line extraction from handwritten documents remains an unsolved task. This paper proposes using a fully convolutional network for text line detection and energy minimization for text line extraction. Detected text lines are represented by blob lines that strike through the text lines. These blob lines guide an energy function for text line extraction. The detection stage can locate arbitrarily oriented text lines. The extraction stage can then find the pixels of text lines with various heights and interline proximity, independent of their orientations. It can also finely split touching and overlapping text lines without an orientation assumption. We evaluate the proposed method on the VML-AHTE, VML-MOC, and Diva-HisDB datasets. The VML-AHTE dataset contains overlapping, touching, and close text lines with rich diacritics. The VML-MOC dataset is challenging because of its multiply oriented and skewed text lines. The Diva-HisDB dataset exhibits distinct text line heights and touching text lines. The results demonstrate the effectiveness of the method across these challenges while using the same parameters in all experiments.
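
To make the extraction idea concrete, here is a minimal sketch of labeling connected components by minimizing an energy guided by blob lines. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not specify the energy terms or the minimizer, so the data cost (distance from a component centroid to the nearest blob-line pixel), the Potts-style smoothness penalty, and the use of iterated conditional modes (ICM) are all choices made here. The names `data_cost`, `assign_components`, `neighbors`, and `smooth_weight` are hypothetical.

```python
from math import hypot


def data_cost(centroid, blob_line):
    """Assumed data term: distance from a centroid to the nearest blob-line pixel."""
    cx, cy = centroid
    return min(hypot(cx - x, cy - y) for x, y in blob_line)


def assign_components(centroids, blob_lines, neighbors=(), smooth_weight=1.0, max_iters=10):
    """Label each component with a blob-line index using ICM (an assumed minimizer).

    Energy = sum of data costs + smooth_weight * (number of neighbor pairs
    whose labels differ). `neighbors` is a list of (i, j) index pairs.
    """
    n_labels = range(len(blob_lines))

    # Start from the label that minimizes the data term alone.
    labels = [min(n_labels, key=lambda l, c=c: data_cost(c, blob_lines[l]))
              for c in centroids]

    # Build an adjacency list from the neighbor pairs.
    adjacent = {i: [] for i in range(len(centroids))}
    for i, j in neighbors:
        adjacent[i].append(j)
        adjacent[j].append(i)

    for _ in range(max_iters):
        changed = False
        for i, c in enumerate(centroids):
            def local_energy(label):
                disagreements = sum(labels[j] != label for j in adjacent[i])
                return data_cost(c, blob_lines[label]) + smooth_weight * disagreements

            best = min(n_labels, key=local_energy)
            # Switch only on a strict improvement so the loop terminates.
            if local_energy(best) < local_energy(labels[i]):
                labels[i] = best
                changed = True
        if not changed:
            break
    return labels


# Two horizontal blob lines and four component centroids (made-up toy data).
blob_lines = [[(x, 10) for x in range(50)], [(x, 30) for x in range(50)]]
centroids = [(5, 9), (20, 12), (8, 31), (25, 21)]

nearest_only = assign_components(centroids, blob_lines)
with_smoothness = assign_components(centroids, blob_lines,
                                    neighbors=[(1, 3)], smooth_weight=5.0)
print(nearest_only, with_smoothness)
```

In this toy example the last centroid lies slightly closer to the second blob line, so the data term alone assigns it there; linking it to a neighbor on the first line with a large enough smoothness weight pulls it to the first line instead. ICM only finds a local minimum, so a real system would likely use a stronger minimizer.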


