Handwritten Text Segmentation via End-to-End Learning of Convolutional Neural Network

06/12/2019
by   Junho Jo, et al.

We present a new handwritten text segmentation method that trains a convolutional neural network (CNN) in an end-to-end manner. Many conventional methods address this problem by extracting connected components and then classifying them. However, this two-step approach has limitations when handwritten components and machine-printed parts overlap. Unlike conventional methods, we develop an end-to-end deep CNN for this problem, which does not need any preprocessing steps. Since there is no publicly available dataset for this goal and pixel-wise annotations are time-consuming and costly, we also propose a data synthesis algorithm that generates realistic training samples. For training our network, we develop a cross-entropy based loss function that addresses the imbalance problems. Experimental results on synthetic and real images show the effectiveness of the proposed method. Specifically, although the proposed network was trained solely on synthetic images, the removal of handwritten text in real documents improves OCR performance from 71.13 [...] our network and synthesized images.
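The abstract does not give the form of the loss, so the following is only a minimal sketch of one common way a cross-entropy loss can be made robust to imbalance: each class is weighted by its inverse frequency, so the few handwritten pixels count as much as the many background pixels. The function name `balanced_bce`, its parameters, and the weighting scheme are assumptions made for illustration and are not taken from the paper.

```python
import math


def balanced_bce(probs, labels, eps=1e-7):
    """Class-balanced binary cross-entropy over a flat list of pixels.

    probs  -- predicted probability that each pixel is handwritten (0..1)
    labels -- ground-truth labels, 1 = handwritten, 0 = everything else

    Inverse-frequency weighting is an illustrative assumption, not the
    loss defined in the paper.
    """
    if len(probs) != len(labels) or not probs:
        raise ValueError("probs and labels must be non-empty and equal length")
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    # Each class receives half of the total weight, however rare it is.
    w_pos = 0.5 / n_pos if n_pos else 0.0
    w_neg = 0.5 / n_neg if n_neg else 0.0
    loss = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        if y == 1:
            loss -= w_pos * math.log(p)
        else:
            loss -= w_neg * math.log(1.0 - p)
    return loss
```

With one handwritten pixel among ten and every prediction at 0.5, `balanced_bce([0.5] * 10, [1] + [0] * 9)` evaluates to log 2, the same value a perfectly balanced image would give. A network that labels everything as background is therefore penalized far more than it would be under an unweighted mean.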


