Text Extraction and Restoration of Old Handwritten Documents

01/23/2020
by   Mayank Wadhwani, et al.
20

Image restoration is very crucial computer vision task. This paper describes two novel methods for the restoration of old degraded handwritten documents using deep neural network. In addition to that, a small-scale dataset of 26 heritage letters images is introduced. The ground truth data to train the desired network is generated semi automatically involving a pragmatic combination of color transformation, Gaussian mixture model based segmentation and shape correction by using mathematical morphological operators. In the first approach, a deep neural network has been used for text extraction from the document image and later background reconstruction has been done using Gaussian mixture modeling. But Gaussian mixture modelling requires to set parameters manually, to alleviate this we propose a second approach where the background reconstruction and foreground extraction (which which includes extracting text with its original colour) both has been done using deep neural network. Experiments demonstrate that the proposed systems perform well on handwritten document images with severe degradations, even when trained with small dataset. Hence, the proposed methods are ideally suited for digital heritage preservation repositories. It is worth mentioning that, these methods can be extended easily for printed degraded documents.

READ FULL TEXT

page 14

page 15

page 16

research
12/10/2019

3D-GMNet: Learning to Estimate 3D Shape from A Single Image As A Gaussian Mixture

In this paper, we introduce 3D-GMNet, a deep neural network for single-i...
research
10/09/2017

A Bottom Up Procedure for Text Line Segmentation of Latin Script

In this paper we present a bottom up procedure for segmentation of text ...
research
11/22/2017

TexT - Text Extractor Tool for Handwritten Document Transcription and Annotation

This paper presents a framework for semi-automatic transcription of larg...
research
02/12/2016

Image Restoration and Reconstruction using Variable Splitting and Class-adapted Image Priors

This paper proposes using a Gaussian mixture model as a prior, for solvi...
research
11/13/2019

BiNet: Degraded-Manuscript Binarization in Diverse Document Textures and Layouts using Deep Encoder-Decoder Networks

Handwritten document-image binarization is a semantic segmentation proce...
research
05/23/2016

Image Restoration with Locally Selected Class-Adapted Models

State-of-the-art algorithms for imaging inverse problems (namely deblurr...
research
02/05/2022

ROMNet: Renovate the Old Memories

Renovating the memories in old photos is an intriguing research topic in...

Please sign up or login with your details

Forgot password? Click here to reset