Automatic Document Image Binarization using Bayesian Optimization

09/06/2017
by   Ekta Vats, et al.

Document image binarization is often challenging because documents suffer from many forms of degradation. Although several binarization techniques exist in the literature, the binarized image is typically sensitive to the control parameter settings of the technique employed. This paper presents an automatic document image binarization algorithm that segments text from heavily degraded document images. The proposed technique uses a two band-pass filtering approach for background noise removal, and Bayesian optimization to automatically select the hyperparameters that give the best results. The effectiveness of the proposed technique is demonstrated empirically on the Document Image Binarization Competition (DIBCO) and the Handwritten Document Image Binarization Competition (H-DIBCO) datasets.
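To make the idea of automatic hyperparameter selection concrete, the following is a minimal sketch of Bayesian optimization applied to a binarization parameter. It is not the authors' implementation: the abstract does not specify the filter design or the objective, so this sketch assumes a single global threshold as the only hyperparameter, a synthetic set of pixel intensities with known ground truth, F-measure as the objective, and a small hand-written Gaussian process with an expected-improvement acquisition function. All function names and constants here are hypothetical.

```python
import math
import random

def f_measure(pred, truth):
    """F-measure of predicted text pixels (truthy = text) against ground truth."""
    tp = sum(p and t for p, t in zip(pred, truth))
    fp = sum(p and not t for p, t in zip(pred, truth))
    fn = sum((not p) and t for p, t in zip(pred, truth))
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

def rbf(a, b, length=0.15):
    """Squared-exponential kernel (length scale is an assumed value)."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def gp_posterior(xs, ys, x_new, noise=1e-4):
    """Posterior mean and standard deviation of a GP surrogate at x_new."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k_star = [rbf(a, x_new) for a in xs]
    mean_y = sum(ys) / len(ys)
    alpha = solve(K, [y - mean_y for y in ys])
    v = solve(K, k_star)
    mu = mean_y + sum(k * a for k, a in zip(k_star, alpha))
    var = max(1.0 - sum(k * vi for k, vi in zip(k_star, v)), 1e-12)
    return mu, math.sqrt(var)

def expected_improvement(mu, sigma, best):
    """Expected improvement over the best score seen so far (maximization)."""
    z = (mu - best) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
    cdf = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return (mu - best) * cdf + sigma * pdf

def bayes_opt(objective, n_init=3, n_iter=10, seed=0):
    """Maximize objective over thresholds in [0, 1] on a grid of candidates."""
    rng = random.Random(seed)
    grid = [i / 100 for i in range(101)]
    xs = rng.sample(grid, n_init)          # random initial evaluations
    ys = [objective(x) for x in xs]
    for _ in range(n_iter):
        best = max(ys)
        candidates = [x for x in grid if x not in xs]
        x_next = max(
            candidates,
            key=lambda x: expected_improvement(*gp_posterior(xs, ys, x), best),
        )
        xs.append(x_next)
        ys.append(objective(x_next))
    i = max(range(len(ys)), key=ys.__getitem__)
    return xs[i], ys[i], xs, ys

# Synthetic stand-in for a degraded page: dark text pixels, brighter background.
_rng = random.Random(1)
truth = [i < 300 for i in range(1000)]
pixels = [
    min(1.0, max(0.0, _rng.gauss(0.25, 0.08) if is_text else _rng.gauss(0.70, 0.10)))
    for is_text in truth
]

def objective(threshold):
    """Score a global threshold: pixels darker than it are labelled text."""
    return f_measure([p < threshold for p in pixels], truth)

best_t, best_f, xs, ys = bayes_opt(objective)
print(f"best threshold {best_t:.2f}, F-measure {best_f:.3f}")
```

In practice the objective would run the full binarization pipeline on a real image and score it against ground truth, and a maintained optimization library would likely replace the hand-written surrogate. The point of the sketch is only the loop: fit a surrogate to the scores observed so far, pick the next setting by expected improvement, evaluate, and repeat.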


