CompTLL-UNet: Compressed Domain Text-Line Localization in Challenging Handwritten Documents using Deep Feature Learning from JPEG Coefficients

08/11/2023
by Bulla Rajesh, et al.

Automatic localization of text-lines in handwritten documents is still an open and challenging research problem. Writing issues such as uneven spacing between lines, oscillating and touching text, and skew become much more challenging when complex handwritten document images are to be segmented directly in their compressed representation. The conventional way of processing compressed documents is through decompression; in this paper, we instead propose employing deep feature learning directly on the JPEG compressed coefficients, without full decompression, to accomplish text-line localization in the JPEG compressed domain. A modified U-Net architecture, the Compressed Text-Line Localization Network (CompTLL-UNet), is designed for this purpose. The model is trained and tested on JPEG compressed versions of benchmark datasets including ICDAR2017 (cBAD) and ICDAR2019 (cBAD), reporting state-of-the-art performance with reduced storage and computational costs in the JPEG compressed domain.
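To make "JPEG compressed coefficients" concrete, the following is a minimal sketch of the 8x8 block-wise DCT representation that JPEG stores, computed here from pixels with NumPy only. It is an illustration under assumptions, not the authors' pipeline: in a real compressed-domain setting the coefficients would be obtained by entropy-decoding the JPEG stream rather than by recomputing the transform, quantization is omitted, and the function names `dct_matrix` and `blockwise_dct` are hypothetical. The abstract does not specify how the coefficients are arranged before being fed to CompTLL-UNet, so the layout below (coefficients kept at their block positions) is likewise an assumption.

```python
import numpy as np


def dct_matrix(n: int = 8) -> np.ndarray:
    """Return the orthonormal n x n DCT-II basis matrix."""
    k = np.arange(n).reshape(-1, 1)  # frequency index (rows)
    i = np.arange(n).reshape(1, -1)  # sample index (columns)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)       # DC row uses a different scale factor
    return m


def blockwise_dct(image: np.ndarray, block: int = 8) -> np.ndarray:
    """Apply a 2D DCT to each non-overlapping block of a grayscale image.

    Assumption: height and width are multiples of `block`. Quantization,
    which a real JPEG encoder applies next, is deliberately left out.
    """
    h, w = image.shape
    if h % block or w % block:
        raise ValueError("image dimensions must be multiples of the block size")
    d = dct_matrix(block)
    # JPEG level-shifts 8-bit samples by 128 before the transform.
    x = image.astype(np.float64) - 128.0
    # Split into a (rows, cols, block, block) array of blocks.
    blocks = x.reshape(h // block, block, w // block, block).transpose(0, 2, 1, 3)
    coeffs = d @ blocks @ d.T        # 2D DCT of every block via broadcasting
    # Put each coefficient block back at its original position.
    return coeffs.transpose(0, 2, 1, 3).reshape(h, w)
```

The resulting array has the same size as the input image, with each 8x8 tile holding one DC coefficient in its top-left corner and 63 AC coefficients around it. A network operating in the compressed domain would learn from values of this kind instead of from pixels.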


research
07/29/2019

Automatic Text Line Segmentation Directly in JPEG Compressed Document Images

JPEG is one of the popular image compression algorithms that provide eff...
research
01/03/2019

Text line Segmentation in Compressed Representation of Handwritten Document using Tunneling Algorithm

In this research work, we perform text line segmentation directly in com...
research
07/02/2020

Automatic Page Segmentation Without Decompressing the Run-Length Compressed Text Documents

Page segmentation is considered to be the crucial stage for the automati...
research
01/04/2022

HWRCNet: Handwritten Word Recognition in JPEG Compressed Domain using CNN-BiLSTM Network

The handwritten word recognition from images using deep learning is an a...
research
09/13/2022

OCR for TIFF Compressed Document Images Directly in Compressed Domain Using Text segmentation and Hidden Markov Model

In today's technological era, document images play an important and inte...
research
06/02/2023

DWT-CompCNN: Deep Image Classification Network for High Throughput JPEG 2000 Compressed Documents

For any digital application with document images such as retrieval, the ...
research
04/18/2021

Line Segmentation from Unconstrained Handwritten Text Images using Adaptive Approach

Line segmentation from handwritten text images is one of the challenging...
