Image Processing Based Scene-Text Detection and Recognition with Tesseract

04/17/2020
by Ebin Zacharias, et al.

Text recognition is a challenging computer vision task of considerable practical interest. Optical character recognition (OCR) enables a range of automation applications. This project focuses on word detection and recognition in natural images, a problem that is significantly more challenging than reading text in scanned documents. In the use case considered here, the images are captured under constrained conditions, which makes it possible to detect the text area in natural scenes with greater accuracy. The images come from a camera mounted on a truck that captures similar scenes around the clock. The detected text area is then recognized using the Tesseract OCR engine. Although the approach has low computational requirements, the model is limited to specific use cases. This paper discusses a critical false-positive scenario that occurred during testing and describes the strategy used to mitigate it. The project achieved a correct character recognition rate of more than 80%. This paper outlines the stages of development, the major challenges, and some of the interesting findings of the project.
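The abstract reports a correct character recognition rate but does not say how it was computed. The following is a minimal sketch of one common way to measure such a rate, assuming it is defined as one minus the character-level edit distance divided by the number of ground-truth characters. The function names and the sample strings are hypothetical and are not taken from the paper.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def char_recognition_rate(pairs) -> float:
    """Fraction of ground-truth characters recognized correctly.

    `pairs` is an iterable of (ground_truth, ocr_output) strings.
    Assumed definition: 1 - (total edit errors / total ground-truth characters),
    with the errors of each sample capped at the length of its ground truth.
    """
    total_chars = 0
    total_errors = 0
    for truth, predicted in pairs:
        total_chars += len(truth)
        total_errors += min(edit_distance(truth, predicted), len(truth))
    if total_chars == 0:
        raise ValueError("no ground-truth characters to evaluate")
    return 1.0 - total_errors / total_chars


# Hypothetical example: two identifiers, the second read with two wrong characters.
samples = [("ABCD1234", "ABCD1234"), ("WXYZ5678", "WXYZ5G7B")]
rate = char_recognition_rate(samples)
print(f"{rate:.3f}")  # 0.875
```

In practice the `ocr_output` strings would come from running the OCR engine on each detected text area. Other definitions, such as exact position-wise matching, would give different numbers, so the definition used should be stated alongside any reported rate.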


research
04/26/2013

Reading Ancient Coin Legends: Object Recognition vs. OCR

Standard OCR is a well-researched topic of computer vision and can be co...
research
04/17/2020

Object Detection and Recognition of Swap-Bodies using Camera mounted on a Vehicle

Object detection and identification is a challenging area of computer vi...
research
01/26/2016

COCO-Text: Dataset and Benchmark for Text Detection and Recognition in Natural Images

This paper describes the COCO-Text dataset. In recent years large-scale ...
research
09/15/2011

Design of an Optical Character Recognition System for Camera-based Handheld Devices

This paper presents a complete Optical Character Recognition (OCR) syste...
research
09/11/2015

OCR accuracy improvement on document images through a novel pre-processing approach

Digital camera and mobile document image acquisition are new trends aris...
research
08/18/2020

Robust Handwriting Recognition with Limited and Noisy Data

Despite the advent of deep learning in computer vision, the general hand...
research
12/16/2021

Sim2Real Docs: Domain Randomization for Documents in Natural Scenes using Ray-traced Rendering

In the past, computer vision systems for digitized documents could rely ...