Recognition of Handwritten Textual Annotations using Tesseract Open Source OCR Engine for information Just In Time (iJIT)

03/30/2010
by Sandip Rakshit, et al.

The objective of the current work is to develop an Optical Character Recognition (OCR) engine for the information Just In Time (iJIT) system that can recognize handwritten textual annotations in lower-case Roman script. The Tesseract open source OCR engine, released under Apache License 2.0, is used to develop user-specific handwriting recognition models, viz., the language sets, for the said system, where each user is identified by a unique identification tag associated with the digital pen. To generate the language set for any user, Tesseract is trained with labeled handwritten data samples of isolated and free-flow texts of Roman script, collected exclusively from that user. The designed system is tested on five different language sets with free-flow handwritten annotations as test samples. The system could successfully segment and subsequently recognize 87.92% of the handwritten characters in the test samples of five different users.
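
The per-user lookup described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the pen tags, the language-set names, and the helper `build_tesseract_command` are hypothetical, and the only thing taken from Tesseract itself is the command-line form `tesseract <image> <outputbase> -l <lang>`. The function only builds the command; it does not run it.

```python
from typing import Dict, List

# Hypothetical mapping from digital-pen identification tags to the name of
# the language set trained on that user's handwriting (names are assumed).
PEN_TO_LANGUAGE_SET: Dict[str, str] = {
    "pen-001": "user1",
    "pen-002": "user2",
}


def build_tesseract_command(pen_tag: str, image_path: str, output_base: str) -> List[str]:
    """Return the Tesseract command line for the user who owns `pen_tag`."""
    try:
        language_set = PEN_TO_LANGUAGE_SET[pen_tag]
    except KeyError:
        # No model has been trained for this pen, so refuse rather than
        # silently fall back to another user's language set.
        raise ValueError(f"no language set trained for pen tag {pen_tag!r}")
    # `-l` selects the trained language set to recognize with.
    return ["tesseract", image_path, output_base, "-l", language_set]
```

The returned list could be passed to `subprocess.run` on a machine where Tesseract and the corresponding trained language sets are installed.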

research
03/30/2010

Development of a multi-user handwriting recognition system using Tesseract open source OCR engine

The objective of the paper is to recognize handwritten samples of lower ...
research
03/30/2010

Recognition of Handwritten Roman Script Using Tesseract Open source OCR Engine

In the present work, we have used Tesseract 2.01 open source Optical Cha...
research
03/30/2010

Development of a Multi-User Recognition Engine for Handwritten Bangla Basic Characters and Digits

The objective of the paper is to recognize handwritten samples of basic ...
research
03/30/2010

Recognition of handwritten Roman Numerals using Tesseract open source OCR engine

The objective of the paper is to recognize handwritten samples of Roman ...
research
06/30/2010

Classification Of Gradient Change Features Using MLP For Handwritten Character Recognition

A novel, generic scheme for off-line handwritten English alphabets chara...
research
03/13/2021

uTHCD: A New Benchmarking for Tamil Handwritten OCR

Handwritten character recognition is a challenging research in the field...
research
06/19/2023

Handwritten Text Recognition from Crowdsourced Annotations

In this paper, we explore different ways of training a model for handwri...
