TexT - Text Extractor Tool for Handwritten Document Transcription and Annotation

11/22/2017
by Anders Hast, et al.

This paper presents a framework for semi-automatic transcription of large-scale historical handwritten documents and proposes a simple, user-friendly text extractor tool, TexT, for transcription. The proposed approach provides quick and easy transcription of text using a computer-assisted interactive technique. The algorithm finds multiple occurrences of the marked text on the fly using a word spotting system. TexT is also capable of on-the-fly annotation of handwritten text with automatic generation of ground truth labels, and of dynamically adjusting and correcting user-generated bounding box annotations so that the word is perfectly encapsulated. The user can view the document and the found words in their original form or with background noise removed for easier visualization of transcription results. The effectiveness of TexT is demonstrated on an archival manuscript collection from a well-known, publicly available dataset.
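To make the idea of correcting a user-drawn bounding box concrete, the following is a minimal sketch, not the method used in TexT. It assumes the page has already been binarized into a grid where 1 marks ink and 0 marks background, and it only shrinks a loose box to the ink it contains. The function name `tighten_box` and the half-open `(top, left, bottom, right)` box convention are illustrative assumptions.

```python
def tighten_box(ink, box):
    """Shrink a loose box to the ink pixels it contains.

    ink: list of rows, each a list of 0/1 values (1 = ink). Assumed input format.
    box: (top, left, bottom, right), half-open on bottom/right. Assumed convention.
    """
    top, left, bottom, right = box
    # Rows inside the box that contain at least one ink pixel.
    rows = [r for r in range(top, bottom) if any(ink[r][left:right])]
    # Columns inside the box that contain at least one ink pixel.
    cols = [c for c in range(left, right)
            if any(ink[r][c] for r in range(top, bottom))]
    if not rows or not cols:
        return box  # no ink inside the box: leave it unchanged
    return (rows[0], cols[0], rows[-1] + 1, cols[-1] + 1)


# Tiny hypothetical page: a small blob of ink in the middle.
page = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
print(tighten_box(page, (0, 0, 4, 6)))
```

A real tool would also need to grow a box that cuts through a word and to cope with noise and touching neighbours; this sketch covers only the shrinking case.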


