Marginalia and machine learning: Handwritten text recognition for Marginalia Collections

03/10/2023
by Adam Axelsson, et al.

The pressing need to digitize historical document collections has led to strong interest in computerised image processing methods for automatic handwritten text recognition (HTR). Handwritten text is highly variable because of differences in writing styles, languages and scripts. Because sufficient amounts of annotated multi-writer text are not available, training an accurate and robust HTR system calls for data-efficient approaches. This paper presents a case study from an ongoing project, “Marginalia and Machine Learning”, which focuses on automatic detection and recognition of handwritten marginalia texts, i.e., text written in margins or handwritten notes. A Faster R-CNN network is used to detect marginalia, and AttentionHTR is used for word recognition. The data comes from early printed book collections held at the Uppsala University Library that contain handwritten marginalia texts. Source code and pretrained models are available at https://github.com/ektavats/Project-Marginalia.
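
The two-stage design described in the abstract (detect marginalia regions, then recognize the text in them) can be illustrated with a minimal sketch. This is not the project's code: the names `Detection`, `crop`, `transcribe_marginalia`, the `min_score` threshold, and the toy stand-in functions are all hypothetical, and the detector and recognizer are passed in as plain callables rather than as real Faster R-CNN or AttentionHTR models. Any intermediate step, such as splitting a detected region into words, is omitted.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# Assumption: a page is a grayscale image stored as rows of pixel values.
Image = List[List[int]]
# Assumption: boxes are (x_min, y_min, x_max, y_max) with exclusive maxima.
Box = Tuple[int, int, int, int]


@dataclass
class Detection:
    """One candidate marginalia region (hypothetical container)."""
    box: Box
    score: float  # detector confidence in [0, 1]


def crop(image: Image, box: Box) -> Image:
    """Cut the region described by `box` out of `image`."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


def transcribe_marginalia(
    image: Image,
    detect: Callable[[Image], Sequence[Detection]],
    recognize: Callable[[Image], str],
    min_score: float = 0.5,  # assumed threshold, not taken from the paper
) -> List[Tuple[Box, str]]:
    """Run detection, drop low-confidence boxes, recognize each crop."""
    results: List[Tuple[Box, str]] = []
    for det in detect(image):
        if det.score < min_score:
            continue
        results.append((det.box, recognize(crop(image, det.box))))
    return results


# Toy stand-ins so the sketch runs without any trained model.
def fake_detect(image: Image) -> List[Detection]:
    return [Detection((0, 0, 2, 2), 0.9), Detection((1, 1, 3, 3), 0.2)]


def fake_recognize(region: Image) -> str:
    # Returns the crop size instead of a real transcription.
    return f"{len(region)}x{len(region[0])}"


page: Image = [[0] * 4 for _ in range(4)]
print(transcribe_marginalia(page, fake_detect, fake_recognize))
```

Keeping the two stages behind simple callables means a real detector or recognizer could be swapped in without changing the pipeline function, which is the only design point this sketch is meant to show.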


research
10/02/2022

DARE: A large-scale handwritten date recognition system

Handwritten text recognition for historical documents is an important ta...
research
09/18/2019

Unsupervised Writer Adaptation for Synthetic-to-Real Handwritten Word Recognition

Handwritten Text Recognition (HTR) is still a challenging problem becaus...
research
06/07/2021

Occode: an end-to-end machine learning pipeline for transcription of historical population censuses

Machine learning approaches achieve high accuracy for text recognition a...
research
03/28/2023

Scalable handwritten text recognition system for lexicographic sources of under-resourced languages and alphabets

The paper discusses an approach to decipher large collections of handwri...
research
09/21/2022

A Few Shot Multi-Representation Approach for N-gram Spotting in Historical Manuscripts

Despite recent advances in automatic text recognition, the performance r...
research
05/04/2023

How to Choose Pretrained Handwriting Recognition Models for Single Writer Fine-Tuning

Recent advancements in Deep Learning-based Handwritten Text Recognition ...
research
11/22/2021

Many Heads but One Brain: an Overview of Fusion Brain Challenge on AI Journey 2021

Supporting the current trend in the AI community, we propose the AI Jour...
