Few Shots Is All You Need: A Progressive Few Shot Learning Approach for Low Resource Handwriting Recognition

07/21/2021
by Mohamed Ali Souibgui, et al.

Handwritten text recognition in low-resource scenarios, such as manuscripts with rare alphabets, is a challenging problem. The main difficulty comes from the scarcity of annotated data and the limited linguistic information available (e.g. dictionaries and language models). We therefore propose a few-shot learning-based handwriting recognition approach that significantly reduces the human annotation effort, requiring only a few images of each alphabet symbol. First, our model detects all symbols of a given alphabet in a text line image; a decoding step then maps the symbol similarity scores to the final sequence of transcribed symbols. The model is first pretrained on synthetic line images generated from any alphabet, even one that differs from the target domain. A second training step is then applied to reduce the gap between the source and target data. Since this retraining would require annotating thousands of handwritten symbols together with their bounding boxes, we avoid that human effort through an unsupervised progressive learning approach that automatically assigns pseudo-labels to the non-annotated data. Evaluation on different manuscript datasets shows that our model can achieve competitive results with a significant reduction in human effort.
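The progressive pseudo-labeling idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the threshold schedule, and the dictionary layout of the similarity scores are all assumptions made for illustration. The sketch also keeps the scores fixed between rounds, whereas a real pipeline would presumably retrain the model on the newly labeled symbols and re-score the remaining candidates.

```python
from typing import Dict, Sequence, Tuple

# Hypothetical layout: candidate crop id -> {alphabet symbol: similarity score}.
Scores = Dict[str, Dict[str, float]]


def select_confident(scores: Scores, threshold: float) -> Dict[str, str]:
    """Return {candidate id: symbol} for candidates whose best score >= threshold."""
    picked: Dict[str, str] = {}
    for cand_id, per_symbol in scores.items():
        # Assumes every candidate has at least one symbol score.
        best_symbol = max(per_symbol, key=per_symbol.get)
        if per_symbol[best_symbol] >= threshold:
            picked[cand_id] = best_symbol
    return picked


def progressive_pseudo_label(
    scores: Scores,
    thresholds: Sequence[float] = (0.9, 0.7, 0.5),  # assumed schedule, not from the paper
) -> Tuple[Dict[str, str], Scores]:
    """Assign pseudo-labels in rounds, starting with the most confident candidates."""
    labeled: Dict[str, str] = {}
    remaining: Scores = dict(scores)
    for threshold in thresholds:
        newly_labeled = select_confident(remaining, threshold)
        labeled.update(newly_labeled)
        for cand_id in newly_labeled:
            del remaining[cand_id]
        # Placeholder: a full system would retrain on `labeled` here and
        # recompute the similarity scores of `remaining` before the next round.
    return labeled, remaining


toy_scores: Scores = {
    "crop_a": {"x": 0.95, "y": 0.10},
    "crop_b": {"x": 0.30, "y": 0.75},
    "crop_c": {"x": 0.40, "y": 0.45},
}
labeled, remaining = progressive_pseudo_label(toy_scores)
```

With the toy scores above, `crop_a` is labeled in the first round, `crop_b` in the second, and `crop_c` stays unlabeled because its best score never reaches the lowest threshold.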

research
09/26/2020

A Few-shot Learning Approach for Historical Ciphered Manuscript Recognition

Encoded (or ciphered) manuscripts are a special type of historical docum...
research
05/11/2021

One-shot Compositional Data Generation for Low Resource Handwritten Text Recognition

Low resource Handwritten Text Recognition (HTR) is a hard problem due to...
research
12/19/2017

Cross-language Framework for Word Recognition and Spotting of Indic Scripts

Handwritten word recognition and spotting of low-resource scripts are di...
research
11/29/2015

On-line Recognition of Handwritten Mathematical Symbols

Finding the name of an unknown symbol is often hard, but writing the sym...
research
04/03/2023

OTS: A One-shot Learning Approach for Text Spotting in Historical Manuscripts

Historical manuscript processing poses challenges like limited annotated...
research
09/08/2021

OSSR-PID: One-Shot Symbol Recognition in P&ID Sheets using Path Sampling and GCN

Piping and Instrumentation Diagrams (P&ID) are ubiquitous in several m...
research
07/10/2019

Fully Convolutional Networks for Handwriting Recognition

Handwritten text recognition is challenging because of the virtually inf...
