A syllable based model for handwriting recognition

08/22/2018
by Wassim Swaileh, et al.

In this paper, we introduce a new approach to modeling text for handwriting recognition based on syllables. We propose a supervised syllabification method for French and English to build a vocabulary of syllables. Statistical n-gram language models of syllables are trained on French and English Wikipedia corpora. The handwriting recognition system, based on context-independent optical HMM character models, performs two-pass decoding that integrates the proposed syllabic models. We evaluate on the French RIMES dataset and the English IAM dataset, analyzing performance at various coverage levels of the syllable models. We also compare the syllable models with lexicon-based models and character n-gram models. The proposed approach achieves promising performance thanks to its capacity to cover a large share of out-of-vocabulary words with a limited number of syllables combined with statistical n-grams of reasonable order.
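To illustrate why a small syllable vocabulary can cover words that a word lexicon misses, here is a minimal sketch. It is not the authors' system: the toy corpus is hand-syllabified and hypothetical, the model is a plain bigram with add-alpha smoothing (the abstract does not specify the n-gram order or smoothing), and the names `train_bigram`, `log_prob`, and `covered` are made up for this example.

```python
import math
from collections import Counter

# Hypothetical, hand-syllabified toy corpus (an assumption, not data from the paper).
CORPUS = [
    ["hand", "writ", "ing"],
    ["read", "ing"],
    ["writ", "er"],
    ["read", "er"],
]

BOW = "<w>"   # word-start marker
EOW = "</w>"  # word-end marker


def train_bigram(words):
    """Count context syllables and syllable bigrams over syllabified words."""
    contexts, bigrams = Counter(), Counter()
    for syllables in words:
        seq = [BOW] + syllables + [EOW]
        contexts.update(seq[:-1])
        bigrams.update(zip(seq[:-1], seq[1:]))
    return contexts, bigrams


def log_prob(syllables, contexts, bigrams, n_outcomes, alpha=1.0):
    """Add-alpha smoothed bigram log-probability of one syllable sequence."""
    seq = [BOW] + syllables + [EOW]
    total = 0.0
    for prev, cur in zip(seq[:-1], seq[1:]):
        num = bigrams[(prev, cur)] + alpha
        den = contexts[prev] + alpha * n_outcomes
        total += math.log(num / den)
    return total


def covered(syllables, syllable_vocab):
    """True if every syllable of the word is in the syllable vocabulary."""
    return all(s in syllable_vocab for s in syllables)


contexts, bigrams = train_bigram(CORPUS)
syllable_vocab = {s for word in CORPUS for s in word}
word_lexicon = {"".join(word) for word in CORPUS}
n_outcomes = len(syllable_vocab) + 1  # every syllable plus the end marker

# "handwriter" is absent from the word lexicon but built from known syllables.
oov_word = ["hand", "writ", "er"]
print("".join(oov_word) in word_lexicon)   # False: out of vocabulary as a word
print(covered(oov_word, syllable_vocab))   # True: covered at the syllable level
print(log_prob(oov_word, contexts, bigrams, n_outcomes))
```

In this toy setting five syllables are enough to score a word the four-word lexicon has never seen, which is the coverage effect the abstract describes. A real system would replace the hand-written syllabification and smoothing with the trained components.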

