Handwriting recognition using Cohort of LSTM and lexicon verification with extremely large lexicon

12/22/2016
by Bruno Stuner, et al.

State-of-the-art methods for handwriting recognition are based on Long Short-Term Memory (LSTM) recurrent neural networks (RNNs), which now provide very impressive character recognition performance. Character recognition is generally coupled with a lexicon-driven decoding process that integrates dictionaries. Unfortunately, even for the best systems these dictionaries are limited to hundreds of thousands of words, which prevents good language coverage and therefore limits overall recognition performance. In this article, we propose an alternative to lexicon-driven decoding based on a lexicon verification process, coupled with an original cascade architecture. The cascade is made of a large number of complementary networks extracted from a single training run (called a cohort), which makes the learning process very light. The proposed method achieves new state-of-the-art word recognition performance on the Rimes and IAM databases. When dealing with a gigantic lexicon of 3 million words, the method also demonstrates interesting performance with a fast decision stage.
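
The abstract does not detail the verification rule, so the following is only a minimal sketch of one way a cascade with lexicon verification could be organised. It assumes that each network of the cohort outputs a lexicon-free transcription and that the first transcription found in the lexicon is accepted. The function `verify_with_cascade`, the `Recognizer` type, and the toy recognizers are hypothetical names introduced for illustration; they are not taken from the paper.

```python
from typing import Callable, FrozenSet, Optional, Sequence, Tuple

# Hypothetical type: a recognizer maps a word image to a lexicon-free
# transcription (for instance the best-path output of an LSTM network).
Recognizer = Callable[[object], str]


def verify_with_cascade(
    image: object,
    recognizers: Sequence[Recognizer],
    lexicon: FrozenSet[str],
) -> Tuple[Optional[str], Optional[int]]:
    """Return the first hypothesis found in the lexicon and its stage index.

    Assumed rule: a hypothesis is accepted as soon as it is a lexicon word.
    If no stage produces a lexicon word, the sample is rejected.
    """
    for stage, recognize in enumerate(recognizers):
        hypothesis = recognize(image).lower()
        # Set membership is a hash lookup, independent of lexicon size on average.
        if hypothesis in lexicon:
            return hypothesis, stage
    return None, None


# Toy stand-ins for the cohort: each "network" returns a fixed string.
lexicon = frozenset({"bonjour", "monsieur", "madame"})
cohort = [
    lambda image: "bonjonr",  # not a lexicon word, passed to the next stage
    lambda image: "bonjour",  # lexicon word, accepted at this stage
    lambda image: "bonjeur",  # never reached in this example
]

accepted = verify_with_cascade("word.png", cohort, lexicon)
rejected = verify_with_cascade("word.png", cohort[:1], lexicon)
print(accepted)
print(rejected)
```

Under these assumptions the decision stage costs one hash lookup per network, which is one plausible reason a verification step could stay fast with a lexicon of millions of words, whereas lexicon-driven decoding has to search the dictionary during decoding.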

research
04/10/2018

French Word Recognition through a Quick Survey on Recurrent Neural Networks Using Long-Short Term Memory RNN-LSTM

Optical character recognition (OCR) is a fundamental problem in computer...
research
09/06/2017

Scene Text Recognition with Sliding Convolutional Character Models

Scene text recognition has attracted great interests from the computer v...
research
07/24/2017

LV-ROVER: Lexicon Verified Recognizer Output Voting Error Reduction

Offline handwritten text line recognition is a hard task that requires b...
research
01/31/2021

Fine-tuning Handwriting Recognition systems with Temporal Dropout

This paper introduces a novel method to fine-tune handwriting recognitio...
research
04/15/2023

TransDocs: Optical Character Recognition with word to word translation

While OCR has been used in various applications, its output is not alway...
research
06/30/2016

Recurrent neural network models for disease name recognition using domain invariant features

Hand-crafted features based on linguistic and domain-knowledge play cruc...
