Writer-Aware CNN for Parsimonious HMM-Based Offline Handwritten Chinese Text Recognition

12/24/2018
by Zi-Rui Wang, et al.

Recently, the hybrid convolutional neural network hidden Markov model (CNN-HMM) has been introduced for offline handwritten Chinese text recognition (HCTR) and has achieved state-of-the-art performance. In a CNN-HMM system, a handwritten text line is modeled by a series of cascading HMMs, each representing one character, and the posterior distributions of the HMM states are calculated by a CNN. However, modeling each character of the large Chinese vocabulary with a uniform and fixed number of hidden states incurs high memory and computational costs, and the resulting tens of thousands of HMM state classes are easily confused with one another. Another key issue of CNN-HMM for HCTR is the diversified writing style, which leads to model strain and a significant performance decline for specific writers. To address these issues, we propose a writer-aware CNN based on parsimonious HMM (WCNN-PHMM). Validated on the ICDAR 2013 competition of CASIA-HWDB database, the more compact WCNN-PHMM of a 7360-class vocabulary can achieve a relative character error rate (CER) reduction of 16.6% over the conventional CNN-HMM without considering language modeling. Moreover, the state-tying results of PHMM explicitly show the information sharing among similar characters and the confusion reduction of tied state classes. Finally, we visualize the learned writer codes and demonstrate their strong relationship with the writing styles of different writers. To the best of our knowledge, WCNN-PHMM yields the best results on the ICDAR 2013 competition set, demonstrating its power when enlarging the size of the character vocabulary.
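To make two of the quantities in the abstract concrete, the following is a minimal sketch, not the authors' code. It shows how a relative CER reduction is computed against a baseline, and how the number of CNN output classes grows when every character is given the same number of HMM states. The function names, the example CER values, and the choice of 5 states per character are assumptions made purely for illustration; only the 7360-class vocabulary size comes from the abstract.

```python
def relative_cer_reduction(baseline_cer: float, new_cer: float) -> float:
    """Return the reduction in CER as a fraction of the baseline CER."""
    if baseline_cer <= 0:
        raise ValueError("baseline_cer must be positive")
    return (baseline_cer - new_cer) / baseline_cer


def total_state_classes(vocab_size: int, states_per_char: int) -> int:
    """Return the number of HMM state classes (CNN outputs) when every
    character is modeled with the same fixed number of states."""
    return vocab_size * states_per_char


# Hypothetical CER values in percent; these are NOT results from the paper.
baseline = 10.0
improved = 8.0
reduction = relative_cer_reduction(baseline, improved)

# 7360 is the vocabulary size stated in the abstract;
# 5 states per character is an assumed value for illustration.
untied = total_state_classes(7360, 5)

print(f"relative CER reduction: {reduction:.1%}")
print(f"state classes without tying: {untied}")
```

With a uniform state count the number of output classes scales linearly with the vocabulary, which is the cost that tying states across similar characters is meant to reduce.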

