Maximum Entropy Regularization and Chinese Text Recognition

by   Changxu Cheng, et al.

Chinese text recognition is more challenging than Latin text due to the large amount of fine-grained Chinese characters and the great imbalance over classes, which causes a serious overfitting problem. We propose to apply Maximum Entropy Regularization to regularize the training process, which is to simply add a negative entropy term to the canonical cross-entropy loss without any additional parameters and modification of a model. We theoretically give the convergence probability distribution and analyze how the regularization influence the learning process. Experiments on Chinese character recognition, Chinese text line recognition and fine-grained image classification achieve consistent improvement, proving that the regularization is beneficial to generalization and robustness of a recognition model.


Maximum-Entropy Fine-Grained Classification

Fine-Grained Visual Classification (FGVC) is an important computer visio...

Handwritten Chinese Character Recognition by Convolutional Neural Network and Similarity Ranking

Convolution Neural Networks (CNN) have recently achieved state-of-the ar...

Fine-grained Classification of Solder Joints with α-skew Jensen-Shannon Divergence

Solder joint inspection (SJI) is a critical process in the production of...

CLUENER2020: Fine-grained Name Entity Recognition for Chinese

In this paper, we introduce the NER dataset from CLUE organization (CLUE...

CLUENER2020: Fine-grained Named Entity Recognition Dataset and Benchmark for Chinese

In this paper, we introduce the NER dataset from CLUE organization (CLUE...

Traditional Chinese Synthetic Datasets Verified with Labeled Data for Scene Text Recognition

Scene text recognition (STR) has been widely studied in academia and ind...

Preserving Fine-Grain Feature Information in Classification via Entropic Regularization

Labeling a classification dataset implies to define classes and associat...