Maximum Entropy Regularization and Chinese Text Recognition

07/09/2020
by   Changxu Cheng, et al.
0

Chinese text recognition is more challenging than Latin text due to the large amount of fine-grained Chinese characters and the great imbalance over classes, which causes a serious overfitting problem. We propose to apply Maximum Entropy Regularization to regularize the training process, which is to simply add a negative entropy term to the canonical cross-entropy loss without any additional parameters and modification of a model. We theoretically give the convergence probability distribution and analyze how the regularization influence the learning process. Experiments on Chinese character recognition, Chinese text line recognition and fine-grained image classification achieve consistent improvement, proving that the regularization is beneficial to generalization and robustness of a recognition model.

READ FULL TEXT
09/16/2018

Maximum-Entropy Fine-Grained Classification

Fine-Grained Visual Classification (FGVC) is an important computer visio...
08/30/2019

Handwritten Chinese Character Recognition by Convolutional Neural Network and Similarity Ranking

Convolution Neural Networks (CNN) have recently achieved state-of-the ar...
09/20/2022

Fine-grained Classification of Solder Joints with α-skew Jensen-Shannon Divergence

Solder joint inspection (SJI) is a critical process in the production of...
01/13/2020

CLUENER2020: Fine-grained Name Entity Recognition for Chinese

In this paper, we introduce the NER dataset from CLUE organization (CLUE...
01/13/2020

CLUENER2020: Fine-grained Named Entity Recognition Dataset and Benchmark for Chinese

In this paper, we introduce the NER dataset from CLUE organization (CLUE...
11/26/2021

Traditional Chinese Synthetic Datasets Verified with Labeled Data for Scene Text Recognition

Scene text recognition (STR) has been widely studied in academia and ind...
08/07/2022

Preserving Fine-Grain Feature Information in Classification via Entropic Regularization

Labeling a classification dataset implies to define classes and associat...