PP-OCRv2: Bag of Tricks for Ultra Lightweight OCR System

09/07/2021
by Yuning Du, et al.

Optical Character Recognition (OCR) systems are widely used in a variety of application scenarios, yet designing an OCR system remains a challenging task. In previous work, we proposed a practical ultra lightweight OCR system (PP-OCR) that balances accuracy against efficiency. To improve the accuracy of PP-OCR while keeping its high efficiency, in this paper we propose a more robust OCR system, PP-OCRv2. We introduce a bag of tricks to train a better text detector and a better text recognizer: Collaborative Mutual Learning (CML), CopyPaste, Lightweight CPU Network (LCNet), Unified-Deep Mutual Learning (U-DML), and Enhanced CTCLoss. Experiments on real data show that the precision of PP-OCRv2 is 7% higher than PP-OCR under the same inference cost. It is also comparable to the server models of PP-OCR, which use ResNet-series backbones. All of the above models are open-sourced, and the code is available in the GitHub repository PaddleOCR, which is powered by PaddlePaddle.


