STRIDE: Scene Text Recognition In-Device

by Rachit S Munjal, et al.

Optical Character Recognition (OCR) systems are widely used to extract semantic information from images. Giving users more control over their privacy calls for an on-device solution, but current state-of-the-art models are too heavy and complex to deploy on-device. We develop an efficient, lightweight scene text recognition (STR) system that has only 0.88M parameters and performs real-time text recognition. Attention modules tend to boost the accuracy of STR networks but are generally slow and not optimized for on-device inference. We therefore propose adding convolutional attention modules to text recognition networks; these supply channel and spatial attention information to the LSTM module at minimal computational cost and raise our word accuracy on the ICDAR 13 dataset by almost 2%. We also introduce a novel orientation classifier module to support simultaneous recognition of horizontal and vertical text. The proposed model outperforms leading commercial and open-source OCR engines on the on-device metrics of inference time and memory footprint while achieving comparable accuracy. Deployed on-device, the system runs at 2.44 ms per word on an Exynos 990 chipset device and achieves 88.4% accuracy on the ICDAR 13 dataset.
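To make the idea of channel and spatial attention concrete, the following is a minimal pure-Python sketch, not the authors' implementation. The function names (`channel_attention`, `spatial_attention`, `conv_attention`) are hypothetical, and the learned layers that a real convolutional attention module would use (a small shared MLP for the channel gate, a convolution for the spatial gate) are replaced here by fixed unit weights, purely as an assumption for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(fmap):
    """Scale each channel by one gate computed from its pooled statistics.

    fmap is a list of C channels, each an H x W list of lists.
    Assumption: the learned MLP is replaced by a plain sum of the
    average-pooled and max-pooled values.
    """
    gated = []
    for channel in fmap:
        values = [v for row in channel for v in row]
        avg_pool = sum(values) / len(values)
        max_pool = max(values)
        gate = sigmoid(avg_pool + max_pool)
        gated.append([[v * gate for v in row] for row in channel])
    return gated


def spatial_attention(fmap):
    """Scale each spatial position by one gate computed across channels.

    Assumption: the learned convolution over the pooled maps is replaced
    by a plain sum of the channel-wise average and maximum.
    """
    channels, height, width = len(fmap), len(fmap[0]), len(fmap[0][0])
    out = [[[0.0] * width for _ in range(height)] for _ in range(channels)]
    for i in range(height):
        for j in range(width):
            column = [fmap[c][i][j] for c in range(channels)]
            gate = sigmoid(sum(column) / channels + max(column))
            for c in range(channels):
                out[c][i][j] = fmap[c][i][j] * gate
    return out


def conv_attention(fmap):
    """Apply channel attention first, then spatial attention."""
    return spatial_attention(channel_attention(fmap))


# Toy feature map: 2 channels of size 2 x 2.
features = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 0.0], [0.0, 0.0]],
]
refined = conv_attention(features)
```

Both gates only rescale existing activations, so the refined feature map keeps the shape of its input; in a recognition network of the kind described above, such a refined map would be what the sequence model (the LSTM module) consumes.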





TeLCoS: OnDevice Text Localization with Clustering of Script

Recent research in the field of text localization in a resource constrai...

FONTNET: On-Device Font Understanding and Prediction Pipeline

Fonts are one of the most basic and core design concepts. Numerous use c...

Double Supervised Network with Attention Mechanism for Scene Text Recognition

In this paper, we propose Double Supervised Network with Attention Mecha...

What is wrong with scene text recognition model comparisons? dataset and model analysis

Many new proposals for scene text recognition (STR) models have been int...

Context-Free TextSpotter for Real-Time and Mobile End-to-End Text Detection and Recognition

In the deployment of scene-text spotting systems on mobile platforms, li...

On-Device Spatial Attention based Sequence Learning Approach for Scene Text Script Identification

Automatic identification of script is an essential component of a multil...

DeviceTTS: A Small-Footprint, Fast, Stable Network for On-Device Text-to-Speech

With the number of smart devices increasing, the demand for on-device te...