Efficient Backbone Search for Scene Text Recognition

03/14/2020
by   Hui Zhang, et al.

Scene text recognition (STR) is very challenging due to the diversity of text instances and the complexity of scenes. The community has paid increasing attention to boosting performance by improving either the image pre-processing module, for example through rectification and deblurring, or the sequence translator. However, another critical module, the feature sequence extractor, has not been extensively explored. In this work, inspired by the success of neural architecture search (NAS), which can identify better architectures than human-designed ones, we propose automated STR (AutoSTR) to search for data-dependent backbones that boost text recognition performance. First, we design a domain-specific search space for STR, which contains both choices of operations and constraints on the downsampling path. Then, we propose a two-step search algorithm, which decouples the operations from the downsampling path, for an efficient search in the given space. Experiments demonstrate that, by searching for data-dependent backbones, AutoSTR can outperform state-of-the-art approaches on standard benchmarks with far fewer FLOPs and model parameters.
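The decoupling described in the abstract can be illustrated with a toy sketch. It is not the AutoSTR implementation: the stride choices, the operation names, the `score` callback, and the brute-force enumeration used in both steps are all assumptions made for illustration, and the abstract does not say which search method each step actually uses. The sketch only shows the idea of first fixing a downsampling path that satisfies a constraint on the overall downsampling ratio, then choosing per-layer operations with that path held fixed.

```python
from itertools import product

# Hypothetical per-layer stride choices as (height, width); not taken from the paper.
STRIDES = [(1, 1), (2, 1), (2, 2)]
# Hypothetical per-layer candidate operations; not taken from the paper.
OPS = ["conv3x3", "conv5x5", "identity"]


def downsampling_paths(num_layers, target_h, target_w):
    """Enumerate stride sequences whose cumulative ratio equals the target."""
    paths = []
    for path in product(STRIDES, repeat=num_layers):
        h = w = 1
        for stride_h, stride_w in path:
            h *= stride_h
            w *= stride_w
        # Constraint on the downsampling path: total ratio must match exactly.
        if (h, w) == (target_h, target_w):
            paths.append(path)
    return paths


def two_step_search(num_layers, target_h, target_w, score):
    """Step 1 picks a path under default operations; step 2 picks operations."""
    default_ops = (OPS[0],) * num_layers
    candidates = downsampling_paths(num_layers, target_h, target_w)
    best_path = max(candidates, key=lambda path: score(path, default_ops))
    # The path is now fixed, so only the operations vary in the second step.
    best_ops = max(product(OPS, repeat=num_layers),
                   key=lambda ops: score(best_path, ops))
    return best_path, best_ops
```

In this sketch, `score(path, ops)` stands in for whatever validation metric a real search would compute. Searching the two parts separately evaluates the number of valid paths plus the number of operation assignments, rather than their product, which is the efficiency argument behind decoupling.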

research
09/27/2022

Searching a High-Performance Feature Extractor for Text Recognition Network

Feature extractor plays a critical role in text recognition (TR), but cu...
research
03/29/2021

Rethinking Neural Operations for Diverse Tasks

An important goal of neural architecture search (NAS) is to automate-awa...
research
03/13/2022

Training Protocol Matters: Towards Accurate Scene Text Recognition via Training Protocol Searching

The development of scene text recognition (STR) in the era of deep learn...
research
07/07/2021

GLiT: Neural Architecture Search for Global and Local Image Transformer

We introduce the first Neural Architecture Search (NAS) method to find a...
research
04/15/2022

Efficient Architecture Search for Diverse Tasks

While neural architecture search (NAS) has enabled automated machine lea...
research
10/30/2020

Auto-Panoptic: Cooperative Multi-Component Architecture Search for Panoptic Segmentation

Panoptic segmentation is posed as a new popular test-bed for the state-o...
research
07/23/2020

AttentionNAS: Spatiotemporal Attention Cell Search for Video Classification

Convolutional operations have two limitations: (1) do not explicitly mod...
