SCAN: Sliding Convolutional Attention Network for Scene Text Recognition

06/02/2018
by   Yi-Chao Wu, et al.
0

Scene text recognition has drawn great attentions in the community of computer vision and artificial intelligence due to its challenges and wide applications. State-of-the-art recurrent neural networks (RNN) based models map an input sequence to a variable length output sequence, but are usually applied in a black box manner and lack of transparency for further improvement, and the maintaining of the entire past hidden states prevents parallel computation in a sequence. In this paper, we investigate the intrinsic characteristics of text recognition, and inspired by human cognition mechanisms in reading texts, we propose a scene text recognition method with sliding convolutional attention network (SCAN). Similar to the eye movement during reading, the process of SCAN can be viewed as an alternation between saccades and visual fixations. Compared to the previous recurrent models, computations over all elements of SCAN can be fully parallelized during training. Experimental results on several challenging benchmarks, including the IIIT5k, SVT and ICDAR 2003/2013 datasets, demonstrate the superiority of SCAN over state-of-the-art methods in terms of both the model interpretability and performance.

READ FULL TEXT

page 1

page 2

page 3

page 6

page 9

page 10

research
09/06/2017

Scene Text Recognition with Sliding Convolutional Character Models

Scene text recognition has attracted great interests from the computer v...
research
09/13/2017

Reading Scene Text with Attention Convolutional Sequence Modeling

Reading text in the wild is a challenging task in the field of computer ...
research
04/02/2019

A Simple and Robust Convolutional-Attention Network for Irregular Text Recognition

Reading irregular text of arbitrary shape in natural scene images is sti...
research
12/01/2019

Not All Attention Is Needed: Gated Attention Network for Sequence Data

Although deep neural networks generally have fixed network structures, t...
research
07/21/2015

An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition

Image-based sequence recognition has been a long-standing research topic...
research
12/21/2019

Decoupled Attention Network for Text Recognition

Text recognition has attracted considerable research interests because o...
research
11/22/2016

Smart Library: Identifying Books in a Library using Richly Supervised Deep Scene Text Reading

Physical library collections are valuable and long standing resources fo...

Please sign up or login with your details

Forgot password? Click here to reset