Script Identification in Natural Scene Image and Video Frame using Attention based Convolutional-LSTM Network

01/01/2018
by Ankan Kumar Bhunia, et al.

Script identification facilitates many important applications in document and video analysis. This paper focuses on the problem of script identification in scene text images and video scripts. Because of low image quality, complex backgrounds, and the similar character layouts shared by some scripts (for example, Greek and Latin), recognizing text in such scenarios is difficult. Most recent approaches either apply a patch-based CNN and sum the obtained features, or use only a CNN-LSTM network to get the identification result. Some use a discriminative CNN to jointly optimize mid-level representations and deep features. In this paper, we propose a novel method that extracts local and global features using a CNN-LSTM framework and weights them dynamically for script identification. First, we convert the images into patches and feed them into a CNN-LSTM framework. Attention-based patch weights are calculated by applying a softmax layer after the LSTM. We then multiply these weights patch-wise with the corresponding CNN features to yield local features. Global features are extracted from the last cell state of the LSTM. We employ a fusion technique that dynamically weights the local and global features for each individual patch. Experiments were conducted on two public script identification datasets, SIW-13 and CVSI2015. Our learning procedure achieves superior performance compared with previous approaches.
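The weighting and fusion steps described above can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not give the exact form of the fusion, so the sigmoid gate in `fuse` is an assumption, and the function names, toy feature vectors, scores, and gate logits are all hypothetical. In the real model the per-patch features would come from the CNN, and the attention scores and global feature from the LSTM.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weight_patches(cnn_features, attention_scores):
    """Scale each patch's CNN feature vector by its attention weight.

    attention_scores stand in for the per-patch LSTM outputs that are
    passed through a softmax layer (one score per patch).
    """
    weights = softmax(attention_scores)
    local = [[w * x for x in feat] for w, feat in zip(weights, cnn_features)]
    return weights, local


def fuse(local_feat, global_feat, gate_logit):
    """Blend local and global features for one patch.

    ASSUMPTION: a scalar sigmoid gate per patch; the abstract only says
    the fusion weights the two feature types dynamically.
    """
    g = 1.0 / (1.0 + math.exp(-gate_logit))
    return [g * l + (1.0 - g) * gl for l, gl in zip(local_feat, global_feat)]


# Toy inputs (hypothetical): three patches with 2-dimensional CNN features.
cnn_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attention_scores = [0.0, 0.0, math.log(2.0)]  # third patch scores highest
global_feat = [0.5, 0.5]                      # stands in for the last LSTM cell state
gate_logits = [0.0, 0.0, 0.0]                 # a learned gate would produce these

weights, local = weight_patches(cnn_features, attention_scores)
fused = [fuse(l, global_feat, z) for l, z in zip(local, gate_logits)]

print(weights)
print(fused)
```

With these toy scores the attention weights come out to 0.25, 0.25, and 0.5, so the third patch contributes most to the local features. A gate logit of 0 gives equal weight to the local and global parts; a trained model would presumably vary this per patch.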


