
Script Identification in Natural Scene Image and Video Frame using Attention based Convolutional-LSTM Network

01/01/2018
by   Ankan Kumar Bhunia, et al.

Script identification facilitates many important applications in document and video analysis. This paper focuses on the problem of script identification in scene text images and video scripts. Because of low image quality, complex backgrounds, and the similar character layouts shared by some scripts such as Greek and Latin, text recognition in such scenarios is difficult. Most recent approaches apply either a patch-based CNN with summation of the obtained features or a CNN-LSTM network alone to obtain the identification result. Some use a discriminative CNN to jointly optimize mid-level representations and deep features. In this paper, we propose a novel method that extracts local and global features using a CNN-LSTM framework and weights them dynamically for script identification. First, we convert the images into patches and feed them into the CNN-LSTM framework. Attention-based patch weights are calculated by applying a softmax layer after the LSTM. We then multiply these weights patch-wise with the corresponding CNN features to yield local features. Global features are extracted from the last cell state of the LSTM. We employ a fusion technique that dynamically weights the local and global features for each individual patch. Experiments were conducted on two public script identification datasets, SIW-13 and CVSI2015. Our learning procedure achieves superior performance compared with previous approaches.
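The patch-weighting and fusion steps described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the feature vectors stand in for CNN and LSTM outputs, the function name `fuse_patch_features` is hypothetical, and the sigmoid gate used to mix local and global features is an assumption, since the abstract does not specify how the dynamic weights are computed.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def fuse_patch_features(cnn_feats, attn_scores, global_feat, gate_logits):
    """Illustrative attention weighting and local/global fusion.

    cnn_feats   -- one feature vector per patch (stand-in for CNN outputs)
    attn_scores -- one raw score per patch (stand-in for LSTM outputs
                   that are passed through softmax)
    global_feat -- one vector (stand-in for the last LSTM cell state)
    gate_logits -- one raw score per patch; ASSUMPTION: a sigmoid gate
                   decides the local/global mix, which the abstract
                   does not specify
    """
    # Attention-based patch weights: softmax across patches.
    weights = softmax(attn_scores)

    # Local features: each patch's CNN features scaled by its weight.
    local = [[w * x for x in feat] for w, feat in zip(weights, cnn_feats)]

    # Per-patch fusion of local and global features (assumed gating form).
    fused = []
    for loc, logit in zip(local, gate_logits):
        g = sigmoid(logit)
        fused.append([g * l + (1.0 - g) * gl
                      for l, gl in zip(loc, global_feat)])
    return weights, local, fused


if __name__ == "__main__":
    weights, local, fused = fuse_patch_features(
        cnn_feats=[[1.0, 2.0], [3.0, 4.0]],
        attn_scores=[0.0, 0.0],
        global_feat=[1.0, 1.0],
        gate_logits=[0.0, 0.0],
    )
    print(weights, local, fused)
```

In a trained network the attention scores and gate logits would be learned outputs rather than hand-set values; the sketch only shows how the quantities combine once they are available.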

