Guided Attention for Large Scale Scene Text Verification

04/23/2018
by Dafang He, et al.

Many tasks involve determining whether a particular text string appears in an image. In this work, we propose a new framework that learns this task end to end. The framework takes an image and a text string as input and outputs the probability that the text string is present in the image. This is the first end-to-end framework that learns such relationships between text and images in the scene text area. The framework does not require explicit scene text detection or recognition, so it needs no bounding box annotations. It is also the first work in the scene text area that tackles such a weakly labeled problem. Based on this framework, we developed a model called Guided Attention. Our model achieves much better results than several state-of-the-art solutions based on scene text reading on a challenging Street View Business Matching task. The task is to find the correct business names for storefront images, and the dataset we collected for it is substantially larger and more challenging than existing scene text datasets. This new real-world task provides a new perspective for studying scene text related problems. We also demonstrate the uniqueness of our task via a comparison between our problem and a typical Visual Question Answering problem.
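
The input/output contract described above (image and text string in, presence probability out) can be illustrated with a minimal sketch. It is not the Guided Attention model itself, whose architecture the abstract does not specify. It assumes the image is already encoded as a list of region feature vectors and the text string as a single query vector; the names `verify_text`, `region_features`, `text_embedding`, `weights`, and `bias` are hypothetical. The text query attends over image regions, and a logistic output scores the attended summary, so training would need only an image-level yes/no label rather than bounding boxes.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def verify_text(region_features, text_embedding, weights, bias):
    """Toy verifier (assumed design, not the paper's model).

    The text embedding acts as a query that weights image regions;
    the weighted region summary is scored by a logistic unit.
    Returns (probability, attention_weights).
    """
    # Attention: regions that align with the text query get more weight.
    attn = softmax([dot(r, text_embedding) for r in region_features])
    dim = len(text_embedding)
    # Context vector: attention-weighted sum of region features.
    context = [
        sum(a * r[i] for a, r in zip(attn, region_features))
        for i in range(dim)
    ]
    # Logistic output: probability that the text is present in the image.
    logit = dot(weights, context) + bias
    prob = 1.0 / (1.0 + math.exp(-logit))
    return prob, attn


# Hypothetical inputs: three image regions and one text query, 2-D each.
regions = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
query = [2.0, 0.0]
prob, attn = verify_text(regions, query, weights=[1.0, -1.0], bias=0.0)
print(f"p(text present) = {prob:.3f}, attention = {attn}")
```

In this toy setup the first region aligns best with the query, so it receives the largest attention weight. In a trained system the encoders and output weights would be learned jointly from image-level labels.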


research
02/21/2023

A3S: Adversarial learning of semantic representations for Scene-Text Spotting

Scene-text spotting is a task that predicts a text area on natural scene...
research
05/01/2021

Generative Art Using Neural Visual Grammars and Dual Encoders

Whilst there are perhaps only a few scientific methods, there seem to be...
research
12/17/2015

Large Scale Business Discovery from Street Level Imagery

Search with local intent is becoming increasingly useful due to the popu...
research
05/17/2022

Text Detection Recognition in the Wild for Robot Localization

Signage is everywhere and a robot should be able to take advantage of si...
research
03/28/2022

Towards End-to-End Unified Scene Text Detection and Layout Analysis

Scene text detection and document layout analysis have long been treated...
research
07/06/2020

Text Recognition – Real World Data and Where to Find Them

We present a method for exploiting weakly annotated images to improve te...
research
11/23/2022

Contrastive Multi-View Textual-Visual Encoding: Towards One Hundred Thousand-Scale One-Shot Logo Identification

In this paper, we study the problem of identifying logos of business bra...
