Integration of Text-maps in Convolutional Neural Networks for Region Detection among Different Textual Categories

05/26/2019
by   Roberto Arroyo, et al.
0

In this work, we propose a new technique that combines appearance and text in a Convolutional Neural Network (CNN), with the aim of detecting regions of different textual categories. We define a novel visual representation of the semantic meaning of text that allows a seamless integration in a standard CNN architecture. This representation, referred to as text-map, is integrated with the actual image to provide a much richer input to the network. Text-maps are colored with different intensities depending on the relevance of the words recognized over the image. Concretely, these words are previously extracted using Optical Character Recognition (OCR) and they are colored according to the probability of belonging to a textual category of interest. In this sense, this solution is especially relevant in the context of item coding for supermarket products, where different types of textual categories must be identified, such as ingredients or nutritional facts. We evaluated our solution in the proprietary item coding dataset of Nielsen Brandbank, which contains more than 10,000 images for train and 2,000 images for test. The reported results demonstrate that our approach focused on visual and textual data outperforms state-of-the-art algorithms only based on appearance, such as standard Faster R-CNN. These enhancements are reflected in precision and recall, which are improved in 42 and 33 points respectively.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
06/01/2015

Predicting Deep Zero-Shot Convolutional Neural Networks using Textual Descriptions

One of the main challenges in Zero-Shot Learning of visual categories is...
research
03/31/2016

Large Scale Deep Convolutional Neural Network Features Search with Lucene

In this work, we propose an approach to index Deep Convolutional Neural ...
research
04/07/2016

A Novel Scene Text Detection Algorithm Based On Convolutional Neural Network

Candidate text region extraction plays a critical role in convolutional ...
research
11/21/2019

Image Aesthetics Assessment using Multi Channel Convolutional Neural Networks

Image Aesthetics Assessment is one of the emerging domains in research. ...
research
05/08/2017

Scene Text Eraser

The character information in natural scene images contains various perso...
research
06/23/2016

Picture It In Your Mind: Generating High Level Visual Representations From Textual Descriptions

In this paper we tackle the problem of image search when the query is a ...
research
12/31/2020

A CNN Approach to Simultaneously Count Plants and Detect Plantation-Rows from UAV Imagery

In this paper, we propose a novel deep learning method based on a Convol...

Please sign up or login with your details

Forgot password? Click here to reset