Classifying a specific image region using convolutional nets with an ROI mask as input

12/01/2018
by   Sagi Eppel, et al.

Convolutional neural nets (CNNs) are the leading computer vision method for classifying images. In some cases, it is desirable to classify only a specific region of the image that corresponds to a certain object. Assuming that the region of the object in the image is known in advance and is given as a binary region of interest (ROI) mask, the goal is to classify the object in this region using a convolutional neural net. This is achieved using a standard image classification net with an added side branch that converts the ROI mask into an attention map, which is then combined with the image classification net. This allows the net to focus its attention on the object region while still extracting contextual cues from the background. The approach was evaluated on the COCO object dataset and the OpenSurfaces materials dataset. In both cases, it gave better results than methods that completely ignore the background region. In addition, combining the attention map at the first layer of the net gave better results than combining it at higher layers or at multiple layers. The advantages of this method are most apparent in the classification of small regions, which demands a great deal of contextual information from the background.
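
The side-branch idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract does not specify how the attention map is merged with the net, so the element-wise multiplication, the bias term in the side branch, the 3x3 kernels, and all function and variable names here are assumptions made for illustration.

```python
import numpy as np

def conv2d_same(x, kernels):
    """Naive stride-1 'same' convolution (cross-correlation), odd kernel size.
    x: (H, W, C_in); kernels: (k, k, C_in, C_out) -> (H, W, C_out)."""
    k = kernels.shape[0]
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (pad, pad), (0, 0)))
    h, w = x.shape[:2]
    out = np.zeros((h, w, kernels.shape[3]))
    for i in range(k):
        for j in range(k):
            # Shifted window (H, W, C_in) times kernel slice (C_in, C_out).
            out += xp[i:i + h, j:j + w, :] @ kernels[i, j]
    return out

def first_layer_with_roi_attention(image, roi_mask, w_img, w_roi, b_roi):
    """First layer of a classifier with an ROI side branch (hypothetical).
    The merge by multiplication is an assumption, not taken from the source."""
    # Main branch: ordinary first-layer features computed from the image.
    features = conv2d_same(image, w_img)
    # Side branch: turn the binary ROI mask into a per-channel attention map.
    # The bias keeps the attention nonzero away from the ROI, so background
    # features still contribute context instead of being zeroed out.
    mask = roi_mask[..., None].astype(float)
    attention = conv2d_same(mask, w_roi) + b_roi
    # Merge the two branches, then apply ReLU.
    return np.maximum(features * attention, 0.0)

rng = np.random.default_rng(0)
image = rng.random((8, 8, 3))                # toy RGB image
roi_mask = np.zeros((8, 8), dtype=np.uint8)  # binary ROI mask
roi_mask[2:5, 3:6] = 1                       # object region
w_img = rng.normal(size=(3, 3, 3, 4))        # image-branch kernels
w_roi = rng.normal(size=(3, 3, 1, 4))        # mask-branch kernels
b_roi = np.ones(4)                           # side-branch bias
out = first_layer_with_roi_attention(image, roi_mask, w_img, w_roi, b_roi)
print(out.shape)  # (8, 8, 4)
```

In this sketch, pixels far from the ROI receive an attention value equal to the bias, so their features pass through unchanged, while features in and around the ROI are rescaled by the mask-dependent term. In a trained net the kernels and bias would be learned together with the rest of the classifier, and the output would feed the remaining layers of the standard classification net.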


