Fisher Kernel for Deep Neural Activations

12/04/2014
by Donggeun Yoo, et al.

Compared to image representations based on low-level local descriptors, deep neural activations of Convolutional Neural Networks (CNNs) are richer in mid-level representation but poorer in geometric invariance properties. In this paper, we present a straightforward framework for better image representation by combining the two approaches. To take advantage of both representations, we propose an efficient method to extract a fair number of multi-scale dense local activations from a pre-trained CNN. We then aggregate the activations with the Fisher kernel framework, which has been modified with a simple scale-wise normalization that is essential to make it suitable for CNN activations. Replacing the direct use of a single activation vector with our representation yields significant performance improvements: +17.76 (Acc.) on MIT Indoor 67 and +7.18 (mAP) on PASCAL VOC 2007. The results suggest that our proposal can be used as a primary image representation for better performance in visual recognition tasks.
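The aggregation step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not spell out the normalization, so the sketch assumes that each scale's Fisher vector is L2-normalized separately and that the per-scale vectors are then averaged. It also assumes a diagonal-covariance GMM whose parameters are already fitted, and it uses only the gradient with respect to the GMM means. All function and parameter names (`fisher_vector_means`, `multiscale_fisher`, `acts_per_scale`) are hypothetical.

```python
import numpy as np


def fisher_vector_means(x, weights, means, variances):
    """Fisher vector of local descriptors x, using mean gradients only.

    x: (N, D) local activations from one scale.
    weights: (K,) mixture weights; means, variances: (K, D) diagonal GMM.
    Returns a vector of length K * D.
    """
    n = x.shape[0]
    diff = x[:, None, :] - means[None, :, :]  # (N, K, D)
    # Log of w_k * N(x | mu_k, diag(var_k)) for every descriptor and component.
    log_p = (
        np.log(weights)[None, :]
        - 0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)[None, :]
        - 0.5 * np.sum(diff ** 2 / variances[None, :, :], axis=2)
    )
    log_p -= log_p.max(axis=1, keepdims=True)  # numerical stability
    gamma = np.exp(log_p)
    gamma /= gamma.sum(axis=1, keepdims=True)  # (N, K) posteriors
    # Posterior-weighted, whitened deviations from each component mean.
    whitened = diff / np.sqrt(variances)[None, :, :]
    grad = np.einsum("nk,nkd->kd", gamma, whitened)
    grad /= n * np.sqrt(weights)[:, None]
    return grad.ravel()


def l2_normalize(v, eps=1e-12):
    return v / (np.linalg.norm(v) + eps)


def multiscale_fisher(acts_per_scale, weights, means, variances):
    """Assumed scale-wise normalization: normalize per scale, then average."""
    per_scale = [
        l2_normalize(fisher_vector_means(a, weights, means, variances))
        for a in acts_per_scale
    ]
    return l2_normalize(np.mean(per_scale, axis=0))
```

Normalizing each scale before pooling keeps finer scales, which contribute many more local activations, from dominating the final representation. Whether this matches the paper's exact formulation should be checked against the full text.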


