Bilinear CNNs for Fine-grained Visual Recognition

04/29/2015
by   Tsung-Yu Lin, et al.
0

We present a simple and effective architecture for fine-grained visual recognition called Bilinear Convolutional Neural Networks (B-CNNs). These networks represent an image as a pooled outer product of features derived from two CNNs and capture localized feature interactions in a translationally invariant manner. B-CNNs belong to the class of orderless texture representations but unlike prior work they can be trained in an end-to-end manner. Our most accurate model obtains 84.1 accuracy on the Caltech-UCSD birds [67], NABirds [64], FGVC aircraft [42], and Stanford cars [33] dataset respectively and runs at 30 frames-per-second on a NVIDIA Titan X GPU. We then present a systematic analysis of these networks and show that (1) the bilinear features are highly redundant and can be reduced by an order of magnitude in size without significant loss in accuracy, (2) are also effective for other image classification tasks such as texture and scene recognition, and (3) can be trained from scratch on the ImageNet dataset offering consistent improvements over the baseline architecture. Finally, we present visualizations of these models on various datasets using top activations of neural units and gradient-based inversion techniques. The source code for the complete system is available at http://vis-www.cs.umass.edu/bcnn.

READ FULL TEXT

page 8

page 11

page 12

page 13

page 14

research
09/18/2017

Where to Focus: Deep Attention-based Spatially Recurrent Bilinear Networks for Fine-Grained Visual Recognition

Fine-grained visual recognition typically depends on modeling subtle dif...
research
04/07/2019

Compare More Nuanced:Pairwise Alignment Bilinear Network For Few-shot Fine-grained Learning

The recognition ability of human beings is developed in a progressive wa...
research
11/16/2016

Low-rank Bilinear Pooling for Fine-Grained Classification

Pooling second-order local feature statistics to form a high-dimensional...
research
11/09/2019

Learning Deep Bilinear Transformation for Fine-grained Image Representation

Bilinear feature transformation has shown the state-of-the-art performan...
research
07/21/2017

Improved Bilinear Pooling with CNNs

Bilinear pooling of Convolutional Neural Network (CNN) features [22, 23]...
research
03/16/2023

ELFIS: Expert Learning for Fine-grained Image Recognition Using Subsets

Fine-Grained Visual Recognition (FGVR) tackles the problem of distinguis...
research
06/08/2021

White Paper Assistance: A Step Forward Beyond the Shortcut Learning

The promising performances of CNNs often overshadow the need to examine ...

Please sign up or login with your details

Forgot password? Click here to reset