ELoPE: Fine-Grained Visual Classification with Efficient Localization, Pooling and Embedding

11/17/2019
by   Harald Hanselmann, et al.

Fine-grained visual classification (FGVC) deals with classification problems that exhibit small inter-class variance, such as distinguishing between different bird species or car models. State-of-the-art approaches typically tackle this problem by integrating an elaborate attention mechanism or (part-)localization method into a standard convolutional neural network (CNN). This work likewise aims to enhance the performance of a backbone CNN such as ResNet, here by adding three efficient and lightweight components specifically designed for FGVC: global k-max pooling, a discriminative embedding layer trained by optimizing class means, and an efficient bounding box estimator that needs only class labels for training. The resulting model achieves new state-of-the-art recognition accuracies on the Stanford Cars and FGVC-Aircraft datasets.
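The abstract names global k-max pooling without defining it. As a rough illustration only, the sketch below assumes the common reading of the term: for each feature map, average the k largest spatial activations, so that k = 1 reduces to global max pooling and k = H*W to global average pooling. The function name `global_k_max_pool` and the nested-list input layout are hypothetical choices made for this example and are not taken from the paper.

```python
import heapq
from typing import List

FeatureMap = List[List[float]]  # one channel: H rows of W activations


def global_k_max_pool(feature_maps: List[FeatureMap], k: int) -> List[float]:
    """Average the k largest activations of each channel.

    Assumed definition (not confirmed by the abstract): k = 1 gives global
    max pooling, k = H * W gives global average pooling.
    """
    pooled = []
    for channel in feature_maps:
        # Flatten the H x W grid into a single list of activations.
        activations = [value for row in channel for value in row]
        if not 1 <= k <= len(activations):
            raise ValueError("k must be between 1 and H * W")
        top_k = heapq.nlargest(k, activations)
        pooled.append(sum(top_k) / k)
    return pooled


# Two toy 2x2 feature maps standing in for the last conv layer's output.
toy_maps = [
    [[1.0, 5.0], [3.0, 2.0]],
    [[0.0, -1.0], [4.0, 4.0]],
]
pooled_k2 = global_k_max_pool(toy_maps, k=2)
```

In a real model the same operation would be applied to the backbone's final feature tensor before the embedding or classification layer; the choice of k is a hyperparameter that this sketch does not attempt to set.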


research
05/11/2020

Fine-Grained Visual Classification with Efficient End-to-end Localization

The term fine-grained visual classification (FGVC) refers to classificat...
research
05/22/2017

Training with Confusion for Fine-Grained Visual Classification

Research in Fine-Grained Visual Classification has focused on tackling t...
research
10/15/2018

Vehicle classification using ResNets, localisation and spatially-weighted pooling

We investigate whether ResNet architectures can outperform more traditio...
research
06/10/2021

Multi-resolution Outlier Pooling for Sorghum Classification

Automated high throughput plant phenotyping involves leveraging sensors,...
research
10/28/2019

Fine-Grained Visual Recognition with Batch Confusion Norm

We introduce a regularization concept based on the proposed Batch Confus...
research
07/19/2018

Attend and Rectify: a Gated Attention Mechanism for Fine-Grained Recovery

We propose a novel attention mechanism to enhance Convolutional Neural N...
research
08/09/2016

Mean Box Pooling: A Rich Image Representation and Output Embedding for the Visual Madlibs Task

We present Mean Box Pooling, a novel visual representation that pools ov...
