Differentiable Patch Selection for Image Recognition

by Jean-Baptiste Cordonnier, et al.

Neural networks require large amounts of memory and compute to process high-resolution images, even when only a small part of the image is informative for the task at hand. We propose a method based on a differentiable Top-K operator that selects the most relevant parts of the input so that high-resolution images can be processed efficiently. Our method can be interfaced with any downstream neural network, aggregates information from different patches in a flexible way, and allows the whole model to be trained end-to-end using backpropagation. We show results for traffic sign recognition, inter-patch relationship reasoning, and fine-grained recognition without using object/part bounding box annotations during training.
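
To make the idea of a relaxed Top-K selection concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' operator: the abstract does not specify how Top-K is made differentiable, so the sketch assumes one common relaxation, which averages hard Top-K indicator matrices over Gaussian-perturbed copies of the patch scores. The function names (`hard_top_k`, `soft_top_k`, `select_patches`) and the parameters `sigma` and `num_samples` are hypothetical and only the forward pass is shown.

```python
import random

def hard_top_k(scores, k):
    """Return a k x n list of one-hot rows, one per selected patch.

    Rows are ordered by patch index so that spatial order is preserved.
    """
    n = len(scores)
    best = sorted(range(n), key=lambda i: scores[i], reverse=True)[:k]
    return [[1.0 if j == i else 0.0 for j in range(n)] for i in sorted(best)]

def soft_top_k(scores, k, sigma=0.05, num_samples=500, seed=0):
    """Assumed relaxation: average hard indicators over noisy scores."""
    rng = random.Random(seed)
    n = len(scores)
    acc = [[0.0] * n for _ in range(k)]
    for _ in range(num_samples):
        noisy = [s + sigma * rng.gauss(0.0, 1.0) for s in scores]
        indicators = hard_top_k(noisy, k)
        for r in range(k):
            for c in range(n):
                acc[r][c] += indicators[r][c] / num_samples
    return acc

def select_patches(patches, weights):
    """Each output patch is a weighted combination of the input patches."""
    dim = len(patches[0])
    return [
        [sum(w * p[d] for w, p in zip(row, patches)) for d in range(dim)]
        for row in weights
    ]

# Toy example: four patches with two features each, one score per patch.
scores = [0.1, 0.9, 0.3, 0.8]
patches = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]

hard = hard_top_k(scores, 2)
soft = soft_top_k(scores, 2)
selected = select_patches(patches, hard)
```

With hard indicators, `select_patches` simply copies the two highest-scoring patches. With the averaged indicators, each output row is a convex combination of patches, so small changes in the scores change the output smoothly, which is the property that lets a scoring network be trained by backpropagation.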




Generating Superpixels for High-resolution Images with Decoupled Patch Calibration

Superpixel segmentation has recently seen important progress benefiting ...

Weakly-supervised Discriminative Patch Learning via CNN for Fine-grained Recognition

Research on fine-grained recognition has recently shifted from multistag...

Bag of Visual Words (BoVW) with Deep Features – Patch Classification Model for Limited Dataset of Breast Tumours

Currently, the computational complexity limits the training of high reso...

Gigapixel Histopathological Image Analysis using Attention-based Neural Networks

Although CNNs are widely considered as the state-of-the-art models in va...

Finding a Needle in the Haystack: Attention-Based Classification of High Resolution Microscopy Images

Deep learning for classification of microscopy images is challenging bec...

PatchDropout: Economizing Vision Transformers Using Patch Dropout

Vision transformers have demonstrated the potential to outperform CNNs i...

Learning to Zoom: a Saliency-Based Sampling Layer for Neural Networks

We introduce a saliency-based distortion layer for convolutional neural ...