Processing Megapixel Images with Deep Attention-Sampling Models

05/03/2019
by Angelos Katharopoulos, et al.

Existing deep architectures cannot operate on very large signals such as megapixel images due to computational and memory constraints. To tackle this limitation, we propose a fully differentiable, end-to-end trainable model that samples and processes only a fraction of the full-resolution input image. The locations to process are sampled from an attention distribution computed from a low-resolution view of the input. We refer to our method as attention sampling; it can process images of several megapixels on a standard single-GPU setup. We show that sampling from the attention distribution results in an unbiased estimator of the full model with minimal variance, and we derive an unbiased estimator of the gradient that we use to train our model end-to-end with a normal SGD procedure. We evaluate the method on three classification tasks, where we show that it reduces computation and memory footprint by an order of magnitude at the same accuracy as classical architectures. We also show that the sampling is consistent and indeed focuses on the informative parts of the input images.
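
The sampling step described above can be illustrated with a minimal NumPy sketch of the forward pass only. It is not the authors' implementation. All function names, the `scale` and `patch_size` parameters, and the placeholder `feature_fn` are hypothetical. The sketch assumes that the "full model" is the attention-weighted average of per-patch features and that locations are drawn i.i.d. with replacement. The gradient estimator mentioned in the abstract is not covered.

```python
import numpy as np

def attention_distribution(low_res_scores):
    """Softmax over a 2D score map, giving probabilities that sum to 1."""
    z = low_res_scores - low_res_scores.max()  # shift for numerical stability
    p = np.exp(z)
    return p / p.sum()

def sample_locations(attention, n_samples, rng):
    """Draw n_samples (row, col) positions i.i.d. from the attention map."""
    flat = rng.choice(attention.size, size=n_samples, p=attention.ravel())
    return np.stack(np.unravel_index(flat, attention.shape), axis=1)

def extract_patch(image, loc, scale, patch_size):
    """Crop the full-resolution patch that corresponds to a low-res location."""
    h, w = image.shape[:2]
    top = min(int(loc[0]) * scale, h - patch_size)
    left = min(int(loc[1]) * scale, w - patch_size)
    return image[top:top + patch_size, left:left + patch_size]

def estimate_features(image, attention, feature_fn, n_samples,
                      scale, patch_size, rng):
    """Monte Carlo estimate of sum_i attention_i * feature_fn(patch_i).

    Because locations are sampled from the attention distribution itself,
    the plain average of the sampled patch features is an unbiased estimate
    of the attention-weighted sum over all locations.
    """
    locs = sample_locations(attention, n_samples, rng)
    feats = [feature_fn(extract_patch(image, loc, scale, patch_size))
             for loc in locs]
    return np.mean(feats, axis=0), locs
```

In this sketch only `n_samples` patches are ever passed to `feature_fn`, which is where the savings in computation and memory would come from when `feature_fn` is an expensive network and `n_samples` is much smaller than the number of candidate locations.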

research
06/04/2021

Efficient Classification of Very Large Images with Tiny Objects

An increasing number of applications in the computer vision domain, spec...
research
11/20/2018

Finding a Needle in the Haystack: Attention-Based Classification of High Resolution Microscopy Images

Deep learning for classification of microscopy images is challenging bec...
research
01/31/2023

Patch Gradient Descent: Training Neural Networks on Very Large Images

Traditional CNN models are trained and tested on relatively low resoluti...
research
03/22/2018

End-to-End Learning for the Deep Multivariate Probit Model

The multivariate probit model (MVP) is a popular classic model for study...
research
02/19/2023

Mimicking a Pathologist: Dual Attention Model for Scoring of Gigapixel Histology Images

Some major challenges associated with the automated processing of whole ...
research
07/30/2020

End-to-end Full Projector Compensation

Full projector compensation aims to modify a projector input image to co...
research
06/09/2020

ComboNet: Combined 2D 3D Architecture for Aorta Segmentation

3D segmentation with deep learning if trained with full resolution is th...
