Glance and Focus: a Dynamic Approach to Reducing Spatial Redundancy in Image Classification

10/11/2020
by Yulin Wang, et al.

The accuracy of deep convolutional neural networks (CNNs) generally improves when they are fed high-resolution images. However, this often comes at a high computational cost and a large memory footprint. Inspired by the fact that not all regions in an image are task-relevant, we propose a novel framework that performs efficient image classification by processing a sequence of relatively small inputs, which are strategically selected from the original image with reinforcement learning. Such a dynamic decision process naturally facilitates adaptive inference at test time, i.e., it can be terminated once the model is sufficiently confident about its prediction, thus avoiding further redundant computation. Notably, our framework is general and flexible, as it is compatible with most state-of-the-art lightweight CNNs (such as MobileNets, EfficientNets and RegNets), which can be conveniently deployed as the backbone feature extractor. Experiments on ImageNet show that our method consistently improves the computational efficiency of a wide variety of deep models. For example, it further reduces the average latency of the highly efficient MobileNet-V3 on an iPhone XS Max by 20% without sacrificing accuracy. Code and pre-trained models are available at https://github.com/blackfeather-wang/GFNet-Pytorch.
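
The confidence-based early termination described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `threshold` parameter, and the idea of passing in precomputed per-step logits are assumptions made for illustration, and the reinforcement-learning patch selection is not shown. Each entry of `step_logits` stands in for the classifier output after one more small input has been processed.

```python
import math
from typing import Iterable, List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_inference(
    step_logits: Iterable[Sequence[float]],
    threshold: float,
) -> Tuple[int, float, int]:
    """Consume per-step logits and stop once confidence reaches `threshold`.

    `step_logits` is a hypothetical stand-in for the model's prediction
    after each small input in the sequence. Returns the predicted class,
    its confidence (max softmax probability), and the number of steps used.
    """
    pred, conf, steps = -1, 0.0, 0
    for logits in step_logits:
        steps += 1
        probs = softmax(logits)
        conf = max(probs)
        pred = probs.index(conf)
        if conf >= threshold:
            # Confident enough: skip the remaining, redundant inputs.
            break
    return pred, conf, steps


# Illustrative logits for three steps on a 3-class problem (made-up values).
example_steps = [
    [0.1, 0.2, 0.0],  # first small input: nearly uniform prediction
    [0.5, 3.0, 0.1],  # second input: class 1 becomes likely
    [0.0, 9.0, 0.0],  # third input: class 1 is near-certain
]
pred, conf, steps = adaptive_inference(example_steps, threshold=0.8)
```

With these made-up values, a lower threshold exits after fewer steps and a higher one uses more, which is the accuracy-versus-compute trade-off that adaptive inference exposes at test time.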

research
01/09/2022

Glance and Focus Networks for Dynamic Visual Recognition

Spatial redundancy widely exists in visual recognition tasks, i.e., disc...
research
10/12/2022

Latency-aware Spatial-wise Dynamic Networks

Spatial-wise dynamic convolution has become a promising approach to impr...
research
07/29/2020

Fully Dynamic Inference with Deep Neural Networks

Modern deep neural networks are powerful and widely applicable models th...
research
05/07/2021

Adaptive Focus for Efficient Video Recognition

In this paper, we explore the spatial redundancy in video recognition wi...
research
12/02/2021

Temporally Resolution Decrement: Utilizing the Shape Consistency for Higher Computational Efficiency

Image resolution that has close relations with accuracy and computationa...
research
06/07/2022

Localizing Semantic Patches for Accelerating Image Classification

Existing works often focus on reducing the architecture redundancy for a...
research
11/18/2019

ISP4ML: Understanding the Role of Image Signal Processing in Efficient Deep Learning Vision Systems

Convolutional neural networks (CNNs) are now predominant components in a...