Dynamic Perceiver for Efficient Visual Recognition

06/20/2023
by Yizeng Han, et al.

Early exiting has become a promising approach to improving the inference efficiency of deep networks. By structuring models with multiple classifiers (exits), predictions for “easy” samples can be generated at earlier exits, removing the need to execute deeper layers. Current multi-exit networks typically implement linear classifiers at intermediate layers, compelling low-level features to encapsulate high-level semantics. This sub-optimal design invariably undermines the performance of later exits. In this paper, we propose Dynamic Perceiver (Dyn-Perceiver) to decouple the feature extraction procedure and the early classification task with a novel dual-branch architecture. A feature branch serves to extract image features, while a classification branch processes a latent code assigned for classification tasks. Bi-directional cross-attention layers are established to progressively fuse the information of both branches. Early exits are placed exclusively within the classification branch, thus eliminating the need for linear separability in low-level features. Dyn-Perceiver constitutes a versatile and adaptable framework that can be built upon various architectures. Experiments on image classification, action recognition, and object detection demonstrate that our method significantly improves the inference efficiency of different backbones, outperforming numerous competitive approaches across a broad range of computational budgets. Evaluations on both CPU and GPU platforms substantiate the superior practical efficiency of Dyn-Perceiver. Code is available at https://www.github.com/LeapLabTHU/Dynamic_Perceiver.
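To make the early-exiting idea concrete, the following is a minimal sketch of inference in a generic multi-exit model, not the Dyn-Perceiver implementation. It assumes a confidence-threshold exit rule (stop at the first exit whose maximum softmax probability reaches a threshold), which the abstract does not specify. The names `early_exit_predict`, `stages`, `exits`, and `threshold` are hypothetical, and the toy stages and exit heads are stand-ins for real network blocks.

```python
import math
from typing import Callable, List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit_predict(
    stages: Sequence[Callable[[float], float]],
    exits: Sequence[Callable[[float], List[float]]],
    x: float,
    threshold: float = 0.9,
) -> Tuple[int, int]:
    """Run stages in order and stop at the first sufficiently confident exit.

    Returns (predicted_class, exit_index). The threshold rule is an
    assumption made for illustration; the last exit always answers.
    """
    state = x
    for i, (stage, exit_head) in enumerate(zip(stages, exits)):
        state = stage(state)               # compute the next block of features
        probs = softmax(exit_head(state))  # classify at this exit
        confidence = max(probs)
        is_last = i == len(stages) - 1
        if confidence >= threshold or is_last:
            return probs.index(confidence), i
    raise ValueError("at least one stage is required")


# Toy stand-ins (hypothetical): each stage doubles the feature, and each
# exit head maps the feature to two logits that separate as it grows.
stages = [lambda s: s * 2.0] * 3
exits = [lambda s: [s, -s]] * 3

easy = early_exit_predict(stages, exits, 1.0)  # confident at the first exit
hard = early_exit_predict(stages, exits, 0.1)  # falls through to the last exit
```

In this toy setup the "easy" input leaves at the first exit and skips the two remaining stages, while the "hard" input runs the full model. In Dyn-Perceiver, the exits would sit in the classification branch rather than on the image features themselves.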
