Systematic Architectural Design of Scale Transformed Attention Condenser DNNs via Multi-Scale Class Representational Response Similarity Analysis

06/16/2023
by Andrew Hryniowski, et al.

Self-attention mechanisms are commonly included in convolutional neural networks to achieve an improved efficiency-performance balance. However, adding self-attention mechanisms introduces additional hyperparameters that must be tuned for the application at hand. In this work we propose a novel type of DNN analysis called Multi-Scale Class Representational Response Similarity Analysis (ClassRepSim), which can be used to identify specific design interventions that lead to more efficient self-attention convolutional neural network architectures. Using insights gained from ClassRepSim, we propose the Spatial Transformed Attention Condenser (STAC) module, a novel attention-condenser-based self-attention module. We show that adding STAC modules to ResNet-style architectures can result in up to a 1.6% gain in top-1 accuracy compared to vanilla ResNet models and up to a 0.5% gain compared to SENet models on the ImageNet64x64 dataset, at the cost of up to a 1.7% increase in FLOPs and 2x the number of parameters. In addition, we demonstrate that results from ClassRepSim analysis can be used to select an effective parameterization of the STAC module, resulting in competitive performance compared to an extensive parameter search.
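For readers unfamiliar with attention condensers, the following minimal PyTorch sketch illustrates the general condense-embed-expand-gate pattern that attention-condenser self-attention modules follow, under the assumption that STAC adopts a similar structure. The AttentionCondenser class, its reduction and scale parameters, and the pooling and interpolation choices are illustrative assumptions, not the authors' STAC implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionCondenser(nn.Module):
    """Hypothetical condense-embed-expand-gate attention module.

    Illustrates the attention-condenser pattern; not the paper's STAC code.
    """

    def __init__(self, channels: int, reduction: int = 4, scale: int = 2):
        super().__init__()
        self.scale = scale  # assumed spatial downsampling factor
        mid = max(channels // reduction, 1)
        # Learn a condensed embedding on the downsampled feature map.
        self.embed = nn.Sequential(
            nn.Conv2d(channels, mid, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid, channels, kernel_size=3, padding=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Condense: shrink spatial resolution so attention is cheap to compute.
        q = F.max_pool2d(x, kernel_size=self.scale)
        # Embed: compute a condensed self-attention representation.
        a = self.embed(q)
        # Expand: restore the input's spatial resolution.
        a = F.interpolate(a, size=x.shape[-2:], mode="nearest")
        # Gate: modulate input features with sigmoid attention values,
        # analogous to the channel gating used in SENet-style modules.
        return x * torch.sigmoid(a)

# Example: gate a ResNet block's 64-channel output feature map.
# feats = torch.randn(1, 64, 32, 32)
# out = AttentionCondenser(channels=64, scale=2)(feats)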


Related research

- 12/23/2021: Assessing the Impact of Attention and Self-Attention Mechanisms on the Classification of Skin Lesions
  Attention mechanisms have raised significant interest in the research co...

- 06/03/2022: EAANet: Efficient Attention Augmented Convolutional Networks
  Humans can effectively find salient regions in complex scenes. Self-atte...

- 10/12/2021: MEDUSA: Multi-scale Encoder-Decoder Self-Attention Deep Neural Network Architecture for Medical Image Analysis
  Medical image analysis continues to hold interesting challenges given th...

- 08/06/2020: Joint Self-Attention and Scale-Aggregation for Self-Calibrated Deraining Network
  In the field of multimedia, single image deraining is a basic pre-proces...

- 07/09/2022: Attention and Self-Attention in Random Forests
  New models of random forests jointly using the attention and self-attent...

- 04/13/2023: ASR: Attention-alike Structural Re-parameterization
  The structural re-parameterization (SRP) technique is a novel deep learn...

- 11/24/2021: NAM: Normalization-based Attention Module
  Recognizing less salient features is the key for model compression. Howe...
