Local Learning on Transformers via Feature Reconstruction

12/29/2022
by Priyank Pathak, et al.

Transformers are becoming increasingly popular due to their superior performance over conventional convolutional neural networks (CNNs). However, transformers usually require much more memory to train than CNNs, which prevents their application in many low-resource settings. Local learning, which divides the network into several distinct modules and trains them individually, is a promising alternative to the end-to-end (E2E) training approach: it reduces the memory required for training and increases parallelism. This paper is the first to apply local learning to transformers for this purpose. The standard CNN-based local learning method, InfoPro [32], reconstructs the input images for each module in a CNN. However, reconstructing the entire image does not generalize well. In this paper, we propose a new mechanism for each local module: instead of reconstructing the entire image, we reconstruct the module's input features, generated by the previous modules. We evaluate our approach on 4 commonly used datasets and 3 commonly used decoder structures on Swin-Tiny. The experiments show that our approach outperforms InfoPro-Transformer, the InfoPro with Transformer backbone we introduced, by up to 0.58 when the network is divided into 2 modules and 0.45 when the network is divided into 4 modules.
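The core mechanism above can be sketched in a few lines. In this illustrative NumPy example (not the paper's implementation; the module maps, decoder, and dimensions are all stand-in assumptions, whereas the paper uses Swin-Tiny stages and learned decoders), each local module is paired with an auxiliary decoder that tries to reconstruct the module's input features, i.e. the output of the previous module, rather than the raw image:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

# Two "local modules", each a random linear map + ReLU. These are
# stand-ins for transformer stages (the paper divides Swin-Tiny
# into 2 or 4 such modules).
W1 = rng.normal(scale=0.1, size=(64, 32))   # module 1: 64-d -> 32-d features
W2 = rng.normal(scale=0.1, size=(32, 16))   # module 2: 32-d -> 16-d features

# Auxiliary decoder for module 2. It maps the module's output back to
# the dimensionality of the module's *input features* -- not back to
# the input image, which is the InfoPro approach the paper moves away from.
D2 = rng.normal(scale=0.1, size=(16, 32))

x = rng.normal(size=(8, 64))    # a batch of 8 feature vectors (illustrative)

f1 = relu(x @ W1)   # output of module 1 = input features of module 2
f2 = relu(f1 @ W2)  # output of module 2

# Local reconstruction loss for module 2: mean squared error between the
# decoded output and the features produced by the previous module. During
# training, this loss is backpropagated only within module 2, so no
# end-to-end gradient flow across modules is required.
recon = f2 @ D2
loss_recon = np.mean((recon - f1) ** 2)
print(float(loss_recon))
```

Because each module's loss depends only on its own input and output, the modules can be trained with separate optimizers and do not need to hold the whole network's activations in memory at once, which is the source of the memory savings the abstract describes.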


