WaveMix-Lite: A Resource-efficient Neural Network for Image Analysis

05/28/2022
by   Pranav Jeevan, et al.

Gains in the ability of neural networks to generalize on image analysis tasks have come at the cost of more parameters and layers, larger datasets, more training and test computation, and more GPU RAM. We introduce a new architecture – WaveMix-Lite – that can generalize on par with contemporary transformers and convolutional neural networks (CNNs) while needing fewer resources. WaveMix-Lite uses a 2D discrete wavelet transform (2D-DWT) to efficiently mix spatial information from pixels. WaveMix-Lite appears to be a versatile and scalable architectural framework that can be used for multiple vision tasks, such as image classification and semantic segmentation, without requiring significant architectural changes, unlike transformers and CNNs. It is able to meet or exceed several accuracy benchmarks while training on a single GPU. For instance, it achieves state-of-the-art accuracy on five EMNIST datasets, outperforms CNNs and transformers on ImageNet-1K (64×64 images), and achieves an mIoU of 75.32 while using one-fifth the number of parameters and half the GPU RAM of comparable CNNs or transformers. Our experiments show that while the convolutional elements of neural architectures exploit the shift-invariance property of images, new types of layers (e.g., the wavelet transform) can exploit additional properties of images, such as scale-invariance and the finite spatial extents of objects.
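The abstract describes the mixing mechanism only at a high level. As a concrete illustration, below is a minimal PyTorch sketch of a WaveMix-Lite-style token-mixing block, assuming a single-level Haar 2D-DWT for spatial mixing, a pointwise MLP over the concatenated sub-bands, and a transposed convolution to restore resolution; the class name WaveMixLiteBlock, the layer ordering, and the hyperparameters are illustrative assumptions, not the authors' exact design.

import torch
import torch.nn as nn

def haar_dwt2(x):
    # Single-level 2D Haar DWT via strided slicing.
    # x: (B, C, H, W) with even H and W; returns (B, 4C, H/2, W/2)
    # with the LL, LH, HL, HH sub-bands stacked along channels.
    a = x[..., 0::2, 0::2]
    b = x[..., 0::2, 1::2]
    c = x[..., 1::2, 0::2]
    d = x[..., 1::2, 1::2]
    ll = (a + b + c + d) / 2
    lh = (a + b - c - d) / 2
    hl = (a - b + c - d) / 2
    hh = (a - b - c + d) / 2
    return torch.cat([ll, lh, hl, hh], dim=1)

class WaveMixLiteBlock(nn.Module):
    # Hypothetical block: the DWT halves resolution and quadruples
    # channels, a pointwise MLP mixes the sub-bands, a transposed
    # convolution restores the original resolution, and a residual
    # connection wraps the whole block. Details are assumptions
    # based on the abstract, not the authors' exact design.
    def __init__(self, channels, mult=2):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Conv2d(4 * channels, mult * channels, kernel_size=1),
            nn.GELU(),
            nn.Conv2d(mult * channels, channels, kernel_size=1),
        )
        self.up = nn.ConvTranspose2d(channels, channels, kernel_size=2, stride=2)
        self.norm = nn.BatchNorm2d(channels)

    def forward(self, x):
        y = haar_dwt2(x)         # (B, 4C, H/2, W/2): parameter-free spatial mixing
        y = self.mlp(y)          # channel mixing across wavelet sub-bands
        y = self.up(y)           # back to (B, C, H, W)
        return self.norm(y + x)  # residual connection

# Usage: mixing spatial information in a 64x64 feature map.
x = torch.randn(2, 16, 64, 64)
block = WaveMixLiteBlock(16)
print(block(x).shape)  # torch.Size([2, 16, 64, 64])

Because the DWT is a fixed, parameter-free transform that also downsamples, the learnable layers in this sketch operate at a quarter of the spatial size, which is one plausible source of the parameter and GPU-RAM savings the abstract reports.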
