Using Caterpillar to Nibble Small-Scale Images

05/28/2023
by   Jin Sun, et al.

Recently, MLP-based models have become popular and attained significant performance on medium-scale datasets (e.g., ImageNet-1K). However, their direct application to small-scale images remains limited. To address this issue, we design a new MLP-based network, namely Caterpillar, built around a key module called Shifted-Pillars-Concatenation (SPC) that exploits the inductive bias of locality. SPC consists of two processes: (1) Pillars-Shift, which shifts all pillars within an image along different directions to generate copies, and (2) Pillars-Concatenation, which captures local information from the discrete shift neighborhoods formed by those copies. Extensive experiments demonstrate strong scalability and superior performance on popular small-scale datasets, as well as performance on ImageNet-1K competitive with recent state-of-the-art methods.
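The two SPC processes described above can be illustrated with a minimal NumPy sketch. This is an assumption-laden toy version, not the paper's implementation: the function name, the default four-direction shift set, and the plain channel-wise concatenation are all illustrative choices (the actual Caterpillar module presumably combines the copies with learned projections).

```python
import numpy as np

def shifted_pillars_concatenation(x, shifts=((0, 1), (0, -1), (1, 0), (-1, 0))):
    """Toy sketch of SPC on a feature map x of shape (H, W, C).

    Each spatial vector x[i, j, :] is a "pillar". `shifts` is a
    hypothetical default: the four axis-aligned directions.
    """
    # Pillars-Shift: roll every pillar along each direction to
    # generate shifted copies of the whole map.
    copies = [np.roll(x, shift=s, axis=(0, 1)) for s in shifts]
    # Pillars-Concatenation: stack the original with its shifted
    # copies channel-wise, so each pillar now carries information
    # from its discrete shift neighborhood.
    return np.concatenate([x] + copies, axis=-1)

x = np.random.rand(8, 8, 16)
y = shifted_pillars_concatenation(x)
print(y.shape)  # (8, 8, 80): original plus four shifted copies
```

After this step, a pointwise MLP over the concatenated channels can mix each pillar with its neighbors, which is how locality enters an otherwise location-agnostic MLP.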


