Learning Inception Attention for Image Synthesis and Image Recognition

12/29/2021
by Jianghao Shen, et al.

Image synthesis and image recognition have witnessed remarkable progress, but often at the cost of computationally expensive training and inference. Learning lightweight yet expressive deep models has emerged as an important and interesting direction. Inspired by the well-known split-transform-aggregate design heuristic in the Inception building block, this paper proposes a Skip-Layer Inception Module (SLIM) that facilitates efficient learning of image synthesis models, and a same-layer variant (also dubbed SLIM) as a stronger alternative to the well-known ResNeXts for image recognition. In SLIM, the input feature map is first split into a number of groups (e.g., 4). Each group is then transformed into a latent style vector (via channel-wise attention) and a latent spatial mask (via spatial attention). The learned latent masks and latent style vectors are aggregated to modulate the target feature map. For generative learning, SLIM is built on the recently proposed lightweight Generative Adversarial Networks (FastGANs), which introduce a skip-layer excitation (SLE) module. For few-shot image synthesis tasks, the proposed SLIM achieves better performance than the SLE work and other related methods. For one-shot image synthesis tasks, it shows a stronger capability of preserving image structure than prior art such as SinGANs. For image classification tasks, the proposed SLIM is used as a drop-in replacement for convolution layers in ResNets (resulting in ResNeXt-like models) and achieves better accuracy on the ImageNet-1000 dataset, with significantly smaller model complexity.
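
The split-transform-aggregate flow described in the abstract can be illustrated with a minimal, parameter-free sketch. This is not the authors' implementation: the function name `slim_modulate`, the use of plain sigmoid-squashed averages in place of learned channel-wise and spatial attention, the averaging of per-group masks, and the requirement that source and target share the same spatial size are all assumptions made purely for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def slim_modulate(source, target, groups=4):
    """Modulate `target` using style gates and a spatial mask from `source`.

    Both inputs are feature maps stored as nested lists shaped
    [channels][height][width]. Hypothetical simplification: the learned
    attention layers are replaced by fixed averages followed by a sigmoid.
    """
    c_src, c_tgt = len(source), len(target)
    height, width = len(target[0]), len(target[0][0])
    if len(source[0]) != height or len(source[0][0]) != width:
        raise ValueError("this sketch assumes equal spatial sizes")
    if c_src % groups or c_tgt % groups:
        raise ValueError("channel counts must be divisible by groups")
    src_per_group = c_src // groups
    tgt_per_group = c_tgt // groups

    style = []  # one multiplicative gate per target channel
    mask_sum = [[0.0] * width for _ in range(height)]

    # Split: slice the source channels into equally sized groups.
    for g in range(groups):
        group = source[g * src_per_group:(g + 1) * src_per_group]

        # Transform (channel-wise attention stand-in): global average
        # pooling over the group, squashed to (0, 1).
        total = sum(v for ch in group for row in ch for v in row)
        pooled = total / (src_per_group * height * width)
        style.extend([sigmoid(pooled)] * tgt_per_group)

        # Transform (spatial attention stand-in): per-location mean
        # over the group's channels, squashed to (0, 1).
        for i in range(height):
            for j in range(width):
                mean_ij = sum(ch[i][j] for ch in group) / src_per_group
                mask_sum[i][j] += sigmoid(mean_ij)

    # Aggregate: average the per-group masks into one spatial mask.
    mask = [[m / groups for m in row] for row in mask_sum]

    # Modulate: scale each target value by its channel gate and the mask.
    return [
        [[target[c][i][j] * style[c] * mask[i][j] for j in range(width)]
         for i in range(height)]
        for c in range(c_tgt)
    ]


# Usage: an all-zero source gives gates and mask values of sigmoid(0) = 0.5,
# so an all-ones target is scaled by 0.5 * 0.5.
source = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(4)]
target = [[[1.0, 1.0], [1.0, 1.0]] for _ in range(4)]
output = slim_modulate(source, target, groups=2)
print(output[0][0][0])  # 0.25
```

In a trained model the two averages would presumably be replaced by learned layers, and in the skip-layer setting the source map would generally come from a different resolution than the target, which this sketch does not handle.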


research
03/25/2020

Learning Layout and Style Reconfigurable GANs for Controllable Image Synthesis

With the remarkable recent progress on learning deep generative models, ...
research
08/02/2021

S^2-MLPv2: Improved Spatial-Shift MLP Architecture for Vision

Recently, MLP-based vision backbones emerge. MLP-based vision architectu...
research
03/24/2023

Factor Decomposed Generative Adversarial Networks for Text-to-Image Synthesis

Prior works about text-to-image synthesis typically concatenated the sen...
research
03/29/2023

WordStylist: Styled Verbatim Handwritten Text Generation with Latent Diffusion Models

Text-to-Image synthesis is the task of generating an image according to ...
research
05/13/2022

ImageSig: A signature transform for ultra-lightweight image recognition

This paper introduces a new lightweight method for image recognition. Im...
research
04/18/2020

Example-Guided Image Synthesis across Arbitrary Scenes using Masked Spatial-Channel Attention and Self-Supervision

Example-guided image synthesis has recently been attempted to synthesize...
research
11/20/2019

An Inception Inspired Deep Network to Analyse Fundus Images

A fundus image usually contains the optic disc, pathologies and other st...
