SPIN: An Empirical Evaluation on Sharing Parameters of Isotropic Networks

07/21/2022
by Chien-Yu Lin, et al.

Recent isotropic networks, such as ConvMixer and vision transformers, have found significant success across visual recognition tasks, matching or outperforming non-isotropic convolutional neural networks (CNNs). Isotropic architectures are particularly well-suited to cross-layer weight sharing, an effective neural network compression technique. In this paper, we perform an empirical evaluation of methods for sharing parameters in isotropic networks (SPIN). We present a framework that formalizes the major weight sharing design decisions and perform a comprehensive empirical evaluation of this design space. Guided by our experimental results, we propose a weight sharing strategy that generates a family of models with better overall efficiency, measured as accuracy versus FLOPs and parameters, than traditional scaling methods alone; for example, it compresses ConvMixer by 1.9x while improving accuracy on ImageNet. Finally, we perform a qualitative study to further understand the behavior of weight sharing in isotropic architectures. The code is available at https://github.com/apple/ml-spin.
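The core idea is easy to illustrate: because every block in an isotropic network has the same shape, one set of block weights can be applied at every depth step. Below is a minimal PyTorch sketch of cross-layer weight sharing in a ConvMixer-style network. This is our own illustration under stated assumptions, not the authors' ml-spin implementation; the class names, hyperparameters, and the choice to share every component (including normalization) are all illustrative.

```python
import torch
import torch.nn as nn

class ConvMixerBlock(nn.Module):
    """One isotropic block: depthwise conv (token mixing) + pointwise conv (channel mixing)."""
    def __init__(self, dim, kernel_size=9):
        super().__init__()
        self.token_mix = nn.Sequential(
            nn.Conv2d(dim, dim, kernel_size, groups=dim, padding="same"),
            nn.GELU(),
            nn.BatchNorm2d(dim),
        )
        self.channel_mix = nn.Sequential(
            nn.Conv2d(dim, dim, kernel_size=1),
            nn.GELU(),
            nn.BatchNorm2d(dim),
        )

    def forward(self, x):
        x = x + self.token_mix(x)  # residual connection around token mixing
        return self.channel_mix(x)

class SharedConvMixer(nn.Module):
    """ConvMixer-style network that reuses a single block's weights across all layers."""
    def __init__(self, dim=256, depth=8, kernel_size=9, patch_size=7, num_classes=1000):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(3, dim, patch_size, stride=patch_size),
            nn.GELU(),
            nn.BatchNorm2d(dim),
        )
        # One block instance applied `depth` times: block parameters shrink
        # roughly by a factor of `depth`, while FLOPs stay unchanged.
        self.shared_block = ConvMixerBlock(dim, kernel_size)
        self.depth = depth
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(dim, num_classes)
        )

    def forward(self, x):
        x = self.stem(x)
        for _ in range(self.depth):
            x = self.shared_block(x)  # same weights at every depth step
        return self.head(x)

model = SharedConvMixer()
print(model(torch.randn(1, 3, 224, 224)).shape)  # torch.Size([1, 1000])
```

Note that this sketch shares everything, including the BatchNorm layers, even though activation statistics can differ across depth steps; whether to share or keep per-layer copies of such components is exactly the kind of design decision the paper's framework is meant to formalize and evaluate.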


Related research

06/07/2022 · Can CNNs Be More Robust Than Transformers?
The recent success of Vision Transformers is shaking the long dominance ...

05/30/2023 · Are Large Kernels Better Teachers than Transformers for ConvNets?
This paper reveals a new appeal of the recently emerged large-kernel Con...

09/30/2022 · MobileViTv3: Mobile-Friendly Vision Transformer with Simple and Effective Fusion of Local, Global and Input Features
MobileViT (MobileViTv1) combines convolutional neural networks (CNNs) an...

07/23/2020 · WeightNet: Revisiting the Design Space of Weight Networks
We present a conceptually simple, flexible and effective framework for w...

05/17/2022 · ShiftAddNAS: Hardware-Inspired Search for More Accurate and Efficient Neural Networks
Neural networks (NNs) with intensive multiplications (e.g., convolutions...

08/22/2016 · Lets keep it simple, Using simple architectures to outperform deeper and more complex architectures
Major winning Convolutional Neural Networks (CNNs), such as AlexNet, VGG...

02/06/2013 · Cost-Sharing in Bayesian Knowledge Bases
Bayesian knowledge bases (BKBs) are a generalization of Bayes networks a...
