Balanced Mixture of SuperNets for Learning the CNN Pooling Architecture

06/21/2023
by Mehraveh Javan et al.

Downsampling layers, including pooling and strided convolutions, are crucial components of the convolutional neural network architecture that determine both the granularity/scale of image feature analysis and the receptive field size of a given layer. To fully understand this problem, we analyse the performance of models trained independently with each pooling configuration on CIFAR10 using a ResNet20 network, and show that the position of the downsampling layers can strongly influence a network's performance and that predefined downsampling configurations are not optimal. Neural Architecture Search (NAS) might be used to optimize downsampling configurations as a hyperparameter. However, we find that common one-shot NAS based on a single SuperNet does not work for this problem. We argue that this is because a SuperNet trained to find the optimal pooling configuration fully shares its parameters among all pooling configurations, which makes training difficult because learning some configurations can harm the performance of others. We therefore propose a balanced mixture of SuperNets that automatically associates pooling configurations with different weight models, reducing the weight sharing and mutual influence of pooling configurations on the SuperNet parameters. We evaluate the proposed approach on CIFAR10, CIFAR100, and Food101 and show that in all cases it outperforms other approaches and improves over the default pooling configurations.
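To make the routing idea concrete, below is a minimal, hypothetical sketch (not the authors' code) of how sampled pooling configurations could be dispatched to one of K SuperNet weight copies during training. The network, the constants (NUM_BLOCKS, K_SUPERNETS), and the simple assignment update are illustrative assumptions; the paper uses a balanced assignment rather than the placeholder update shown here.

```python
# Hypothetical sketch of mixture-of-SuperNets routing for pooling configurations.
# TinySuperNet and the assignment update are illustrative, not the paper's method.
import itertools
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_BLOCKS = 6          # candidate positions for downsampling
NUM_DOWNSAMPLES = 3     # how many downsampling operations the network applies
K_SUPERNETS = 4         # number of SuperNets in the mixture

# A pooling configuration = the set of block indices where downsampling happens.
CONFIGS = list(itertools.combinations(range(NUM_BLOCKS), NUM_DOWNSAMPLES))

class TinySuperNet(nn.Module):
    """Small CNN whose blocks downsample only where the sampled configuration says so."""
    def __init__(self, channels=16, num_classes=10):
        super().__init__()
        self.stem = nn.Conv2d(3, channels, 3, padding=1)
        self.blocks = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=1) for _ in range(NUM_BLOCKS)
        )
        self.head = nn.Linear(channels, num_classes)

    def forward(self, x, config):
        x = F.relu(self.stem(x))
        for i, conv in enumerate(self.blocks):
            x = F.relu(conv(x))
            if i in config:               # downsample at the chosen positions
                x = F.avg_pool2d(x, 2)
        x = x.mean(dim=(2, 3))            # global average pooling
        return self.head(x)

# K SuperNets = K independent weight copies; each configuration is routed to one
# of them, which limits harmful weight sharing across pooling configurations.
supernets = [TinySuperNet() for _ in range(K_SUPERNETS)]
optimizers = [torch.optim.SGD(m.parameters(), lr=0.05) for m in supernets]
# assignment[c, k]: unnormalized preference of configuration c for SuperNet k.
assignment = torch.zeros(len(CONFIGS), K_SUPERNETS)

def train_step(images, labels, temperature=1.0):
    c = torch.randint(len(CONFIGS), (1,)).item()       # sample a pooling configuration
    probs = F.softmax(assignment[c] / temperature, dim=0)
    k = torch.multinomial(probs, 1).item()              # route it to a SuperNet
    model, opt = supernets[k], optimizers[k]
    loss = F.cross_entropy(model(images, CONFIGS[c]), labels)
    opt.zero_grad(); loss.backward(); opt.step()
    # Placeholder assignment update: favour SuperNets that fit this configuration well.
    with torch.no_grad():
        assignment[c, k] -= loss.item()
    return loss.item()

# Usage with random tensors standing in for CIFAR10 batches.
x = torch.randn(8, 3, 32, 32)
y = torch.randint(0, 10, (8,))
for _ in range(3):
    print(train_step(x, y))
```

In this sketch the routing distribution is learned per configuration, so configurations that interfere with each other can drift toward different SuperNets instead of overwriting shared weights.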

