Cream of the Crop: Distilling Prioritized Paths For One-Shot Neural Architecture Search

10/29/2020
by Houwen Peng, et al.

One-shot weight-sharing methods have recently drawn great attention in neural architecture search due to their high efficiency and competitive performance. However, weight sharing across models has an inherent deficiency: insufficient training of the subnetworks within the hypernetwork. To alleviate this problem, we present a simple yet effective architecture distillation method. The central idea is that subnetworks can learn collaboratively and teach each other throughout the training process, with the aim of boosting the convergence of individual models. We introduce the concept of a prioritized path, which refers to an architecture candidate that exhibits superior performance during training. Distilling knowledge from the prioritized paths boosts the training of subnetworks. Since the prioritized paths change on the fly depending on their performance and complexity, the paths finally obtained are the cream of the crop. We directly select the most promising one from the prioritized paths as the final architecture, without resorting to other complex search methods such as reinforcement learning or evolutionary algorithms. Experiments on ImageNet verify that such path distillation improves the convergence rate and performance of the hypernetwork, as well as the training of the subnetworks. The discovered architectures achieve superior performance compared to the recent MobileNetV3 and EfficientNet families under aligned settings. Moreover, experiments on object detection and a more challenging search space demonstrate the generality and robustness of the proposed method. Code and models are available at https://github.com/microsoft/cream.git.
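As a minimal illustration of the idea, the sketch below maintains a small board of top-scoring paths, ranked by a toy accuracy-versus-complexity score, and distills each randomly sampled subnetwork toward the current best prioritized path. Every name here (ToySupernet, path_complexity, PrioritizedPathBoard, the board capacity, and the dummy data) is a hypothetical stand-in for illustration, not the released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToySupernet(nn.Module):
    """Stand-in for a weight-sharing hypernetwork: a path picks one
    candidate op per layer, and all paths share the same weights."""
    def __init__(self, num_layers=4, num_ops=3, dim=16, num_classes=10):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.ModuleList(nn.Linear(dim, dim) for _ in range(num_ops))
            for _ in range(num_layers)
        )
        self.head = nn.Linear(dim, num_classes)

    def forward(self, x, path):
        for ops, op_idx in zip(self.layers, path):
            x = F.relu(ops[op_idx](x))
        return self.head(x)

def path_complexity(path):
    # Toy proxy for FLOPs/latency; a real system would measure the cost
    # of the sampled subnetwork.
    return 1.0 + 0.1 * sum(path)

class PrioritizedPathBoard:
    """Keeps the top-K paths seen so far, scored by performance versus
    complexity; membership changes on the fly as better paths appear."""
    def __init__(self, capacity=5):
        self.capacity = capacity
        self.entries = []  # list of (score, path)

    def update(self, path, accuracy):
        self.entries.append((accuracy / path_complexity(path), path))
        self.entries.sort(key=lambda e: e[0], reverse=True)
        del self.entries[self.capacity:]  # keep only the cream of the crop

    def best(self):
        return self.entries[0][1]

supernet = ToySupernet()
board = PrioritizedPathBoard()
opt = torch.optim.SGD(supernet.parameters(), lr=0.1)

for step in range(200):
    x = torch.randn(32, 16)            # dummy batch for illustration
    y = torch.randint(0, 10, (32,))

    # Sample a random subnetwork (one op index per layer).
    path = [int(torch.randint(0, 3, ())) for _ in range(4)]
    logits = supernet(x, path)
    loss = F.cross_entropy(logits, y)

    if board.entries:
        # Distill the sampled subnetwork toward the best prioritized path.
        with torch.no_grad():
            teacher = supernet(x, board.best())
        loss = loss + F.kl_div(F.log_softmax(logits, -1),
                               F.softmax(teacher, -1),
                               reduction="batchmean")

    opt.zero_grad()
    loss.backward()
    opt.step()

    with torch.no_grad():
        acc = (logits.argmax(-1) == y).float().mean().item()
    board.update(path, acc)

# The final architecture is read off the board directly, with no extra
# reinforcement-learning or evolutionary search stage.
print("selected path:", board.best())
```

Dividing accuracy by the complexity proxy is one simple way to bias the board toward paths that are both strong and cheap; the paper's actual scoring and its matching of subnetworks to prioritized paths may weigh these factors differently.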
