Improving One-shot NAS by Suppressing the Posterior Fading

10/06/2019
by   Xiang Li, et al.

There is growing interest in automated neural architecture search (NAS). To improve the efficiency of NAS, previous approaches adopt a weight-sharing method that forces all candidate models to share the same set of weights. However, it has been observed that a model performing better with shared weights does not necessarily perform better when trained alone. In this paper, we analyse existing weight-sharing one-shot NAS approaches from a Bayesian point of view and identify the posterior fading problem, which compromises the effectiveness of shared weights. To alleviate this problem, we present a practical approach that guides the parameter posterior towards its true distribution. Moreover, a hard latency constraint is introduced during the search so that the desired latency can be achieved. The resulting method, namely Posterior Convergent NAS (PC-NAS), achieves state-of-the-art performance under a standard GPU latency constraint on ImageNet. In our small search space, our model PC-NAS-S attains 76.8% top-1 accuracy at the same latency as strong mobile baselines. When applied to the large search space, PC-NAS-L achieves 78.1% top-1 accuracy within 11 ms. The discovered architecture also transfers well to other computer vision applications such as object detection and person re-identification.
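The abstract does not spell out how the hard latency constraint is enforced, but the idea can be illustrated with a minimal sketch: during the search, candidate architectures whose estimated latency exceeds the budget are simply rejected. The latency table, operator names, and the rejection-sampling helper below are hypothetical illustrations, not the paper's actual search procedure.

```python
import random

# Hypothetical per-operator latency table (ms). In practice such tables
# are measured on the target GPU; these numbers are made up.
LATENCY_MS = {
    "mbconv3_k3": 0.6,
    "mbconv3_k5": 0.8,
    "mbconv6_k3": 1.1,
    "mbconv6_k5": 1.4,
}
NUM_LAYERS = 10
BUDGET_MS = 11.0  # target budget, e.g. the 11 ms cited for PC-NAS-L


def estimated_latency(arch):
    """Sum per-operator latencies as a proxy for end-to-end latency."""
    return sum(LATENCY_MS[op] for op in arch)


def sample_architecture_within_budget(max_tries=1000):
    """Rejection-sample an architecture satisfying the hard latency
    constraint (a sketch, not the paper's exact method)."""
    ops = list(LATENCY_MS)
    for _ in range(max_tries):
        arch = [random.choice(ops) for _ in range(NUM_LAYERS)]
        if estimated_latency(arch) <= BUDGET_MS:
            return arch
    raise RuntimeError("no architecture found within the latency budget")


if __name__ == "__main__":
    arch = sample_architecture_within_budget()
    print(arch, f"{estimated_latency(arch):.1f} ms")
```

A lookup-table latency model like this is a common design choice in latency-constrained NAS because summing measured per-operator costs is far cheaper than benchmarking every candidate end to end.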


