Small Temperature is All You Need for Differentiable Architecture Search

06/12/2023
by   Jiuling Zhang, et al.

Differentiable architecture search (DARTS) yields highly efficient gradient-based neural architecture search (NAS) by relaxing the discrete operation selection into continuous architecture parameters, which maps NAS from a discrete optimization problem to a continuous one. DARTS then remaps the relaxed supernet back to the discrete space by one-off post-search pruning to obtain the final architecture (finalnet). Some emerging works argue that this remap is inherently prone to a mismatch between the network in training and the network in evaluation, which leads to a performance discrepancy and, in extreme cases, even model collapse. We propose to close the gap between the relaxed supernet in training and the pruned finalnet in evaluation by using a small temperature to sparsify the continuous distribution in the training phase. To this end, we first formulate sparse-noisy softmax to get around gradient saturation. We then propose an exponential temperature schedule to better control the outbound distribution and elaborate an entropy-based adaptive scheme to finally achieve the enhancement. We conduct extensive experiments to verify the efficiency and efficacy of our method.
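To make the role of temperature concrete, the following is a minimal sketch of how a small temperature sparsifies a softmax over architecture parameters, paired with one possible exponentially decaying schedule. It is an illustration only: it uses a plain temperature-scaled softmax, not the sparse-noisy softmax formulated in the paper, and it omits the entropy-based adaptive scheme. The function names, the example logits, and the schedule endpoints (1.0 down to 0.01) are all assumptions chosen for the example.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Softmax of logits / temperature, shifted by the max for numerical stability."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; lower values mean a sparser distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def exponential_temperature(step, total_steps, t_start=1.0, t_end=0.01):
    """Hypothetical schedule: geometric interpolation from t_start to t_end."""
    frac = step / max(total_steps - 1, 1)
    return t_start * (t_end / t_start) ** frac


# Hypothetical architecture parameters for one edge with four candidate operations.
alphas = [0.3, 0.1, -0.2, 0.05]

for step in (0, 25, 49):
    t = exponential_temperature(step, total_steps=50)
    probs = softmax_with_temperature(alphas, t)
    print(f"step={step:2d}  T={t:.4f}  entropy={entropy(probs):.4f}  "
          f"probs={[round(p, 3) for p in probs]}")
```

As the temperature shrinks, the distribution concentrates on the operation with the largest parameter and its entropy falls toward zero, so the relaxed mixture looks increasingly like the single operation that pruning would keep. A plain softmax at very small temperature also produces vanishing gradients for the non-selected operations, which is the saturation issue the abstract says sparse-noisy softmax is designed to get around.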


research
06/12/2023

Robustifying DARTS by Eliminating Information Bypass Leakage via Explicit Sparse Regularization

Differentiable architecture search (DARTS) is a promising end to end NAS...
research
06/18/2020

DrNAS: Dirichlet Neural Architecture Search

This paper proposes a novel differentiable architecture search method by...
research
08/10/2021

Rethinking Architecture Selection in Differentiable NAS

Differentiable Neural Architecture Search is one of the most popular Neu...
research
08/10/2020

RARTS: a Relaxed Architecture Search Method

Differentiable architecture search (DARTS) is an effective method for da...
research
09/28/2021

Delve into the Performance Degradation of Differentiable Architecture Search

Differentiable architecture search (DARTS) is widely considered to be ea...
research
05/06/2019

Differentiable Architecture Search with Ensemble Gumbel-Softmax

For network architecture search (NAS), it is crucial but challenging to ...
research
08/17/2022

Field-wise Embedding Size Search via Structural Hard Auxiliary Mask Pruning for Click-Through Rate Prediction

Feature embeddings are one of the most essential steps when training dee...
