Revisiting the Train Loss: an Efficient Performance Estimator for Neural Architecture Search

06/08/2020
by   Binxin Ru, et al.

Reliable yet efficient evaluation of the generalisation performance of a proposed architecture is crucial to the success of neural architecture search (NAS). Traditional approaches face a variety of limitations: training each architecture to completion is prohibitively expensive, early-stopping estimates may correlate poorly with fully trained performance, and model-based estimators require large training sets of fully evaluated architectures. Instead, motivated by recent results linking training speed and generalisation under stochastic gradient descent, we propose to estimate the final test performance from the sum of the training losses. Our estimator is inspired by the marginal likelihood used for Bayesian model selection. This model-free estimator is simple, efficient, and cheap to implement, and requires neither hyperparameter tuning nor surrogate training before deployment. We demonstrate empirically that our estimator consistently outperforms other baselines and can achieve a rank correlation of 0.95 with final test accuracy on the NAS-Bench-201 dataset within 50 epochs.

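As an illustration of the idea in the abstract, the sketch below (not the authors' released code) scores each architecture by the negated sum of its per-epoch training losses over a fixed budget and compares the scores against final test accuracy with Spearman rank correlation. The architecture records, loss curves, and the 50-epoch budget are hypothetical placeholders.

```python
# Minimal sketch of a sum-of-training-losses estimator for ranking
# architectures, scored against final test accuracy via Spearman
# rank correlation. Toy data only; not the authors' implementation.
import numpy as np
from scipy.stats import spearmanr

def sotl_score(train_losses, budget_epochs):
    """Negated sum of training losses over the first `budget_epochs` epochs.

    Lower accumulated loss suggests faster training, so negating the sum
    gives a score where higher means a (predicted) better architecture.
    """
    return -float(np.sum(train_losses[:budget_epochs]))

# Hypothetical per-architecture records: per-epoch training losses plus
# the final test accuracy obtained after full training.
archs = [
    {"train_losses": c * np.exp(-0.05 * np.arange(100)), "final_test_acc": acc}
    for c, acc in [(1.0, 0.93), (1.5, 0.90), (0.8, 0.94), (2.0, 0.88)]
]

scores = [sotl_score(a["train_losses"], budget_epochs=50) for a in archs]
final_accs = [a["final_test_acc"] for a in archs]

rho, _ = spearmanr(scores, final_accs)
print(f"Spearman rank correlation with final test accuracy: {rho:.2f}")
```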

Related research

07/28/2023  Shrink-Perturb Improves Architecture Mixing during Population Based Training for Neural Architecture Search
In this work, we show that simultaneously training and mixing neural net...

06/13/2020  Neural Architecture Search using Bayesian Optimisation with Weisfeiler-Lehman Kernel
Bayesian optimisation (BO) has been widely used for hyperparameter optim...

05/30/2017  Accelerating Neural Architecture Search using Performance Prediction
Methods for neural network hyperparameter optimization and meta-modeling...

12/24/2021  DARTS without a Validation Set: Optimizing the Marginal Likelihood
The success of neural architecture search (NAS) has historically been li...

07/18/2018  Towards Automated Deep Learning: Efficient Joint Neural Architecture and Hyperparameter Search
While existing work on neural architecture search (NAS) tunes hyperparam...

08/12/2021  DARTS for Inverse Problems: a Study on Hyperparameter Sensitivity
Differentiable architecture search (DARTS) is a widely researched tool f...

02/23/2022  Bayesian Model Selection, the Marginal Likelihood, and Generalization
How do we compare between hypotheses that are entirely consistent with o...
