Multi-fidelity Neural Architecture Search with Knowledge Distillation

06/15/2020
by Ilya Trofimov, et al.

Neural architecture search (NAS) aims to find the optimal architecture of a neural network for a problem or a family of problems. Evaluating neural architectures is very time-consuming. One possible way to mitigate this issue is to use low-fidelity evaluations, namely training on a part of a dataset, for fewer epochs, with fewer channels, etc. In this paper, we propose to improve low-fidelity evaluations of neural architectures by using knowledge distillation. Knowledge distillation adds a term to the loss function that forces a network to mimic the outputs of a teacher network. We carry out experiments on CIFAR-100 and ImageNet and study various knowledge distillation methods. We show that training on a small part of a dataset with such a modified loss function leads to a better selection of neural architectures than training with a logistic loss. The proposed low-fidelity evaluations were incorporated into a multi-fidelity search algorithm that outperformed a search based only on high-fidelity evaluations (training on the full dataset).
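The abstract does not spell out the exact distillation objective, and the paper studies several KD variants; as a point of reference, a minimal sketch of the classic Hinton-style distillation loss (the `temperature` and `alpha` hyperparameters here are illustrative assumptions, not values from the paper) looks like:

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.9):
    """Logistic (cross-entropy) loss on the true labels plus a KL term
    that forces the student to mimic the teacher's softened outputs."""
    # Standard cross-entropy on the ground-truth labels.
    ce = F.cross_entropy(student_logits, labels)
    # Soften both distributions with the temperature and match them.
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    kd = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits / temperature, dim=1),
        reduction="batchmean",
    ) * temperature ** 2
    return (1.0 - alpha) * ce + alpha * kd
```

In the setting described above, the student is a candidate architecture trained at low fidelity (data subset, few epochs), while the teacher is a fixed pretrained network.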
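Likewise, the abstract describes incorporating these evaluations into a multi-fidelity search without giving the algorithm; a minimal two-stage sketch under that assumption (all names and budgets hypothetical, not the paper's method) is:

```python
import random

def multi_fidelity_search(candidates, low_fidelity_score, high_fidelity_score,
                          budget_low=100, budget_high=10):
    """Hypothetical two-stage multi-fidelity selection: screen many
    architectures cheaply, then fully train only the survivors."""
    # Stage 1: cheap, low-fidelity evaluations (small data subset, few
    # epochs, distillation loss instead of a plain logistic loss).
    screened = random.sample(candidates, min(budget_low, len(candidates)))
    ranked = sorted(screened, key=low_fidelity_score, reverse=True)
    # Stage 2: expensive, high-fidelity evaluations (full dataset) on
    # only the top-ranked architectures from the cheap screen.
    finalists = ranked[:budget_high]
    return max(finalists, key=high_fidelity_score)
```

The point of the paper's result is that making `low_fidelity_score` a distillation-based evaluation improves the ranking produced in the first stage, so the expensive second stage sees better candidates.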
