GPT-NAS: Neural Architecture Search with the Generative Pre-Trained Model

05/09/2023
by   Caiyang Yu, et al.

Neural Architecture Search (NAS) has emerged as an effective method for automatically designing optimal neural network architectures. Although neural architectures have achieved human-level performance on several tasks, few of them were obtained by NAS. The main reason is the huge search space of neural architectures, which makes NAS algorithms inefficient. This work presents a novel architecture search algorithm, called GPT-NAS, that optimizes neural architectures using a Generative Pre-Trained (GPT) model. In GPT-NAS, we assume that a generative model pre-trained on a large-scale corpus can learn the fundamental laws of building neural architectures. GPT-NAS therefore leverages the GPT model to propose reasonable architecture components given a basic one. This approach substantially reduces the search space by introducing prior knowledge into the search process. Extensive experimental results show that GPT-NAS significantly outperforms seven manually designed neural architectures and thirteen architectures produced by competing NAS methods. In addition, our ablation study indicates that the proposed algorithm improves the performance of finely tuned neural architectures by up to about 12%, further demonstrating its effectiveness in searching neural architectures.
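
To make the search-space argument concrete, the following is a minimal sketch of how a learned prior over layer sequences can narrow the candidates considered at each step. It is not the GPT-NAS implementation: a simple bigram count model stands in for the pre-trained GPT model, and the corpus, the layer names, and the functions `fit_bigram_prior` and `propose_next` are all hypothetical, chosen only for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus of layer sequences (an assumption, not from the paper).
CORPUS = [
    ["conv3x3", "bn", "relu", "conv3x3", "bn", "relu", "pool"],
    ["conv3x3", "bn", "relu", "pool", "conv3x3", "bn", "relu"],
    ["conv1x1", "bn", "relu", "conv3x3", "bn", "relu", "pool"],
]


def fit_bigram_prior(corpus):
    """Count how often each layer follows another (a stand-in for pre-training)."""
    counts = defaultdict(Counter)
    for seq in corpus:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return counts


def propose_next(prior, prefix, k=2):
    """Return up to k most frequent successors of the last layer in the prefix."""
    successors = prior.get(prefix[-1], Counter())
    # Sort by descending count, then by name so ties resolve deterministically.
    ranked = sorted(successors.items(), key=lambda kv: (-kv[1], kv[0]))
    return [layer for layer, _ in ranked[:k]]


prior = fit_bigram_prior(CORPUS)
vocab = sorted({layer for seq in CORPUS for layer in seq})

# Without a prior, every layer type in the vocabulary is a candidate at each step.
# With the prior, only the layers it ranks highly for this prefix are explored.
candidates = propose_next(prior, ["conv3x3", "bn"])
print(f"{len(vocab)} layer types in total, {len(candidates)} proposed by the prior")
```

In this toy setting the prior has only ever seen one layer type follow `bn`, so a search procedure would evaluate a single candidate at that step instead of the whole vocabulary. A real generative model would condition on the full prefix rather than only the last layer.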

research
11/25/2019

Binarized Neural Architecture Search

Neural architecture search (NAS) can have a significant impact in comput...
research
10/30/2017

Transfer Learning to Learn with Multitask Neural Model Search

Deep learning models require extensive architecture design exploration a...
research
05/26/2023

DiffusionNAG: Task-guided Neural Architecture Generation with Diffusion Models

Neural Architecture Search (NAS) has emerged as a powerful technique for...
research
12/03/2021

Data-Free Neural Architecture Search via Recursive Label Calibration

This paper aims to explore the feasibility of neural architecture search...
research
02/12/2021

Adversarial Branch Architecture Search for Unsupervised Domain Adaptation

Unsupervised Domain Adaptation (UDA) is a key field in visual recognitio...
research
11/17/2019

Neural Recurrent Structure Search for Knowledge Graph Embedding

Knowledge graph (KG) embedding is a fundamental problem in mining relati...
research
05/25/2021

AutoReCon: Neural Architecture Search-based Reconstruction for Data-free Compression

Data-free compression raises a new challenge because the original traini...
