ScaleNet: Searching for the Model to Scale

07/15/2022
by Jiyang Xie, et al.

Recently, the community has paid increasing attention to model scaling and has contributed to developing model families that cover a wide spectrum of scales. Current methods either resort to one-shot NAS to construct a non-structural, non-scalable model family, or rely on a manual, fixed scaling strategy to scale a base model that is not necessarily the best one. In this paper, we bridge these two components and propose ScaleNet, which jointly searches for the base model and the scaling strategy so that the scaled large models achieve more promising performance. Concretely, we design a super-supernet to embody models across a spectrum of sizes (e.g., FLOPs). The scaling strategy can then be learned interactively with the base model via a Markov chain-based evolution algorithm and generalized to develop even larger models. To obtain a decent super-supernet, we design a hierarchical sampling strategy that enhances its training sufficiency and alleviates disturbance. Experimental results show that our scaled networks achieve significantly better performance at various FLOPs budgets, while reducing search cost by at least 2.53x. Code is available at https://github.com/luminolx/ScaleNet.
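
To make the notion of a "scaling strategy" concrete, here is a minimal Python sketch. It is not the method of the paper: the abstract does not name the scaling dimensions or describe the search procedure, so the sketch assumes depth, width, and resolution multipliers (a common convention in compound scaling) and uses a plain grid enumeration in place of the Markov chain-based evolution algorithm. The names `Scaling` and `candidates_within_budget` are hypothetical.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Scaling:
    """Hypothetical scaling strategy: multipliers applied to a base model."""
    depth: float        # assumed dimension: number of layers
    width: float        # assumed dimension: channels per layer
    resolution: float   # assumed dimension: input image side length

    def flops_multiplier(self) -> float:
        # Common approximation for convolutional networks: FLOPs grow
        # linearly with depth and quadratically with width and resolution.
        return self.depth * self.width ** 2 * self.resolution ** 2


def candidates_within_budget(base_flops, budget_flops, grid, tolerance=0.1):
    """Enumerate scaling triples whose estimated FLOPs land near the budget.

    A real search would also score each candidate by accuracy; this sketch
    only filters by the FLOPs estimate.
    """
    selected = []
    for d, w, r in product(grid, repeat=3):
        strategy = Scaling(d, w, r)
        estimate = base_flops * strategy.flops_multiplier()
        if abs(estimate - budget_flops) <= tolerance * budget_flops:
            selected.append((strategy, estimate))
    return selected


if __name__ == "__main__":
    # Illustrative numbers only; they are not taken from the paper.
    grid = [1.0, 1.2, 1.4, 1.6]
    for strategy, estimate in candidates_within_budget(400e6, 800e6, grid):
        print(strategy, f"{estimate / 1e6:.0f} MFLOPs")
```

Even this toy version shows why the choice matters: many different multiplier triples meet the same FLOPs budget, so a fixed hand-picked rule selects only one of several candidates of equal cost.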
