Finding Fast Transformers: One-Shot Neural Architecture Search by Component Composition

08/15/2020
by Henry Tsai, et al.

Transformer-based models have achieved state-of-the-art results on many natural language processing tasks. However, such models are usually slow at inference time, making deployment difficult. In this paper, we develop an efficient algorithm to search for fast models while maintaining model quality. We describe a novel approach that decomposes the Transformer architecture into smaller components, and propose a sampling-based one-shot architecture search method to find an optimal model for inference. The model search process is more efficient than alternatives, adding only a small overhead to training time. By applying our methods to BERT-base architectures, we achieve a 10% speedup for pre-trained BERT and a 70% speedup on top of a distilled BERT model on Cloud TPU-v2, with a generally acceptable drop in performance.
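The core idea described in the abstract, training one weight-sharing "supernet" over interchangeable Transformer components, sampling sub-architectures during training, and then selecting a fast sub-model rather than training every candidate from scratch, can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation; the component names, latency costs, and quality proxy below are invented for the example.

```python
# Minimal sketch of sampling-based one-shot architecture search over
# Transformer components. All slots, component names, latencies, and the
# quality proxy are hypothetical placeholders, not the paper's actual values.
import random
from itertools import product

# Hypothetical search space: per-layer slots, each with interchangeable
# components and an assumed relative latency cost.
SEARCH_SPACE = {
    "attention": {"full_attention": 1.00, "narrow_attention": 0.60},
    "feed_forward": {"ffn_4x": 1.00, "ffn_2x": 0.55},
}

def sample_architecture():
    """Sample one component per slot, as a one-shot supernet would each training step."""
    return {slot: random.choice(list(options)) for slot, options in SEARCH_SPACE.items()}

def latency(arch):
    """Sum the assumed per-component latency costs of a sampled architecture."""
    return sum(SEARCH_SPACE[slot][comp] for slot, comp in arch.items())

def quality(arch):
    """Placeholder for evaluating a sub-model with the shared supernet weights."""
    # Hypothetical proxy: cheaper components lose a little quality.
    return 1.0 - 0.1 * (2.0 - latency(arch))

def search(quality_floor=0.85):
    """Enumerate candidate architectures and keep the fastest acceptable one."""
    candidates = [dict(zip(SEARCH_SPACE, combo))
                  for combo in product(*(SEARCH_SPACE[s] for s in SEARCH_SPACE))]
    acceptable = [a for a in candidates if quality(a) >= quality_floor]
    return min(acceptable, key=latency) if acceptable else None

if __name__ == "__main__":
    print("sampled during training:", sample_architecture())
    print("selected for inference: ", search())
```

The point of the sketch is the search cost profile the abstract claims: candidate sub-models reuse one set of trained weights, so the extra work over ordinary training is only the per-step sampling and the final selection pass.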

Related research

01/13/2020
AdaBERT: Task-Adaptive BERT Compression with Differentiable Neural Architecture Search
Large pre-trained language models such as BERT have shown their effectiv...

10/08/2020
Evaluating the Effectiveness of Efficient Neural Architecture Search for Sentence-Pair Tasks
Neural Architecture Search (NAS) methods, which automatically learn enti...

02/12/2021
Optimizing Inference Performance of Transformers on CPUs
The Transformer architecture revolutionized the field of natural languag...

06/01/2023
Training-free Neural Architecture Search for RNNs and Transformers
Neural architecture search (NAS) has allowed for the automatic creation ...

10/20/2020
Optimal Subarchitecture Extraction For BERT
We extract an optimal subset of architectural parameters for the BERT ar...

09/22/2020
AutoRC: Improving BERT Based Relation Classification Models via Architecture Search
Although BERT based relation classification (RC) models have achieved si...

07/15/2021
AutoBERT-Zero: Evolving BERT Backbone from Scratch
Transformer-based pre-trained language models like BERT and its variants...
