Controlling Computation versus Quality for Neural Sequence Models

02/17/2020
by Ankur Bapna et al.

Most neural networks utilize the same amount of compute for every example, independent of the inherent complexity of the input. Further, existing methods that adapt computation to the example focus on finding a fixed inference-time computational graph per example, ignoring any external computational budgets or varying inference-time limitations. In this work, we utilize conditional computation to make neural sequence models (Transformer) more efficient and computation-aware during inference. We first modify the Transformer architecture, making each set of operations conditionally executable depending on the output of a learned control network. We then train this model in a multi-task setting, where each task corresponds to a particular computation budget. This allows us to train a single model that can be controlled to operate on different points of the computation-quality trade-off curve, depending on the available computation budget at inference time. We evaluate our approach on two tasks: (i) WMT English-French translation and (ii) unsupervised representation learning (BERT). Our experiments demonstrate that the proposed Conditional Computation Transformer (CCT) is competitive with vanilla Transformers when allowed to utilize its full computational budget, while improving significantly over computationally equivalent baselines when operating on smaller computational budgets.
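
To make the idea concrete, the sketch below shows one way a Transformer sub-layer could be gated by a learned control network and nudged toward a target computation budget. This is a minimal illustration under assumptions, not the authors' CCT implementation: the module names, the sigmoid gate on the feed-forward sub-layer, and the budget_loss penalty are choices made here for clarity.

```python
import torch
import torch.nn as nn


class ConditionalFFN(nn.Module):
    """Transformer feed-forward sub-layer gated by a small control network.

    The control network maps the mean-pooled layer input to a scalar gate in
    (0, 1); the sub-layer's residual contribution is scaled by that gate, so a
    gate near zero effectively skips the computation. Illustrative sketch only,
    not the paper's CCT architecture.
    """

    def __init__(self, d_model: int = 512, d_ff: int = 2048):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.ffn = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model)
        )
        # Control network: a tiny MLP emitting one gate value per example.
        self.control = nn.Sequential(
            nn.Linear(d_model, d_model // 4), nn.ReLU(), nn.Linear(d_model // 4, 1)
        )

    def forward(self, x: torch.Tensor):
        # x: (batch, seq_len, d_model)
        gate = torch.sigmoid(self.control(x.mean(dim=1)))      # (batch, 1)
        out = x + gate.unsqueeze(1) * self.ffn(self.norm(x))   # gated residual
        return out, gate


def budget_loss(gates: torch.Tensor, budget: float) -> torch.Tensor:
    """Auxiliary loss penalizing average gate activation above a target budget."""
    return torch.relu(gates.mean() - budget) ** 2
```

In a multi-task training loop of the kind the abstract describes, one would sample a computation budget per batch, add a weighted budget_loss over the collected gates to the task loss, and expose the sampled budget to the control networks, so that a single trained model can be dialed to different compute levels at inference time. Those training details are paraphrased from the abstract and filled in with assumptions.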
