A Survey on Efficient Training of Transformers

02/02/2023
by Bohan Zhuang, et al.

Recent advances in Transformers have come with enormous computational requirements, highlighting the importance of developing efficient training techniques that make Transformer training faster, cheaper, and more accurate through the efficient use of computation and memory resources. This survey provides the first systematic overview of efficient Transformer training, covering recent progress in acceleration arithmetic and hardware, with a focus on the former. We analyze and compare methods that save computation and memory costs for intermediate tensors during training, together with techniques for hardware/algorithm co-design. Finally, we discuss challenges and promising areas for future research.
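To make the kinds of techniques covered by the survey concrete, the sketch below (illustrative only, not taken from the paper) shows two widely used ways to cut the memory cost of intermediate tensors when training a Transformer in PyTorch: activation checkpointing, which recomputes activations during the backward pass instead of storing them, and fp16 autocast with gradient scaling. The model, shapes, and hyperparameters are hypothetical placeholders, and a CUDA GPU is assumed.

```python
# Illustrative sketch of memory-saving training techniques for a Transformer.
# Not the survey's code; module names and sizes are hypothetical.
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

class Block(nn.Module):
    """A minimal Transformer encoder block used only for illustration."""
    def __init__(self, dim=512, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
        self.norm2 = nn.LayerNorm(dim)

    def forward(self, x):
        a, _ = self.attn(x, x, x, need_weights=False)
        x = self.norm1(x + a)
        return self.norm2(x + self.mlp(x))

class TinyTransformer(nn.Module):
    def __init__(self, depth=4, dim=512):
        super().__init__()
        self.blocks = nn.ModuleList(Block(dim) for _ in range(depth))

    def forward(self, x):
        for blk in self.blocks:
            # Activation checkpointing: discard the block's intermediate
            # activations in the forward pass and recompute them in backward,
            # trading extra compute for lower activation memory.
            x = checkpoint(blk, x, use_reentrant=False)
        return x

model = TinyTransformer().cuda()
opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
scaler = torch.cuda.amp.GradScaler()  # loss scaling keeps fp16 gradients representable

x = torch.randn(8, 128, 512, device="cuda")
with torch.autocast(device_type="cuda", dtype=torch.float16):
    # Autocast keeps many intermediate tensors in fp16, roughly halving
    # activation memory and enabling faster tensor-core arithmetic.
    loss = model(x).pow(2).mean()  # dummy loss for illustration

scaler.scale(loss).backward()
scaler.step(opt)
scaler.update()
```

Both techniques are complementary: checkpointing targets the number of stored activations, while mixed precision shrinks the size of each tensor and speeds up the arithmetic itself.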


Related research

02/21/2022 - Survey on Large Scale Neural Network Training
Modern Deep Neural Networks (DNNs) require significant memory to store w...

07/07/2022 - Training Transformers Together
The infrastructure necessary for training state-of-the-art models is bec...

07/12/2023 - Transformers in Reinforcement Learning: A Survey
Transformers have significantly impacted domains like natural language p...

09/13/2022 - Vision Transformers for Action Recognition: A Survey
Vision transformers are emerging as a powerful tool to solve computer vi...

02/22/2021 - Position Information in Transformers: An Overview
Transformers are arguably the main workhorse in recent Natural Language ...

07/21/2020 - SliceOut: Training Transformers and CNNs faster while using less memory
We demonstrate 10-40 EfficientNets, and Transformer models, with minimal...

02/06/2023 - Computation vs. Communication Scaling for Future Transformers on Future Hardware
Scaling neural network models has delivered dramatic quality gains acros...
