TorchScale: Transformers at Scale

11/23/2022
by Shuming Ma, et al.

Large Transformers have achieved state-of-the-art performance across many tasks. Most open-source libraries for scaling Transformers focus on improving training or inference with better parallelization. In this work, we present TorchScale, an open-source toolkit that allows researchers and developers to scale up Transformers efficiently and effectively. TorchScale implements several modeling techniques that improve modeling generality and capability, as well as training stability and efficiency. Experimental results on language modeling and neural machine translation demonstrate that TorchScale can successfully scale Transformers to different sizes without tears. The library is available at https://aka.ms/torchscale.
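To make the abstract concrete, below is a minimal sketch of how one might instantiate a Transformer encoder with the toolkit. The module paths, class names, and configuration fields follow TorchScale's public repository rather than anything stated in this abstract, so treat them as assumptions about the current API, not a definitive usage guide.

```python
# Minimal sketch: building a Transformer encoder with TorchScale.
# Module/class names below are taken from the toolkit's public repository
# (torchscale.architecture) and are assumptions, not guarantees about the
# current API.
from torchscale.architecture.config import EncoderConfig
from torchscale.architecture.encoder import Encoder

# Create a config; fields left unspecified fall back to library defaults.
config = EncoderConfig(vocab_size=64000)

# Width and depth fields (e.g. encoder_embed_dim, encoder_layers --
# assumed names) are the knobs one would adjust to scale the model.
model = Encoder(config)
print(model)
```

Per the repository, the stability- and generality-oriented techniques mentioned above (such as DeepNorm) appear to be exposed as additional options on the same config object; the exact flag names should be checked against the library's documentation.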


Related research

Deep Transformers with Latent Depth (09/28/2020)
The Transformer model has achieved state-of-the-art performance in many ...

CytonMT: an Efficient Neural Machine Translation Open-source Toolkit Implemented in C++ (02/17/2018)
This paper presented an open-source neural machine translation toolkit n...

Data Movement Is All You Need: A Case Study on Optimizing Transformers (06/30/2020)
Transformers have become widely used for language modeling and sequence ...

fairseq: A Fast, Extensible Toolkit for Sequence Modeling (04/01/2019)
fairseq is an open-source sequence modeling toolkit that allows research...

Sockeye: A Toolkit for Neural Machine Translation (12/15/2017)
We describe Sockeye (version 1.12), an open-source sequence-to-sequence ...

Binarized Neural Machine Translation (02/09/2023)
The rapid scaling of language models is motivating research using low-bi...

Accessing Higher-level Representations in Sequential Transformers with Feedback Memory (02/21/2020)
Transformers are feedforward networks that can process input tokens in p...
