LightSeq: A High Performance Inference Library for Transformers

10/23/2020
by Xiaohui Wang, et al.

Transformer, BERT, and their variants have achieved great success in natural language processing. Since Transformer models are huge in size, serving them is a challenge for real industrial applications. In this paper, we propose LightSeq, a highly efficient inference library for models in the Transformer family. LightSeq includes a series of GPU optimization techniques to streamline the computation of neural layers and to reduce memory footprint. LightSeq can easily import models trained with PyTorch and TensorFlow. Experimental results on machine translation benchmarks show that LightSeq achieves up to 14x speedup over TensorFlow and 1.4x over FasterTransformer, a concurrent CUDA implementation. The code is available at https://github.com/bytedance/lightseq.
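One GPU optimization of the kind the abstract describes is kernel fusion: merging adjacent element-wise operations into a single kernel so intermediate tensors never round-trip through global memory. The sketch below illustrates the idea in plain Python (not LightSeq's actual CUDA code): an unfused bias-add followed by ReLU materializes a temporary buffer, while the fused version computes both ops in one pass. The function names are illustrative, not from the library.

```python
# Illustrative sketch of kernel fusion, assuming a bias-add + ReLU pair.
# This is plain Python for clarity, not LightSeq's CUDA implementation.

def bias_relu_unfused(x, bias):
    """Two 'kernels': the intermediate list `t` plays the role of a
    temporary tensor written to and read back from GPU global memory."""
    t = [v + bias for v in x]           # kernel 1: bias add
    return [max(v, 0.0) for v in t]     # kernel 2: ReLU

def bias_relu_fused(x, bias):
    """One fused 'kernel': each element stays in a register between the
    two ops, so no intermediate buffer is allocated or re-read."""
    return [max(v + bias, 0.0) for v in x]

x = [-2.0, -1.0, 0.0, 1.0, 2.0]
assert bias_relu_unfused(x, 0.5) == bias_relu_fused(x, 0.5)
print(bias_relu_fused(x, 0.5))  # [0.0, 0.0, 0.5, 1.5, 2.5]
```

On a GPU the fused variant roughly halves the memory traffic for this pair of ops and saves one kernel launch, which is where much of the reported speedup in libraries like LightSeq and FasterTransformer comes from.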


Related research

05/28/2020  HAT: Hardware-Aware Transformers for Efficient Natural Language Processing
Transformers are ubiquitous in Natural Language Processing (NLP) tasks, ...

06/28/2023  An Efficient Sparse Inference Software Accelerator for Transformer-based Language Models on CPUs
In recent years, Transformer-based language models have become the stand...

05/11/2023  EfficientViT: Memory Efficient Vision Transformer with Cascaded Group Attention
Vision transformers have shown great success due to their high model cap...

06/17/2021  DeepLab2: A TensorFlow Library for Deep Labeling
DeepLab2 is a TensorFlow library for deep labeling, aiming to provide a ...

10/12/2021  LightSeq2: Accelerated Training for Transformer-based Models on GPUs
Transformer-based neural models are used in many AI applications. Traini...

10/06/2022  ByteTransformer: A High-Performance Transformer Boosted for Variable-Length Inputs
Transformer is the cornerstone model of Natural Language Processing (NLP...

06/08/2021  FastSeq: Make Sequence Generation Faster
Transformer-based models have made tremendous impacts in natural languag...
