NN-LUT: Neural Approximation of Non-Linear Operations for Efficient Transformer Inference

12/03/2021
by Joonsang Yu, et al.

Non-linear operations such as GELU, layer normalization, and Softmax are essential yet costly building blocks of Transformer models. Several prior works simplified these operations with look-up tables or integer computations, but such approximations suffer from inferior accuracy or incur considerable hardware cost and long latency. This paper proposes an accurate and hardware-friendly approximation framework for efficient Transformer inference. The framework employs a simple neural network as a universal approximator, with its structure equivalently transformed into a look-up table (LUT). The proposed framework, called NN-LUT, can accurately replace all the non-linear operations in popular BERT models with significant reductions in area, power consumption, and latency.
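The underlying idea can be illustrated with a minimal sketch. The snippet below is an assumption-laden illustration, not the paper's implementation: it approximates GELU with a uniformly segmented piecewise-linear table of (slope, intercept) pairs, which is the kind of structure a single-hidden-layer ReLU network collapses to (such a network is itself piecewise linear). The range [-4, 4], the segment count, and names like `gelu_lut` are choices made here for illustration.

```python
import math

def gelu(x):
    # Exact GELU via the error function
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

# Build a piecewise-linear LUT over [-4, 4]; outside that range
# GELU is nearly 0 (far left) or nearly the identity (far right).
LO, HI, SEGMENTS = -4.0, 4.0, 32
STEP = (HI - LO) / SEGMENTS

# Each entry stores (slope, intercept) for one segment, so a
# lookup is one index computation plus one multiply-add.
LUT = []
for i in range(SEGMENTS):
    x0, x1 = LO + i * STEP, LO + (i + 1) * STEP
    slope = (gelu(x1) - gelu(x0)) / (x1 - x0)
    intercept = gelu(x0) - slope * x0
    LUT.append((slope, intercept))

def gelu_lut(x):
    if x <= LO:
        return 0.0   # saturate: GELU(x) ~ 0 for very negative x
    if x >= HI:
        return x     # saturate: GELU(x) ~ x for large x
    i = min(int((x - LO) / STEP), SEGMENTS - 1)
    s, b = LUT[i]
    return s * x + b

max_err = max(abs(gelu_lut(t / 100) - gelu(t / 100))
              for t in range(-600, 601))
print(f"max abs error on [-6, 6]: {max_err:.5f}")
```

Even this naive uniform segmentation keeps the maximum error below 0.01 with only 32 entries; the paper's contribution is learning the approximation with a network rather than hand-placing breakpoints, so that one recipe covers GELU, layer normalization, and Softmax alike.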


Related research

08/26/2023 · An Efficient FPGA-Based Accelerator for Swin Transformer
Since introduced, Swin Transformer has achieved remarkable results in th...

01/05/2021 · I-BERT: Integer-only BERT Quantization
Transformer based models, like BERT and RoBERTa, have achieved state-of-...

02/27/2023 · Full Stack Optimization of Transformer Inference: a Survey
Recent advances in state-of-the-art DNN architecture design have been mo...

05/25/2019 · LUTNet: speeding up deep neural network inferencing via look-up tables
We consider the use of look-up tables (LUT) to speed up and simplify the...

08/19/2023 · East: Efficient and Accurate Secure Transformer Framework for Inference
Transformer has been successfully used in practical applications, such a...

02/07/2023 · LUT-NN: Towards Unified Neural Network Inference by Table Lookup
DNN inference requires huge effort of system development and resource co...

07/05/2021 · Popcorn: Paillier Meets Compression For Efficient Oblivious Neural Network Inference
Oblivious inference enables the cloud to provide neural network inferenc...
