A^3: Accelerating Attention Mechanisms in Neural Networks with Approximation

02/22/2020
by Tae Jun Ham, et al.

With the increasing computational demands of neural networks, many hardware accelerators for neural networks have been proposed. Such accelerators often focus on popular network types such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs); however, little attention has been paid to attention mechanisms, an emerging neural network primitive that enables networks to retrieve the most relevant information from a knowledge base, external memory, or past states. The attention mechanism is widely adopted by many state-of-the-art neural networks for computer vision, natural language processing, and machine translation, and accounts for a large portion of total execution time. We observe that today's practice of implementing this mechanism with matrix-vector multiplication is suboptimal: the attention mechanism is semantically a content-based search, so a large portion of the computation ends up not being used. Based on this observation, we design and architect A^3, which accelerates attention mechanisms in neural networks with algorithmic approximation and hardware specialization. Our proposed accelerator achieves multiple orders of magnitude improvement in energy efficiency (performance/watt) as well as substantial speedup over state-of-the-art conventional hardware.
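The "content-based search" observation can be illustrated with a minimal sketch (not the paper's actual algorithm): exact dot-product attention scores every key, yet the softmax assigns near-zero weight to most of them, so an approximation that keeps only the top-k highest-scoring keys skips most of the weighted-sum work. Function names and the choice of k below are illustrative assumptions.

```python
import numpy as np

def attention(query, keys, values):
    # Exact dot-product attention: score every key, softmax, weighted sum.
    scores = keys @ query                      # (n,) similarity scores
    weights = np.exp(scores - scores.max())    # numerically stable softmax
    weights /= weights.sum()
    return weights @ values                    # (d,) output vector

def topk_attention(query, keys, values, k):
    # Approximate attention: keep only the k highest-scoring keys.
    # Softmax weights for low-scoring keys are near zero, so their
    # contribution to the weighted sum is largely wasted computation.
    scores = keys @ query
    idx = np.argpartition(scores, -k)[-k:]     # indices of the top-k scores
    w = np.exp(scores[idx] - scores[idx].max())
    w /= w.sum()
    return w @ values[idx]                     # sum over only k rows, not n

rng = np.random.default_rng(0)
keys = rng.normal(size=(1024, 64))
values = rng.normal(size=(1024, 64))
query = rng.normal(size=64)

exact = attention(query, keys, values)
approx = topk_attention(query, keys, values, k=32)
print(exact.shape, approx.shape)
```

When the score distribution is peaked (as it typically is for trained attention layers), the top-k result closely tracks the exact output while touching only k of the n value rows; with k = n the two coincide exactly.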


