DeepAI
Log In Sign Up

A^3: Accelerating Attention Mechanisms in Neural Networks with Approximation

02/22/2020
by   Tae Jun Ham, et al.
0

With the increasing computational demands of neural networks, many hardware accelerators for the neural networks have been proposed. Such existing neural network accelerators often focus on popular neural network types such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs); however, not much attention has been paid to attention mechanisms, an emerging neural network primitive that enables neural networks to retrieve most relevant information from a knowledge-base, external memory, or past states. The attention mechanism is widely adopted by many state-of-the-art neural networks for computer vision, natural language processing, and machine translation, and accounts for a large portion of total execution time. We observe today's practice of implementing this mechanism using matrix-vector multiplication is suboptimal as the attention mechanism is semantically a content-based search where a large portion of computations ends up not being used. Based on this observation, we design and architect A3, which accelerates attention mechanisms in neural networks with algorithmic approximation and hardware specialization. Our proposed accelerator achieves multiple orders of magnitude improvement in energy efficiency (performance/watt) as well as substantial speedup over the state-of-the-art conventional hardware.

READ FULL TEXT

page 2

page 3

page 5

page 7

page 8

page 11

page 13

page 14

06/25/2019

A Winograd-based Integrated Photonics Accelerator for Convolutional Neural Networks

Neural Networks (NNs) have become the mainstream technology in the artif...
02/15/2022

HiMA: A Fast and Scalable History-based Memory Access Engine for Differentiable Neural Computer

Memory-augmented neural networks (MANNs) provide better inference perfor...
09/18/2020

Hardware Accelerator for Multi-Head Attention and Position-Wise Feed-Forward in the Transformer

Designing hardware accelerators for deep neural networks (DNNs) has been...
09/20/2022

Adaptable Butterfly Accelerator for Attention-based NNs via Hardware and Algorithm Co-design

Attention-based neural networks have become pervasive in many AI tasks. ...
11/21/2021

Efficient Softmax Approximation for Deep Neural Networks with Attention Mechanism

There has been a rapid advance of custom hardware (HW) for accelerating ...
07/23/2018

PCNNA: A Photonic Convolutional Neural Network Accelerator

Convolutional Neural Networks (CNN) have been the centerpiece of many ap...
12/20/2016

Exploring Different Dimensions of Attention for Uncertainty Detection

Neural networks with attention have proven effective for many natural la...