DOTA: A Dynamically-Operated Photonic Tensor Core for Energy-Efficient Transformer Accelerator

by Hanqing Zhu et al.

The wide adoption and significant computing resource consumption of attention-based Transformers, e.g., Vision Transformers and large language models, have driven demand for efficient hardware accelerators. While electronic accelerators are commonly used, there is growing interest in photonics as an alternative technology due to its high energy efficiency and ultra-fast processing speed. Optical neural networks (ONNs) have demonstrated promising results on convolutional neural network (CNN) workloads, which require only weight-static linear operations. However, they fail to efficiently support Transformer architectures with attention operations because they cannot process dynamic, full-range tensor multiplications. In this work, we propose DOTA, a customized high-performance and energy-efficient photonic Transformer accelerator. To overcome this fundamental limitation of existing ONNs, we introduce a novel photonic tensor core, consisting of a crossbar array of interference-based optical vector dot-product engines, that supports highly parallel, dynamic, and full-range matrix-matrix multiplication. Our comprehensive evaluation demonstrates that DOTA achieves a >4x energy and a >10x latency reduction compared to prior photonic accelerators, and delivers over 20x energy reduction and 2 to 3 orders of magnitude lower latency compared to electronic Transformer accelerators. Our work highlights the immense potential of photonic computing for efficient hardware accelerators, particularly for advanced machine learning workloads.
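The "full-range" (signed) dot product that interference-based engines provide can be sketched numerically. The toy model below is an illustrative assumption, not the paper's implementation: it encodes operand signs in optical phase (0 or pi), interferes each operand pair in a 50/50 coupler, and uses balanced photodetection, where the intensity difference |x+w|^2/2 - |x-w|^2/2 = 2xw recovers the signed product before photocurrents are accumulated.

```python
import numpy as np

def interference_dot(x, w):
    """Toy model of an interference-based optical dot-product engine.

    Operands are encoded as field amplitudes (sign carried by phase).
    A 50/50 coupler produces intensities |x+w|^2/2 and |x-w|^2/2;
    their balanced-detection difference is 2*x*w per element, and
    summing photocurrents accumulates the dot product.
    """
    i_plus = 0.5 * (x + w) ** 2   # intensity at the "+" output port
    i_minus = 0.5 * (x - w) ** 2  # intensity at the "-" output port
    return np.sum(i_plus - i_minus) / 2.0

# A crossbar of such engines yields matrix-matrix multiplication:
# each (row, column) pair drives one dot-product engine in parallel.
x = np.array([0.5, -0.3, 0.8])
w = np.array([-0.2, 0.7, 0.4])
```

Note that both operands enter symmetrically, so neither needs to be a pre-programmed static weight; this is what allows dynamic operands such as the query-key products in attention.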




Related papers:
- ROBIN: A Robust Optical Binary Neural Network Accelerator
- TMA: Tera-MACs/W Neural Hardware Inference Accelerator with a Multiplier-less Massive Parallel Processor
- X-Former: In-Memory Acceleration of Transformers
- Silicon Photonic Microring Based Chip-Scale Accelerator for Delayed Feedback Reservoir Computing
- Evaluating Spatial Accelerator Architectures with Tiled Matrix-Matrix Multiplication
- Hardware Accelerator for Multi-Head Attention and Position-Wise Feed-Forward in the Transformer
- Optical Transformers
