SCONNA: A Stochastic Computing Based Optical Accelerator for Ultra-Fast, Energy-Efficient Inference of Integer-Quantized CNNs

02/14/2023
by   Sairam Sri Vatsavai, et al.

The acceleration of a CNN inference task uses convolution operations that are typically transformed into vector-dot-product (VDP) operations. Several photonic microring resonator (MRR) based hardware architectures have been proposed to accelerate integer-quantized CNNs with remarkably higher throughput and energy efficiency than their electronic counterparts. However, the existing photonic MRR-based analog accelerators exhibit a very strong trade-off between the achievable input/weight precision and the VDP operation size, which severely restricts their achievable VDP operation size for input/weight precisions of 4 bits and higher. The restricted VDP operation size ultimately suppresses computing throughput, severely diminishing the achievable performance benefits. To address this shortcoming, we for the first time present a merger of stochastic computing and MRR-based CNN accelerators. To leverage the innate precision flexibility of stochastic computing, we invent an MRR-based optical stochastic multiplier (OSM). We employ multiple OSMs in a cascaded manner using dense wavelength division multiplexing to forge a novel Stochastic Computing based Optical Neural Network Accelerator (SCONNA). SCONNA achieves significantly high throughput and energy efficiency for accelerating inferences of high-precision quantized CNNs. Our evaluation of the inference of four modern CNNs at 8-bit input/weight precision indicates that SCONNA provides improvements of up to 66.5x, 90x, and 91x in frames-per-second (FPS), FPS/W, and FPS/W/mm2, respectively, on average over two photonic MRR-based analog CNN accelerators from prior work, with a Top-1 accuracy drop of only up to 0.4%. We also developed a transaction-level, event-driven Python-based simulator for the evaluation of SCONNA and other accelerators (https://github.com/uky-UCAT/SC_ONN_SIM.git).
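For readers unfamiliar with stochastic computing, the key property the abstract leverages is that multiplication becomes a trivial bitwise operation on probability-encoded bitstreams, and precision is set simply by the bitstream length rather than by analog hardware resolution. The sketch below is a minimal software illustration of unipolar stochastic multiplication (bitwise AND of independent Bernoulli bitstreams); it is not the paper's optical OSM design, and the function and parameter names are our own for illustration.

```python
import random

def to_bitstream(value, length, rng):
    # Unipolar stochastic encoding: each bit is 1 with probability `value`,
    # so the stream's average approximates `value` (value must lie in [0, 1]).
    return [1 if rng.random() < value else 0 for _ in range(length)]

def stochastic_multiply(a, b, length=4096, seed=0):
    # For independent unipolar streams, P(x AND y) = P(x) * P(y),
    # so a bitwise AND followed by a popcount estimates the product a * b.
    rng = random.Random(seed)
    stream_a = to_bitstream(a, length, rng)
    stream_b = to_bitstream(b, length, rng)
    ones = sum(x & y for x, y in zip(stream_a, stream_b))
    return ones / length

# Longer bitstreams give higher effective precision -- the flexibility
# the abstract refers to -- at the cost of more bits per operation.
product = stochastic_multiply(0.5, 0.25)  # approximately 0.125
```

In SCONNA this multiplication is realized optically with microring resonators and parallelized across wavelengths via dense wavelength division multiplexing, rather than in software as shown here.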

