PSCNN: A 885.86 TOPS/W Programmable SRAM-based Computing-In-Memory Processor for Keyword Spotting

05/02/2022
by Shu-Hung Kuo, et al.

Computing-in-memory (CIM) has attracted significant attention in recent years due to its massive parallelism and low power consumption. However, current CIM designs suffer from the large area overhead of small CIM macros and poor programmability for model execution. This paper proposes a programmable CIM processor with a single large-sized CIM macro, instead of multiple smaller ones, for power-efficient computation, and a flexible instruction set that easily supports various binary 1-D convolutional neural network (CNN) models. Furthermore, the proposed architecture adopts a pooling write-back method to support fused or independent convolution/pooling operations, reducing latency by 35.9%, and a flexible ping-pong feature SRAM to fit different feature-map sizes during layer-by-layer execution. The design, fabricated in TSMC 28nm technology, achieves 150.8 GOPS throughput and 885.86 TOPS/W power efficiency at 10 MHz when executing our binary keyword spotting model, offering higher power efficiency and flexibility than previous designs.
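The fused convolution/pooling idea behind the pooling write-back method can be sketched as follows. This is a minimal software illustration, not the paper's hardware design: the function name and shapes are hypothetical, and the binary arithmetic (XNOR-popcount in a CIM macro) is modeled here as a dot product over ±1 values. The key point is that each convolution result is folded into its pooling bin as it is written back, so the full convolution map is never materialized.

```python
import numpy as np

def binary_conv1d_fused_pool(x, w, pool=2):
    """Binary 1-D convolution with max pooling fused into the write-back.

    x: (C_in, T) input activations in {-1, +1}
    w: (C_out, C_in, K) binary weights in {-1, +1}
    pool: max-pooling window/stride along the time axis
    """
    C_out, C_in, K = w.shape
    T_out = x.shape[1] - K + 1
    T_eff = (T_out // pool) * pool  # truncate to a whole number of pool bins
    out = np.full((C_out, T_eff // pool), -np.inf)
    for co in range(C_out):
        for t in range(T_eff):
            # XNOR-popcount modeled as a +/-1 dot product
            acc = int(np.sum(w[co] * x[:, t:t + K]))
            # pooling write-back: fold each conv result into its pool bin
            # instead of storing the full convolution output first
            out[co, t // pool] = max(out[co, t // pool], acc)
    return out
```

Because pooling happens at write-back time, the intermediate feature map needs no separate storage pass, which is the source of the latency reduction the abstract reports for fused operation.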


