5 Parallel Prism: A topology for pipelined implementations of convolutional neural networks using computational memory

06/08/2019, by Martino Dazzi, et al.

In-memory computing is an emerging computing paradigm that could enable deep-learning inference at significantly higher energy efficiency and lower latency. The essential idea is to map the synaptic weights of each layer to one or more computational memory (CM) cores. During inference, these cores perform the associated matrix-vector multiply operations in place with O(1) time complexity, obviating the need to move the synaptic weights to a separate processing unit. Moreover, this architecture could enable the execution of these networks in a highly pipelined fashion. A key challenge, however, is to design an efficient communication fabric for the CM cores. Here, we present one such communication fabric based on a graph topology that is well suited to the widely successful convolutional neural networks (CNNs). We show that this communication fabric facilitates the pipelined execution of all state-of-the-art CNNs by proving the existence of a homomorphism between a graph representation of these networks and the proposed graph topology. We then present a quantitative comparison with established communication topologies and show that our proposed topology achieves the lowest bandwidth requirements per communication channel. Finally, we present a concrete example of mapping ResNet-32 onto an array of CM cores.
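The core idea of the abstract, mapping each layer's stationary weights to a CM core that performs its matrix-vector multiply in place, can be sketched as follows. This is a minimal, hypothetical simulation in NumPy (the `CMCore` class and `pipelined_inference` function are illustrative names, not from the paper); a real CM core would perform the multiply in the analog domain with O(1) time complexity rather than in software.

```python
import numpy as np

class CMCore:
    """Hypothetical model of a computational memory core.

    The weight matrix is "stationary": it is programmed into the core
    once and never moved to a separate processing unit.
    """
    def __init__(self, weights):
        self.weights = weights

    def matvec(self, x):
        # Simulates the in-place matrix-vector multiply of a CM core.
        return self.weights @ x

def pipelined_inference(cores, inputs):
    """Feed a stream of input vectors through a chain of CM cores.

    In a real pipelined implementation, every core processes a
    different input simultaneously; this sketch only models the
    dataflow order, not the concurrency.
    """
    outputs = []
    for x in inputs:
        for core in cores:
            x = np.maximum(core.matvec(x), 0)  # matvec followed by ReLU
        outputs.append(x)
    return outputs

# Toy example: three layers mapped to three CM cores, fed a stream
# of five input vectors.
rng = np.random.default_rng(0)
cores = [CMCore(rng.standard_normal((4, 4))) for _ in range(3)]
stream = [rng.standard_normal(4) for _ in range(5)]
results = pipelined_inference(cores, stream)
```

The communication fabric proposed in the paper addresses what this sketch glosses over: how the output of one core is routed to the next so that all cores can stay busy on different inputs at once.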


