VW-SDK: Efficient Convolutional Weight Mapping Using Variable Windows for Processing-In-Memory Architectures

12/21/2021
by Johnny Rhe, et al.

With their high energy efficiency, processing-in-memory (PIM) arrays are increasingly used for convolutional neural network (CNN) inference. In PIM-based CNN inference, the computational latency and energy depend on how the CNN weights are mapped to the PIM array. A recent study proposed shifted and duplicated kernel (SDK) mapping, which reuses the input feature maps in units of a parallel window; each parallel window is convolved with duplicated kernels to obtain multiple output elements in parallel. However, the existing SDK-based mapping algorithm does not always yield the minimum number of computing cycles, because it only maps a square parallel window that spans all channels. In this paper, we introduce a novel mapping algorithm called variable-window SDK (VW-SDK), which adaptively determines the shape of the parallel window that leads to the minimum number of computing cycles for a given convolutional layer and PIM array. By allowing rectangular windows with partial channels, VW-SDK utilizes the PIM array more efficiently, thereby further reducing the number of computing cycles. A simulation with a 512x512 PIM array and ResNet-18 shows that VW-SDK improves the inference speed by 1.69x compared to the existing SDK-based algorithm.
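To make the idea of searching over window shapes concrete, the following is a minimal sketch of how a cycle count could be estimated for a given parallel window. It is not the authors' implementation. The cost model is an assumption made for illustration: unit stride, no padding, a square input and kernel, one array row per input element of the window, and one array column per output element. The names `cycles` and `best_window` are hypothetical.

```python
from math import ceil


def cycles(image, kernel, in_ch, out_ch, rows, cols, pw_w, pw_h):
    """Estimate computing cycles for a pw_w x pw_h parallel window.

    Assumed model (not taken from the paper): stride 1, no padding,
    square image and kernel. Returns None if the window does not fit.
    """
    if pw_w < kernel or pw_h < kernel or pw_w > image or pw_h > image:
        return None
    # Outputs produced by one window, per output channel.
    out_w = pw_w - kernel + 1
    out_h = pw_h - kernel + 1
    # Input channels that fit in the array rows (partial channels allowed).
    ic_t = min(in_ch, rows // (pw_w * pw_h))
    # Output channels that fit in the array columns.
    oc_t = min(out_ch, cols // (out_w * out_h))
    if ic_t == 0 or oc_t == 0:
        return None
    # Number of window positions needed to cover the input feature map.
    n_w = ceil((image - pw_w) / out_w) + 1
    n_h = ceil((image - pw_h) / out_h) + 1
    return n_w * n_h * ceil(in_ch / ic_t) * ceil(out_ch / oc_t)


def best_window(image, kernel, in_ch, out_ch, rows, cols):
    """Exhaustively search rectangular windows; return (cycles, pw_w, pw_h)."""
    best = None
    for pw_h in range(kernel, image + 1):
        for pw_w in range(kernel, image + 1):
            c = cycles(image, kernel, in_ch, out_ch, rows, cols, pw_w, pw_h)
            if c is not None and (best is None or c < best[0]):
                best = (c, pw_w, pw_h)
    return best


if __name__ == "__main__":
    # Hypothetical layer: 14x14 input, 3x3 kernel, 256 in/out channels.
    print(cycles(14, 3, 256, 256, 512, 512, 3, 3))
    print(cycles(14, 3, 256, 256, 512, 512, 4, 3))
    print(best_window(14, 3, 256, 256, 512, 512))
```

Under these assumptions, a window equal to the kernel (3x3) costs 720 cycles for this hypothetical layer on a 512x512 array, while a rectangular 4x3 window costs 504, which illustrates why a non-square window with partial channels can use the array more efficiently. The exhaustive search is chosen for clarity rather than speed.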


