A SOT-MRAM-based Processing-In-Memory Engine for Highly Compressed DNN Implementation

11/24/2019
by Geng Yuan, et al.

The computation-wall and data-movement challenges of deep neural networks (DNNs) have exposed the limitations of conventional CMOS-based DNN accelerators. Moreover, deep structures and large model sizes make DNNs prohibitive for embedded systems and IoT devices, where low power consumption is required. To address these challenges, spin-orbit torque magnetic random-access memory (SOT-MRAM) and SOT-MRAM-based Processing-In-Memory (PIM) engines have been used to reduce the power consumption of DNNs, since SOT-MRAM offers near-zero standby power, high density, and non-volatility. However, drawbacks of SOT-MRAM-based PIM engines, such as high write latency and the requirement for low bit-width data, limit their appeal as energy-efficient DNN accelerators. To mitigate these drawbacks, we propose an ultra-energy-efficient framework that applies model-compression techniques, including weight pruning and quantization, at the software level while accounting for the SOT-MRAM PIM architecture. We incorporate the alternating direction method of multipliers (ADMM) into the training phase to further guarantee solution feasibility and satisfy the SOT-MRAM hardware constraints. The footprint and power consumption of the SOT-MRAM PIM are thus reduced while overall system throughput increases, making our proposed ADMM-based SOT-MRAM PIM more energy-efficient and suitable for embedded systems and IoT devices. Our experimental results show that the accuracy and compression rate of the proposed framework consistently outperform the reference works, while the efficiency (area and power) and throughput of the SOT-MRAM PIM engine are significantly improved.
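The ADMM-based pruning mentioned in the abstract alternates between an unconstrained gradient step on the training loss (augmented with a quadratic penalty) and a Euclidean projection of the weights onto the sparsity constraint set. The sketch below illustrates this pattern on a toy quadratic loss; `admm_prune` and `project_sparsity` are illustrative names and the quadratic surrogate loss is an assumption, not the paper's actual training objective.

```python
import numpy as np

def project_sparsity(M, k):
    """Euclidean projection onto {at most k nonzeros}: keep the k
    largest-magnitude entries and zero out the rest."""
    Z = np.zeros_like(M)
    idx = np.argsort(np.abs(M), axis=None)[-k:]
    Z.flat[idx] = M.flat[idx]
    return Z

def admm_prune(W0, k, rho=1.0, lr=0.1, iters=200):
    """Toy ADMM loop: minimize ||W - W0||^2 (a stand-in for the real
    training loss) subject to W having at most k nonzero entries."""
    W = W0.copy()
    Z = project_sparsity(W, k)   # auxiliary variable on the constraint set
    U = np.zeros_like(W)         # scaled dual variable
    for _ in range(iters):
        # W-update: gradient step on loss + (rho/2) * ||W - Z + U||^2
        grad = 2.0 * (W - W0) + rho * (W - Z + U)
        W -= lr * grad
        # Z-update: project W + U back onto the sparsity constraint set
        Z = project_sparsity(W + U, k)
        # dual update accumulates the constraint violation
        U += W - Z
    # final hard projection guarantees the constraint is met exactly
    return project_sparsity(W, k)

rng = np.random.default_rng(0)
W0 = rng.normal(size=(8, 8))
W = admm_prune(W0, k=16)          # prune 75% of the 64 weights
print(np.count_nonzero(W))        # 16
```

In the actual framework, the W-update would be an SGD step on the DNN loss, and an analogous projection step enforces the quantization constraint dictated by the SOT-MRAM PIM's low bit-width cells.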
