VWR2A: A Very-Wide-Register Reconfigurable-Array Architecture for Low-Power Embedded Devices

04/11/2022
by Benoît Walter Denkinger, et al.

Edge computing requires high-performance, energy-efficient embedded systems. Fixed-function or custom accelerators, such as FFT or FIR filter engines, are very efficient at implementing a particular functionality for a given set of constraints. However, they are inflexible when facing application-wide optimizations or functionality upgrades. Conversely, programmable cores offer higher flexibility, but often with a penalty in area, performance, and, above all, energy consumption. In this paper, we propose VWR2A, an architecture that integrates high computational density and low-power memory structures (i.e., very-wide registers and scratchpad memories). VWR2A narrows the energy gap with respect to an FFT accelerator while delivering similar or better performance on FFT kernels. Moreover, VWR2A's flexibility allows it to accelerate multiple kernels, resulting in significant energy savings at the application level.


research
06/24/2021

A Construction Kit for Efficient Low Power Neural Network Accelerator Designs

Implementing embedded neural network processing at the edge requires eff...
research
01/15/2019

Proceedings of the Workshop on High Performance Energy Efficient Embedded Systems (HIP3ES) 2019

Proceedings of the Workshop on High Performance Energy Efficient Embedde...
research
01/10/2018

Proceedings of the Workshop on High Performance Energy Efficient Embedded Systems (HIP3ES) 2018

Proceedings of the Workshop on High Performance Energy Efficient Embedde...
research
08/29/2019

A Machine Learning Accelerator In-Memory for Energy Harvesting

There is increasing demand to bring machine learning capabilities to low...
research
11/24/2019

A SOT-MRAM-based Processing-In-Memory Engine for Highly Compressed DNN Implementation

The computing wall and data movement challenges of deep neural networks ...
research
10/31/2019

Direct N-body application on low-power and energy-efficient parallel architectures

The aim of this work is to quantitatively evaluate the impact of computa...
research
10/19/2019

ELSA: A Throughput-Optimized Design of an LSTM Accelerator for Energy-Constrained Devices

The next significant step in the evolution and proliferation of artifici...
