Marsellus: A Heterogeneous RISC-V AI-IoT End-Node SoC with 2-to-8b DNN Acceleration and 30

05/15/2023
by Francesco Conti, et al.

Emerging Artificial Intelligence-enabled Internet-of-Things (AI-IoT) Systems-on-a-Chip (SoCs) for augmented reality, personalized healthcare, and nano-robotics need to run many diverse tasks within a power envelope of a few tens of mW over a wide range of operating conditions: compute-intensive but strongly quantized Deep Neural Network (DNN) inference, as well as signal processing and control requiring high-precision floating-point. We present Marsellus, an all-digital heterogeneous SoC for AI-IoT end-nodes fabricated in GlobalFoundries 22nm FDX that combines 1) a general-purpose cluster of 16 RISC-V Digital Signal Processing (DSP) cores attuned for the execution of a diverse range of workloads, exploiting 4-bit and 2-bit arithmetic extensions (XpulpNN) combined with fused MAC&LOAD operations and floating-point support; 2) a 2-to-8-bit Reconfigurable Binary Engine (RBE) to accelerate 3x3 and 1x1 (pointwise) convolutions in DNNs; 3) a set of On-Chip Monitoring (OCM) blocks connected to an Adaptive Body Biasing (ABB) generator and a hardware control loop, enabling on-the-fly adaptation of transistor threshold voltages. Marsellus achieves up to 180 Gop/s or 3.32 Top/s/W on 2-bit precision arithmetic in software, and up to 637 Gop/s or 12.4 Top/s/W on hardware-accelerated DNN layers.
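To make the idea of a binary engine serving 2-to-8-bit operands more concrete, the following is a minimal Python sketch of bit-serial multiply-accumulate: each operand vector is split into bit-planes, and the dot product is rebuilt from AND-plus-popcount results weighted by powers of two. This is a generic technique shown for illustration only; the abstract does not describe the RBE's internal datapath, and the names `bitplanes` and `bitserial_dot` are hypothetical. The sketch assumes unsigned operands.

```python
def bitplanes(values, bits):
    """Split unsigned integers into `bits` bit-planes.

    Plane b is a Python int whose bit `idx` holds bit b of values[idx].
    """
    planes = []
    for b in range(bits):
        plane = 0
        for idx, v in enumerate(values):
            plane |= ((v >> b) & 1) << idx
        planes.append(plane)
    return planes


def bitserial_dot(weights, activations, w_bits, a_bits):
    """Dot product of unsigned vectors using only AND, popcount and shifts.

    Assumption: illustrative only, not the accelerator's actual datapath.
    """
    w_planes = bitplanes(weights, w_bits)
    a_planes = bitplanes(activations, a_bits)
    acc = 0
    for i, wp in enumerate(w_planes):
        for j, ap in enumerate(a_planes):
            # Each pair of planes contributes a 1-bit dot product,
            # scaled by the combined significance 2^(i + j).
            acc += bin(wp & ap).count("1") << (i + j)
    return acc


# 2-bit weights and 3-bit activations (values chosen for illustration).
weights = [3, 1, 0, 2]
activations = [5, 7, 2, 1]
reference = sum(w * a for w, a in zip(weights, activations))
assert bitserial_dot(weights, activations, w_bits=2, a_bits=3) == reference
```

The number of binary passes grows as `w_bits * a_bits`, which gives one intuition for why lower-precision layers can reach higher throughput on such an engine.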
