DeepAI AI Chat
Log In Sign Up

Enabling Mixed-Precision Quantized Neural Networks in Extreme-Edge Devices

by   Nazareno Bruschi, et al.

The deployment of Quantized Neural Networks (QNN) on advanced microcontrollers requires optimized software to exploit digital signal processing (DSP) extensions of modern instruction set architectures (ISA). As such, recent research proposed optimized libraries for QNNs (from 8-bit to 2-bit) such as CMSIS-NN and PULP-NN. This work presents an extension to the PULP-NN library targeting the acceleration of mixed-precision Deep Neural Networks, an emerging paradigm able to significantly shrink the memory footprint of deep neural networks with negligible accuracy loss. The library, composed of 27 kernels, one for each permutation of input feature maps, weights, and output feature maps precision (considering 8-bit, 4-bit and 2-bit), enables efficient inference of QNN on parallel ultra-low-power (PULP) clusters of RISC-V based processors, featuring the RV32IMCXpulpV2 ISA. The proposed solution, benchmarked on an 8-cores GAP-8 PULP cluster, reaches peak performance of 16 MACs/cycle on 8 cores, performing 21x to 25x faster than an STM32H7 (powered by an ARM Cortex M7 processor) with 15x to 21x better energy efficiency.


page 1

page 2

page 3

page 4


PULP-NN: Accelerating Quantized Neural Networks on Parallel Ultra-Low-Power RISC-V Processors

We present PULP-NN, an optimized computing library for a parallel ultra-...

A Mixed-Precision RISC-V Processor for Extreme-Edge DNN Inference

Low bit-width Quantized Neural Networks (QNNs) enable deployment of comp...

Shifting Capsule Networks from the Cloud to the Deep Edge

Capsule networks (CapsNets) are an emerging trend in image processing. I...

Ultra-compact Binary Neural Networks for Human Activity Recognition on RISC-V Processors

Human Activity Recognition (HAR) is a relevant inference task in many mo...