Sparq: A Custom RISC-V Vector Processor for Efficient Sub-Byte Quantized Inference

06/16/2023
by   Theo Dupuis, et al.
0

Convolutional Neural Networks (CNNs) are used in a wide range of applications, with full-precision CNNs achieving high accuracy at the expense of portability. Recent progress in quantization techniques has demonstrated that sub-byte Quantized Neural Networks (QNNs) achieve comparable or superior accuracy while significantly reducing the computational cost and memory footprint. However, sub-byte computation on commodity hardware is sub-optimal due to the lack of support for such precision. In this paper, we introduce Sparq, a Sub-byte vector Processor designed for the AcceleRation of QNN inference. This processor is based on a modified version of Ara, an open-source 64-bit RISC-V “V” compliant processor. Sparq is implemented in GLOBAL FOUNDRIES 22FDX FD-SOI technology and extends the Instruction Set Architecture (ISA) by adding a new multiply-shift-accumulate instruction to improve sub-byte computation effciency. The floating-point unit is also removed to minimize area and power usage. To demonstrate Sparq performance, we implement an ultra-low-precision (1-bit to 4-bit) vectorized conv2d operation taking advantage of the dedicated hardware. We show that Sparq can significantly accelerate sub-byte computations with respectively 3.2 times, and 1.7 times acceleration over an optimized 16-bit 2D convolution for 2-bit and 4-bit quantization.

READ FULL TEXT

page 1

page 4

research
02/12/2023

Quark: An Integer RISC-V Vector Processor for Sub-Byte Quantized DNN Inference

In this paper, we present Quark, an integer RISC-V vector processor spec...
research
10/10/2018

Adding 32-bit Mode to the ACL2 Model of the x86 ISA

The ACL2 model of the x86 Instruction Set Architecture was built for the...
research
04/18/2023

DeepGEMM: Accelerated Ultra Low-Precision Inference on CPU Architectures using Lookup Tables

A lot of recent progress has been made in ultra low-bit quantization, pr...
research
10/08/2020

A Mixed-Precision RISC-V Processor for Extreme-Edge DNN Inference

Low bit-width Quantized Neural Networks (QNNs) enable deployment of comp...
research
06/12/2018

Exploration of Low Numeric Precision Deep Learning Inference Using Intel FPGAs

CNNs have been shown to maintain reasonable classification accuracy when...
research
04/10/2019

An Application-Specific VLIW Processor with Vector Instruction Set for CNN Acceleration

In recent years, neural networks have surpassed classical algorithms in ...
research
01/23/2018

Stacked Filters Stationary Flow For Hardware-Oriented Acceleration Of Deep Convolutional Neural Networks

To address memory and computation resource limitations for hardware-orie...

Please sign up or login with your details

Forgot password? Click here to reset