Compressed Real Numbers for AI: a case-study using a RISC-V CPU

09/11/2023
by Federico Rossi, et al.

As recently demonstrated, Deep Neural Networks (DNNs), usually trained using single-precision IEEE 754 floating-point numbers (binary32), can also work at lower precision. As a result, 16-bit and 8-bit compressed formats have attracted considerable attention. In this paper, we focus on two families of formats that have already achieved interesting results in compressing binary32 numbers in machine learning applications without appreciable loss of accuracy: bfloat and posit. Even though 16-bit and 8-bit bfloats/posits are routinely used to reduce the storage needed for the weights and biases of trained DNNs, inference still often runs on the 32-bit FPU of the CPU, especially when no GPU is available. We propose a way to decompress a tensor of bfloats/posits just before computation, i.e., after the compressed operands have been loaded into the vector registers of a vector-capable CPU, in order to reduce memory bandwidth usage and increase cache efficiency. Finally, we show the architectural parameters and considerations under which this solution is advantageous compared with the uncompressed one.
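To make the idea of decompressing just before computation concrete, the following is a minimal Python sketch for the 16-bit bfloat case only. It relies on the fact that bfloat16 keeps the upper 16 bits of a binary32 value, so decompression is a 16-bit left shift. The function names are hypothetical, compression by truncation is an assumption (a real implementation may round to nearest), and the scalar loop merely stands in for the vector-register implementation described in the abstract. Posit decoding is more involved and is not shown.

```python
import struct


def float_to_bfloat16_bits(x: float) -> int:
    """Compress a value to bfloat16 by keeping the top 16 bits of its binary32 encoding.

    Assumption: plain truncation is used here for simplicity.
    """
    (bits32,) = struct.unpack("<I", struct.pack("<f", x))
    return bits32 >> 16


def bfloat16_bits_to_float(b: int) -> float:
    """Decompress bfloat16 bits to binary32 by appending 16 zero bits."""
    bits32 = (b & 0xFFFF) << 16
    (x,) = struct.unpack("<f", struct.pack("<I", bits32))
    return x


def dot_compressed(a_bits, b_bits) -> float:
    """Dot product over compressed operands, decompressing each element just before use.

    Note: Python accumulates in double precision, whereas the setting in the
    abstract uses a 32-bit FPU.
    """
    acc = 0.0
    for a, b in zip(a_bits, b_bits):
        acc += bfloat16_bits_to_float(a) * bfloat16_bits_to_float(b)
    return acc
```

In this sketch the operands stay in their 16-bit form while they are stored and moved, which is where the bandwidth and cache savings would come from; they are widened only at the point of use.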


research
12/19/2018

Training Deep Neural Networks with 8-bit Floating Point Numbers

The state-of-the-art hardware platforms for training Deep Neural Network...
research
05/18/2023

Comparative Study: Standalone IEEE 16-bit Floating-Point for Image Classification

Reducing the number of bits needed to encode the weights and activations...
research
01/05/2021

An Investigation on Inherent Robustness of Posit Data Representation

As the dimensions and operating voltages of computer electronics shrink ...
research
08/20/2017

Conversion of Mersenne Twister to double-precision floating-point numbers

The 32-bit Mersenne Twister generator MT19937 is a widely used random nu...
research
07/12/2023

Diagonally-Addressed Matrix Nicknack: How to improve SpMV performance

We suggest a technique to reduce the storage size of sparse matrices at ...
research
04/13/2018

Pieces of Eight: 8-bit Neural Machine Translation

Neural machine translation has achieved levels of fluency and adequacy t...
research
01/31/2023

Tricking AI chips into Simulating the Human Brain: A Detailed Performance Analysis

Challenging the Nvidia monopoly, dedicated AI-accelerator chips have beg...
