A Transprecision Floating-Point Platform for Ultra-Low Power Computing

11/28/2017
by Giuseppe Tagliavini, et al.

In modern low-power embedded platforms, floating-point (FP) operations emerge as a major contributor to the energy consumption of compute-intensive applications with large dynamic range. Experimental evidence shows that 50% of the energy consumed by a core and its data memory is related to FP computations. The adoption of FP formats requiring a lower number of bits is an interesting opportunity to reduce energy consumption, since it simplifies the arithmetic circuitry and reduces the memory bandwidth between memory and registers by enabling vectorization. From a theoretical point of view, the adoption of multiple FP types fits perfectly with the principle of transprecision computing, allowing fine-grained control of approximation while meeting specified constraints on the precision of final results. In this paper we propose an extended FP type system with complete hardware support to enable transprecision computing on low-power embedded processors, including two standard formats (binary32 and binary16) and two new formats (binary8 and binary16alt). First, we introduce a software library that enables exploration of FP types by tuning both the precision and the dynamic range of program variables. Then, we present a methodology to integrate our library with an external tool for precision tuning, and experimental results that highlight the clear benefits of introducing the new formats. Finally, we present the design of a transprecision FP unit capable of handling 8-bit and 16-bit operations in addition to standard 32-bit operations. Experimental results on FP-intensive benchmarks show that up to 90% of FP operations can be safely scaled down to 8-bit or 16-bit formats. Thanks to precision tuning and vectorization, execution time is decreased by 12% on average, leading to a reduction of energy consumption of up to 30%.
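To make the four formats and the tuning idea concrete, below is a minimal, self-contained C sketch. It is not the paper's library: fp_cast, FORMATS, and the tolerance-driven loop are illustrative names invented here. It emulates each narrower format on top of binary32 by rounding the mantissa (round-to-nearest-even) and clamping values outside the format's dynamic range, then walks the formats from narrowest to widest to pick the first one meeting an error tolerance. The format layouts (sign/exponent/mantissa) follow the paper: binary8 is 1/5/2, binary16 is 1/5/10, binary16alt is 1/8/7, binary32 is 1/8/23.

#include <math.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical sketch: emulate a narrower FP format on top of binary32.
 * Subnormals and NaN payloads are ignored for brevity. */
typedef struct { int exp_bits; int mant_bits; const char *name; } fp_format;

static const fp_format FORMATS[] = {
    { 5,  2, "binary8"     },  /* new 8-bit format                         */
    { 5, 10, "binary16"    },  /* IEEE 754 half precision                  */
    { 8,  7, "binary16alt" },  /* new format: binary32 dynamic range,
                                  reduced precision (bfloat16-like layout) */
    { 8, 23, "binary32"    },  /* IEEE 754 single precision                */
};

static float fp_cast(float x, fp_format f)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);

    int drop = 23 - f.mant_bits;            /* mantissa bits to discard */
    if (drop > 0) {
        uint32_t half = 1u << (drop - 1);
        /* Add-and-truncate trick for round-to-nearest-even. */
        bits += half - 1 + ((bits >> drop) & 1);
        bits &= ~((1u << drop) - 1);
        memcpy(&x, &bits, sizeof x);
    }

    /* Overflow to infinity beyond the exp_bits dynamic range. */
    int emax = (1 << (f.exp_bits - 1)) - 1;
    float max_val = ldexpf(2.0f - ldexpf(1.0f, -f.mant_bits), emax);
    if (isfinite(x) && fabsf(x) > max_val)
        x = copysignf(INFINITY, x);
    return x;
}

int main(void)
{
    /* Toy precision tuning: pick the narrowest format whose relative
     * error on a sample value stays below a given tolerance. */
    const float ref = 3.14159265f, tol = 1e-2f;
    for (size_t i = 0; i < sizeof FORMATS / sizeof *FORMATS; i++) {
        float y = fp_cast(ref, FORMATS[i]);
        float err = fabsf(y - ref) / ref;
        printf("%-12s %.8f (rel. err %.2e)\n", FORMATS[i].name, y, err);
        if (err <= tol) {
            printf("-> %s is sufficient at tol %.0e\n", FORMATS[i].name, tol);
            break;
        }
    }
    return 0;
}

A real tuning flow would run this kind of search per program variable against the error of the final program output rather than a single value, which is what pairing the library with an external precision-tuning tool automates.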

research · 07/03/2020
FPnew: An Open-Source Multi-Format Floating-Point Unit Architecture for Energy-Proportional Transprecision Computing
The slowdown of Moore's law and the power wall necessitates a shift towa...

research · 02/24/2020
Combining Learning and Optimization for Transprecision Computing
The growing demands of the worldwide IT infrastructure stress the need f...

research · 07/16/2021
DNN is not all you need: Parallelizing Non-Neural ML Algorithms on Ultra-Low-Power IoT Processors
Machine Learning (ML) functions are becoming ubiquitous in latency- and ...

research · 03/14/2022
Constrained Precision Tuning
Precision tuning or customized precision number representations is emerg...

research · 04/26/2015
Computational Cost Reduction in Learned Transform Classifications
We present a theoretical analysis and empirical evaluations of a novel s...

research · 05/06/2020
Custom-Precision Mathematical Library Explorations for Code Profiling and Optimization
The typical processors used for scientific computing have fixed-width da...

research · 03/26/2018
Reactive NaN Repair for Applying Approximate Memory to Numerical Applications
Applications in the AI and HPC fields require much memory capacity, and ...
