A transprecision floating-point cluster for efficient near-sensor data analytics

by Fabio Montagna, et al.

Recent applications in the domain of near-sensor computing require the adoption of floating-point arithmetic to reconcile high-precision results with a wide dynamic range. In this paper, we propose a multi-core computing cluster that leverages the fine-grained tunable principles of transprecision computing to support near-sensor applications at a minimum power budget. Our design, based on the open-source RISC-V architecture, combines parallelization and sub-word vectorization with near-threshold operation, leading to a highly scalable and versatile system. We perform an exhaustive exploration of the design space of the transprecision cluster on a cycle-accurate FPGA emulator, with the aim of identifying the most efficient configurations in terms of performance, energy efficiency, and area efficiency. We also provide full-fledged software stack support, including a parallel runtime and a compilation toolchain, to enable the development of end-to-end applications. We perform an experimental assessment of our design on a set of benchmarks representative of the near-sensor processing domain, complementing the timing results with a post-place-and-route analysis of the power consumption. Finally, a comparison with the state of the art shows that our solution outperforms the competitors in energy efficiency, reaching a peak of 97 Gflop/s/W on single-precision scalars and 162 Gflop/s/W on half-precision vectors.
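To make the idea of sub-word vectorization concrete, the following minimal sketch emulates in software how two half-precision (IEEE 754 binary16) values can share one 32-bit word and be added lane by lane. It is an illustration only, not the cluster's hardware datapath or instruction set: the function names `pack_half2`, `unpack_half2`, and `vadd_half2`, and the little-endian lane layout, are assumptions made for this example.

```python
import struct
from typing import Tuple


def pack_half2(lo: float, hi: float) -> int:
    """Round two values to binary16 and pack them into one 32-bit word.

    Assumed layout (hypothetical): `lo` in bits 0-15, `hi` in bits 16-31.
    struct raises OverflowError if a value exceeds the binary16 range.
    """
    return struct.unpack("<I", struct.pack("<2e", lo, hi))[0]


def unpack_half2(word: int) -> Tuple[float, float]:
    """Split a 32-bit word back into its two binary16 lanes."""
    return struct.unpack("<2e", struct.pack("<I", word))


def vadd_half2(a: int, b: int) -> int:
    """Emulate a 2-lane half-precision vector add on packed words.

    Each lane is added independently and rounded back to binary16,
    so one 32-bit operation carries two floating-point results.
    """
    a_lo, a_hi = unpack_half2(a)
    b_lo, b_hi = unpack_half2(b)
    return pack_half2(a_lo + b_lo, a_hi + b_hi)


x = pack_half2(1.5, -2.0)
y = pack_half2(0.25, 4.0)
print(unpack_half2(vadd_half2(x, y)))  # (1.75, 2.0)
```

The sketch shows why narrower formats can raise throughput per word: the same 32-bit register that holds one single-precision scalar holds two half-precision operands, at the cost of reduced precision and dynamic range in each lane.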




