Efficient Floating-Point Givens Rotation Unit

10/23/2020

High-throughput QR decomposition is a key operation in many advanced signal processing and communication applications. For some of these applications, floating-point computation is becoming almost compulsory. However, few works address hardware implementations of floating-point QR decomposition for embedded systems. In this paper, we propose a very efficient high-throughput floating-point Givens rotation unit for QR decomposition. Moreover, the initial design, proposed for conventional number formats, is enhanced by using the new Half-Unit Biased (HUB) format. The provided error analysis shows the effectiveness of our proposals and the trade-offs among different implementation parameters. FPGA implementation results are also presented, along with a thorough comparison between both approaches. These results also reveal outstanding improvements over previous similar designs in terms of area, latency, and throughput.
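For readers unfamiliar with the operation the unit accelerates, the following is a minimal software sketch of QR decomposition by Givens rotations. It is a textbook reference model only, not the hardware design proposed in the paper; the function names `givens` and `qr_givens` are hypothetical, and ordinary double-precision arithmetic is assumed rather than the conventional or HUB formats the paper evaluates.

```python
import math


def givens(a, b):
    """Return (c, s, r) so that the rotation [[c, s], [-s, c]] maps (a, b) to (r, 0)."""
    r = math.hypot(a, b)
    if r == 0.0:
        # Nothing to rotate: use the identity rotation.
        return 1.0, 0.0, 0.0
    return a / r, b / r, r


def qr_givens(A):
    """Factor an m x n matrix (list of rows) as A = Q R using Givens rotations."""
    m, n = len(A), len(A[0])
    R = [[float(x) for x in row] for row in A]
    # Qt accumulates the product of all rotations, i.e. Q transposed.
    Qt = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]
    for j in range(n):
        # Work upward from the bottom row, zeroing R[i][j] against R[i-1][j].
        for i in range(m - 1, j, -1):
            c, s, _ = givens(R[i - 1][j], R[i][j])
            for M in (R, Qt):
                for k in range(len(M[0])):
                    top, bot = M[i - 1][k], M[i][k]
                    M[i - 1][k] = c * top + s * bot
                    M[i][k] = -s * top + c * bot
            R[i][j] = 0.0  # force an exact zero in place of rounding residue
    Q = [[Qt[j][i] for j in range(m)] for i in range(m)]
    return Q, R
```

Each inner step touches only two rows, which is why a dedicated rotation unit can be pipelined: one stage derives `c` and `s` from a pair of elements, and the remaining stages apply the same rotation to the rest of the two rows.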

Related Research

An efficient floating point multiplier design for high speed applications using Karatsuba algorithm and Urdhva-Tiryagbhyam algorithm

A floating point division unit based on Taylor-Series expansion algorithm and Iterative Logarithmic Multiplier

Efficient Non-linear Calculators

Quantizing Convolutional Neural Networks for Low-Power High-Throughput Inference Engines

Proposal of a Takagi-Sugeno Fuzzy-PI Controller Hardware

PDPU: An Open-Source Posit Dot-Product Unit for Deep Learning Applications

A Characterization of the SPARC T3-4 System