Generic and Universal Parallel Matrix Summation with a Flexible Compression Goal for Xilinx FPGAs

06/21/2018
by Thomas B. Preußer, et al.

Bit matrix compression is a highly relevant operation in computer arithmetic. Being essentially a multi-operand addition, it is the key operation behind fast multiplication and many higher-level operations such as multiply-accumulate, the computation of the dot product, or the implementation of FIR filters. Compressor implementations have constantly evolved toward greater efficiency, both in general and in the context of concrete applications or specific implementation technologies. This paper builds on this history and describes a generic implementation of a bit matrix compressor for Xilinx FPGAs that does not require a generator tool. It contributes FPGA-oriented metrics for the evaluation of elementary parallel bit counters, a systematic analysis and partial decomposition of previously proposed counters, and a fully implemented construction heuristic with a flexible compression target matching the device capabilities. The generic implementation is agnostic of the aspect ratio of the input matrix and can be used for multiplication just as well as for single-column population count operations.
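To make the notion of bit matrix compression concrete, the following is a minimal software sketch, not the paper's FPGA implementation or its construction heuristic. It assumes a fixed target of two rows and uses only textbook full adders, i.e. (3:2) counters, rather than the LUT- and carry-chain-oriented counters the paper evaluates. The function names (`matrix_value`, `full_adder`, `compress_to_two_rows`, `partial_products`) are hypothetical and chosen for illustration only.

```python
def matrix_value(columns):
    """Numeric value of a bit matrix; a bit in column i has weight 2**i."""
    return sum(sum(col) << i for i, col in enumerate(columns))


def full_adder(a, b, c):
    """(3:2) counter: three bits of equal weight -> sum bit and carry bit."""
    s = a ^ b ^ c
    carry = (a & b) | (a & c) | (b & c)
    return s, carry


def compress_to_two_rows(columns):
    """Apply stages of full adders until no column holds more than 2 bits.

    Assumption: a fixed target height of 2; the paper's flexible
    compression goal is not modelled here.
    """
    cols = [list(c) for c in columns]
    while any(len(c) > 2 for c in cols):
        # One extra column in case the top column produces a carry.
        nxt = [[] for _ in range(len(cols) + 1)]
        for i, col in enumerate(cols):
            full = len(col) - len(col) % 3  # bits covered by whole triples
            for j in range(0, full, 3):
                s, carry = full_adder(*col[j:j + 3])
                nxt[i].append(s)          # sum stays in the same column
                nxt[i + 1].append(carry)  # carry moves one column up
            nxt[i].extend(col[full:])     # leftover bits pass through
        if not nxt[-1]:
            nxt.pop()
        cols = nxt
    return cols


def partial_products(a, b, width):
    """Bit matrix of an unsigned width x width multiplication (AND array)."""
    cols = [[] for _ in range(2 * width)]
    for i in range(width):
        for j in range(width):
            cols[i + j].append(((a >> i) & 1) & ((b >> j) & 1))
    return cols
```

The same routine accepts both extremes of aspect ratio mentioned in the abstract: a single tall column such as `[[1] * 7]` models a population count, while `partial_products(a, b, width)` yields the wide matrix of a multiplication. In either case the compressed matrix keeps the value reported by `matrix_value`, and the remaining two rows would be summed by a conventional carry-propagate adder.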


