High Level Synthesis Implementation of a Three-dimensional Systolic Array Architecture for Matrix Multiplications on Intel Stratix 10 FPGAs

10/17/2021
by Paolo Gorlani, et al.

In this paper, we consider the HLS implementation of a three-dimensional systolic array architecture for matrix multiplication that targets specific characteristics of Intel Stratix 10 FPGAs. The goal is to produce designs that achieve high floating-point throughput by using most of the DSPs at high frequencies while avoiding congestion of the routing fabric. The investigated three-dimensional systolic array architecture produces hardware designs that use 99% of the DSPs and let us achieve performance above 3 TFLOPS.

