Design optimization for high-performance computing using FPGA

04/24/2023
by   Murat Isik, et al.

Reconfigurable architectures such as Field Programmable Gate Arrays (FPGAs) have been used to accelerate computations in several domains because of their unique combination of flexibility, performance, and power efficiency. However, FPGAs have not been widely adopted for high-performance computing, primarily because of their programming complexity and the difficulty of optimizing performance. To gain insight into the use of FPGAs for high-performance computing, we optimize Tensil AI's open-source inference accelerator for maximum performance using ResNet20 trained on CIFAR. We show how improving the hardware design, using Xilinx Ultra RAM, and applying advanced compiler strategies lead to improved inference performance. We also demonstrate that running the CIFAR test data set shows very little accuracy drop when rounding down from the original 32-bit floating-point representation. The heterogeneous computing model of our platform achieves a frame rate of 293.58 frames per second (FPS) on CIFAR. Experimental results show that the proposed accelerator achieves a throughput of 21.12 Giga-Operations Per Second (GOP/s) with 5.21 W of on-chip power consumption at 100 MHz. Comparisons with off-the-shelf devices and recent state-of-the-art implementations show that the proposed accelerator has clear advantages in energy efficiency.
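The reported figures let us derive the headline efficiency metrics directly. As a back-of-the-envelope check (using only the throughput, power, and frame rate quoted above; the derived ratios are simple arithmetic, not additional measurements):

```python
# Derived efficiency metrics from the reported results.
# All inputs come from the abstract; outputs are simple ratios.

throughput_gops = 21.12   # reported throughput in Giga-operations/s
power_w = 5.21            # reported on-chip power at 100 MHz
fps = 293.58              # reported frames per second on CIFAR

efficiency_gops_per_w = throughput_gops / power_w  # GOP/s per watt
latency_ms = 1000.0 / fps                          # average time per frame

print(f"Energy efficiency: {efficiency_gops_per_w:.2f} GOP/s/W")
print(f"Per-frame latency: {latency_ms:.2f} ms")
```

This works out to roughly 4.05 GOP/s per watt and about 3.4 ms per frame, which is the energy-efficiency advantage the comparison with off-the-shelf devices refers to.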


