Reconfigurable Distributed FPGA Cluster Design for Deep Learning Accelerators

05/24/2023
by Hans Johnson, et al.

We propose a distributed system based on low-power embedded FPGAs for edge computing applications. The system is designed to explore distributed scheduling optimizations for Deep Learning (DL) workloads, with the goal of obtaining the best performance in terms of latency and power efficiency. The cluster was kept modular throughout the experiments: our implementations consist of up to 12 Zynq-7020 chip-based boards as well as 5 UltraScale+ MPSoC FPGA boards connected through an Ethernet switch, and the cluster is used to evaluate a configurable Deep Learning Accelerator (DLA), the Versatile Tensor Accelerator (VTA). This adaptable distributed architecture is distinguished by its capacity to evaluate and manage neural network workloads in numerous configurations, which enables users to conduct experiments tailored to their specific application needs. The proposed system can simultaneously execute diverse Neural Network (NN) models, arrange the computation graph in a pipeline structure, and manually allocate greater resources to the most computationally intensive layers of the NN graph.
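To make the pipelining idea concrete, the following is a minimal sketch of how boards might be divided among pipeline stages so that the heaviest stages receive the most hardware. It is not the authors' method: the abstract describes the allocation as manual, whereas this sketch swaps in a simple greedy heuristic for illustration. The function names `allocate_boards` and `pipeline_bottleneck` and the per-stage cost figures are all hypothetical, and the sketch assumes a stage's work divides evenly across the boards assigned to it.

```python
def allocate_boards(stage_costs, num_boards):
    """Assign boards to pipeline stages with a greedy heuristic.

    stage_costs: hypothetical relative compute cost of each pipeline stage.
    num_boards:  total boards available in the cluster.

    Every stage starts with one board. Each remaining board goes to the
    stage whose cost per board is currently the highest.
    """
    if num_boards < len(stage_costs):
        raise ValueError("need at least one board per pipeline stage")

    boards = [1] * len(stage_costs)
    for _ in range(num_boards - len(stage_costs)):
        # Find the stage with the highest per-board load (ties go to the
        # earliest stage, since max() returns the first maximum).
        busiest = max(
            range(len(stage_costs)),
            key=lambda i: stage_costs[i] / boards[i],
        )
        boards[busiest] += 1
    return boards


def pipeline_bottleneck(stage_costs, boards):
    """Per-board load of the slowest stage, which bounds pipeline throughput.

    Assumes (as a simplification) that a stage's cost divides evenly
    across the boards assigned to it.
    """
    return max(cost / n for cost, n in zip(stage_costs, boards))


if __name__ == "__main__":
    # Made-up costs for four stages of an NN graph, spread over 12 boards.
    costs = [4.0, 12.0, 2.0, 6.0]
    plan = allocate_boards(costs, 12)
    print("boards per stage:", plan)
    print("bottleneck load:", pipeline_bottleneck(costs, plan))
```

In a pipeline, throughput is bounded by the slowest stage, which is why the heuristic always feeds the stage with the highest per-board load. A manual allocation, as described in the abstract, would simply replace `allocate_boards` with a hand-written list of board counts.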


