Parallelizing Workload Execution in Embedded and High-Performance Heterogeneous Systems

02/09/2018
by Jose Nunez-Yanez, et al.

In this paper, we introduce a software-defined framework that enables the parallel utilization of all the programmable processing resources available in a heterogeneous system-on-chip (SoC), including FPGA-based hardware accelerators and programmable CPUs. Two platforms with different architectures are considered, and a single C/C++ source code is used on both of them for the CPU and FPGA resources. Instead of simply using the hardware accelerator to offload a task from the CPU, we propose a scheduler that dynamically distributes the tasks among all the resources to fully exploit all computing devices while minimizing load imbalance. The multi-architecture study compares an ARMv7 and an ARMv8 implementation with different numbers and types of CPU cores, as well as different FPGA micro-architectures and sizes. Our measurements show that both platforms benefit from having the CPU cores assist FPGA execution at the same level of energy requirements.
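
The idea of dynamic distribution can be illustrated with a minimal sketch. This is not the authors' scheduler: it assumes a simple self-scheduling scheme in which every worker pulls the next chunk of work from a shared counter as soon as it is free, so faster devices naturally take more of the load. The function name `dynamic_schedule`, the device names, and the chunk sizes are all hypothetical, and plain Python threads stand in for the CPU cores and the FPGA accelerator.

```python
import threading


def dynamic_schedule(items, kernel, workers):
    """Apply kernel to every item, letting each worker pull chunks on demand.

    workers maps a hypothetical device name to its chunk size (assumption:
    an accelerator would be given larger chunks than a CPU core).
    Returns (results, items_done_per_worker).
    """
    results = [None] * len(items)
    done = {name: 0 for name in workers}
    next_index = 0
    lock = threading.Lock()

    def run(name, chunk):
        nonlocal next_index
        while True:
            # Claim the next unprocessed chunk under the lock.
            with lock:
                start = next_index
                if start >= len(items):
                    return  # no work left
                end = min(start + chunk, len(items))
                next_index = end
            # Process the claimed chunk outside the lock.
            for i in range(start, end):
                results[i] = kernel(items[i])
            done[name] += end - start  # each thread updates only its own key

    threads = [
        threading.Thread(target=run, args=(name, chunk))
        for name, chunk in workers.items()
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, done


if __name__ == "__main__":
    # Hypothetical configuration: two CPU cores and one accelerator.
    out, share = dynamic_schedule(
        list(range(100)),
        lambda x: x * x,
        {"cpu0": 4, "cpu1": 4, "fpga": 16},
    )
    print(share)  # how many items each worker ended up processing
```

Because chunks are claimed at run time rather than assigned up front, no worker sits idle while work remains, which is the property the abstract refers to as minimizing load imbalance. The per-worker split varies from run to run.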
