Hardware Abstractions and Hardware Mechanisms to Support Multi-Task Execution on Coarse-Grained Reconfigurable Arrays

01/02/2023
by Taeyoung Kong, et al.

Domain-specific accelerators are used in computing systems ranging from edge devices to data centers. Coarse-grained reconfigurable arrays (CGRAs) represent an architectural midpoint between the flexibility of an FPGA and the efficiency of an ASIC, and are a promising candidate for servicing multi-tasked workloads within an application domain. Unfortunately, scheduling multiple tasks onto a CGRA is challenging. CGRAs lack abstractions that capture hardware resources, leaving workload schedulers unable to reason about performance, energy, and utilization for different schedules. This work first proposes a CGRA architecture that can flexibly partition key resources, including the global buffer memory capacity, the global buffer memory bandwidth, and the compute resources. Partitioned resources serve as hardware abstractions that decouple compilation from resource allocation. The compiler uses these abstractions for coarse-grained resource mapping, and the scheduler uses them for flexible resource allocation at run time. We then propose two hardware mechanisms to support multi-task execution. A flexible-shape execution region increases overall resource utilization by mapping multiple tasks with different resource requirements onto the array at the same time. Dynamic partial reconfiguration (DPR) enables a CGRA to rapidly update its hardware configuration as the scheduler makes decisions. We show that our abstraction enables automatic and efficient scheduling of multi-tasked workloads onto our target CGRA with high utilization, resulting in 1.05x-1.24x higher throughput and 23-28% lower latency in a multi-tasked cloud workload, and 60.8% lower latency in an autonomous system workload, when compared to a baseline CGRA that runs a single task at a time.
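To make the idea of partitioned resources as a scheduling abstraction more concrete, the following is a minimal Python sketch, not the scheduler described in the paper. It assumes that each compiled task advertises its needs in two abstract units, compute-array slices and global-buffer banks, and that a simple first-fit policy admits tasks in queue order. The names `TaskVariant`, `Cgra`, and `schedule`, the unit counts, and the greedy policy itself are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskVariant:
    """A compiled task mapping, described only in abstract resource units.

    Hypothetical type: the field names are illustrative assumptions.
    """
    name: str
    array_slices: int  # compute-array partitions the mapping needs
    glb_banks: int     # global-buffer partitions the mapping needs


@dataclass
class Cgra:
    """Tracks how many partitions of each resource are currently free."""
    free_slices: int
    free_banks: int

    def try_allocate(self, task: TaskVariant) -> bool:
        """Reserve resources for the task if both requirements fit."""
        fits = (task.array_slices <= self.free_slices
                and task.glb_banks <= self.free_banks)
        if fits:
            self.free_slices -= task.array_slices
            self.free_banks -= task.glb_banks
        return fits

    def release(self, task: TaskVariant) -> None:
        """Return a finished task's partitions to the free pool."""
        self.free_slices += task.array_slices
        self.free_banks += task.glb_banks


def schedule(cgra: Cgra, queue: list[TaskVariant]):
    """First-fit admission (an assumed policy): run what fits, defer the rest."""
    running, waiting = [], []
    for task in queue:
        if cgra.try_allocate(task):
            running.append(task)
        else:
            waiting.append(task)
    return running, waiting


if __name__ == "__main__":
    cgra = Cgra(free_slices=8, free_banks=16)
    queue = [
        TaskVariant("conv", array_slices=4, glb_banks=8),
        TaskVariant("gemm", array_slices=6, glb_banks=4),
        TaskVariant("blur", array_slices=2, glb_banks=4),
    ]
    running, waiting = schedule(cgra, queue)
    print("running:", [t.name for t in running])
    print("waiting:", [t.name for t in waiting])
```

Because the scheduler in this sketch reasons only about partition counts, it never needs to inspect how the compiler placed operations inside a partition, which is the kind of decoupling between compilation and allocation that the abstract describes. A real system would also have to account for region shape and reconfiguration cost, which this sketch ignores.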

research
11/23/2022

Cascade: An Application Pipelining Toolkit for Coarse-Grained Reconfigurable Arrays

While coarse-grained reconfigurable arrays (CGRAs) have emerged as promi...
research
01/08/2018

Towards General Distributed Resource Selection

The advantages of distributing workloads and utilizing multiple distribu...
research
04/22/2020

Proactive Aging Mitigation in CGRAs through Utilization-Aware Allocation

Resource balancing has been effectively used to mitigate the long-term a...
research
03/11/2021

Compiler-Guided Throughput Scheduling for Many-core Machines

Modern ARM-based servers such as ThunderX and ThunderX2 offer a tremendo...
research
09/16/2023

Rewriting History: Repurposing Domain-Specific CGRAs

Coarse-grained reconfigurable arrays (CGRAs) are domain-specific devices...
research
11/08/2019

AMOEBA: A Coarse Grained Reconfigurable Architecture for Dynamic GPU Scaling

Different GPU applications exhibit varying scalability patterns with net...
research
09/29/2016

An Efficient Framework for Floor-plan Prediction of Dynamic Runtime Reconfigurable Systems

Several embedded application domains for reconfigurable systems tend to ...
