MARS: Exploiting Multi-Level Parallelism for DNN Workloads on Adaptive Multi-Accelerator Systems

07/23/2023
by   Guan Shen, et al.

Deep neural networks are evolving quickly, and so is the hardware that runs them. Multi-accelerator systems, which promise high scalability and low manufacturing cost, are now common in data centers, cloud platforms, and SoCs. They raise a challenging problem: selecting a proper combination of accelerators from the available designs and searching for efficient DNN mapping strategies. To this end, we propose MARS, a novel mapping framework that performs computation-aware accelerator selection and applies communication-aware sharding strategies to maximize parallelism. Experimental results show that MARS achieves a 32.2% latency reduction on average for typical DNN workloads compared to the baseline, and a 59.4% latency reduction on heterogeneous models compared to the corresponding state-of-the-art method.

research
10/26/2022

Multi-Objective Hardware-Mapping Co-Optimisation for Multi-Tenant DNN Accelerators

To meet the ever-increasing computation demand from emerging workloads, ...
research
07/02/2019

Accelerator-level Parallelism

Future applications demand more performance, but technology advances hav...
research
10/20/2021

Synthesizing Optimal Parallelism Placement and Reduction Strategies on Hierarchical Systems for Deep Learning

We present a novel characterization of the mapping of multiple paralleli...
research
10/07/2021

MAPA: Multi-Accelerator Pattern Allocation Policy for Multi-Tenant GPU Servers

Multi-accelerator servers are increasingly being deployed in shared mult...
research
08/25/2021

Towards Memory-Efficient Neural Networks via Multi-Level in situ Generation

Deep neural networks (DNN) have shown superior performance in a variety ...
research
01/11/2023

TAPS: Topology-Aware Intra-Operator Parallelism Strategy Searching Algorithm for Deep Neural Networks

TAPS is a Topology-Aware intra-operator Parallelism strategy Searching a...
research
12/06/2022

Integration of a systolic array based hardware accelerator into a DNN operator auto-tuning framework

The deployment of neural networks on heterogeneous SoCs coupled with cus...
