Partitioning and Placement of Deep Neural Networks on Distributed Edge Devices to Maximize Inference Throughput

10/21/2022
by Arjun Parthasarathy, et al.

Edge inference has become widespread, with applications ranging from retail to wearable technology. Clusters of networked, resource-constrained edge devices are becoming common, yet no existing system splits a DNN across such a cluster while maximizing the system's inference throughput. We present an algorithm that partitions DNNs and distributes them across a set of edge devices with the goal of minimizing the bottleneck latency and thereby maximizing inference throughput. The system scales well across node memory capacities and cluster sizes. We find that our algorithm reduces the bottleneck latency by 10x over a random algorithm and by 35% over a greedy joint partitioning-placement algorithm. Furthermore, we find empirically that for the set of representative models we tested, the algorithm produces results within 9.2% of the optimal bottleneck latency.
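The paper's algorithm itself is not reproduced here, but the bottleneck-latency objective it optimizes can be illustrated with a toy sketch: treat the DNN as a linear sequence of layers, split it into contiguous stages placed on successive devices, and score each split by its slowest stage (compute time plus the time to transfer that stage's output activations to the next device). In a pipelined system, inference throughput is the reciprocal of this bottleneck latency. All names, cost parameters, and the brute-force search below are illustrative assumptions, not the paper's method.

```python
from itertools import combinations

def bottleneck_latency(layer_costs, layer_out_sizes, cuts, compute_rate, bandwidth):
    """Bottleneck latency of a pipelined split: the slowest stage's
    compute time plus the time to ship its activations downstream."""
    bounds = [0] + list(cuts) + [len(layer_costs)]
    worst = 0.0
    for i in range(len(bounds) - 1):
        start, end = bounds[i], bounds[i + 1]
        compute = sum(layer_costs[start:end]) / compute_rate
        # The last stage has no downstream device, so no transfer cost.
        transfer = layer_out_sizes[end - 1] / bandwidth if end < len(layer_costs) else 0.0
        worst = max(worst, compute + transfer)
    return worst

def best_partition(layer_costs, layer_out_sizes, num_devices, compute_rate, bandwidth):
    """Exhaustively search all contiguous partitions into num_devices stages
    and return (minimum bottleneck latency, cut positions)."""
    n = len(layer_costs)
    best_lat, best_cuts = float("inf"), None
    for cuts in combinations(range(1, n), num_devices - 1):
        lat = bottleneck_latency(layer_costs, layer_out_sizes, cuts,
                                 compute_rate, bandwidth)
        if lat < best_lat:
            best_lat, best_cuts = lat, cuts
    return best_lat, best_cuts

# Hypothetical 4-layer model on 2 identical devices: cutting after layer 0
# balances the heavy first and last layers against the transfer cost.
lat, cuts = best_partition([4.0, 1.0, 1.0, 4.0], [2.0, 2.0, 2.0, 2.0],
                           num_devices=2, compute_rate=1.0, bandwidth=1.0)
```

The exhaustive search is exponential in the number of devices and assumes identical nodes; the paper's contribution is precisely avoiding such brute force while handling heterogeneous memory capacities.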


Related research:

- 04/24/2023 · Partitioning and Deployment of Deep Neural Networks on Edge Clusters
  Edge inference has become more widespread, as its diverse applications r...
- 10/21/2022 · SEIFER: Scalable Edge Inference for Deep Neural Networks
  Edge inference is becoming ever prevalent through its applications from ...
- 02/11/2018 · Edge-Host Partitioning of Deep Neural Networks with Feature Space Encoding for Resource-Constrained Internet-of-Things Platforms
  This paper introduces partitioning an inference task of a deep neural ne...
- 08/19/2020 · A Computational-Graph Partitioning Method for Training Memory-Constrained DNNs
  We propose ParDNN, an automatic, generic, and non-intrusive partitioning...
- 09/03/2019 · Guardians of the Deep Fog: Failure-Resilient DNN Inference from Edge to Cloud
  Partitioning and distributing deep neural networks (DNNs) over physical ...
- 05/21/2022 · SplitPlace: AI Augmented Splitting and Placement of Large-Scale Neural Networks in Mobile Edge Environments
  In recent years, deep learning models have become ubiquitous in industry...
- 08/26/2023 · Throughput Maximization of DNN Inference: Batching or Multi-Tenancy?
  Deployment of real-time ML services on warehouse-scale infrastructures i...
