DEFER: Distributed Edge Inference for Deep Neural Networks

01/18/2022
by Arjun Parthasarathy, et al.

Modern machine learning tools such as deep neural networks (DNNs) are playing a revolutionary role in many fields, including natural language processing, computer vision, and the Internet of Things. Once trained, deep learning models can be deployed on edge computers to perform classification and prediction on real-time data for these applications. Particularly for large models, the limited computational and memory resources of a single edge device can become the throughput bottleneck of an inference pipeline. To increase throughput and decrease per-device compute load, we present DEFER (Distributed Edge inFERence), a framework for distributed edge inference that partitions deep neural networks into layers that can be spread across multiple compute nodes. The architecture consists of a single "dispatcher" node that distributes DNN partitions and inference data to the respective compute nodes. The compute nodes are connected in series, with each node's computed result relayed to the subsequent node; the final result is returned to the dispatcher. We quantify the throughput, energy consumption, network payload, and overhead of our framework under realistic network conditions using the CORE network emulator. We find that for the ResNet50 model, the inference throughput of DEFER with 8 compute nodes is 53% higher than that of single-device inference. We further reduce network communication demands and energy consumption using the ZFP serialization and LZ4 compression algorithms. We have implemented DEFER in Python using the TensorFlow and Keras ML libraries and have released it as an open-source framework to benefit the research community.
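The dispatcher-and-relay architecture described in the abstract can be sketched in plain Python. Everything below is a hypothetical illustration, not DEFER's actual API: the `partition_*` functions stand in for Keras sub-models produced by splitting a DNN at layer boundaries, and `zlib` (from the standard library) stands in for the paper's ZFP serialization and LZ4 compression so the example remains self-contained.

```python
import pickle
import zlib

# Hypothetical stand-ins for DNN layer partitions. In DEFER these would be
# Keras sub-models, each hosted on a separate edge compute node.
def partition_a(x):  # e.g., early feature-extraction layers
    return [v * 2 for v in x]

def partition_b(x):  # e.g., middle layers
    return [v + 1 for v in x]

def partition_c(x):  # e.g., final classification head
    return [v ** 2 for v in x]

def serialize(tensor):
    """Serialize and compress an intermediate result before it crosses the
    network. DEFER uses ZFP serialization plus LZ4 compression; zlib is a
    stdlib stand-in here."""
    return zlib.compress(pickle.dumps(tensor))

def deserialize(payload):
    return pickle.loads(zlib.decompress(payload))

def run_pipeline(data, partitions):
    """Emulate the series relay: the dispatcher sends compressed input to the
    first node; each node decompresses, runs its partition, and forwards the
    compressed result to the next node; the last node's output is returned to
    the dispatcher."""
    payload = serialize(data)
    for part in partitions:
        x = deserialize(payload)
        payload = serialize(part(x))
    return deserialize(payload)

result = run_pipeline([1.0, 2.0], [partition_a, partition_b, partition_c])
print(result)  # [9.0, 25.0]
```

In the real framework each loop iteration would be a network hop between compute nodes rather than an in-process function call, but the compress-compute-forward structure is the same.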
