HyperTune: Dynamic Hyperparameter Tuning For Efficient Distribution of DNN Training Over Heterogeneous Systems

07/16/2020
by   Ali HeydariGorji, et al.

Distributed training is a novel approach to accelerate Deep Neural Network (DNN) training, but common training libraries fall short of addressing distributed cases with heterogeneous processors, or cases where the processing nodes are interrupted by other workloads. This paper describes distributed training of DNNs on computational storage devices (CSD), which are NAND flash-based, high-capacity storage devices with internal processing engines. A CSD-based distributed architecture incorporates the advantages of federated learning in terms of performance scalability, resiliency, and data privacy by eliminating unnecessary data movement between the storage device and the host processor. The paper also describes Stannis, a DNN training framework that improves on the shortcomings of existing distributed training frameworks by dynamically tuning the training hyperparameters on heterogeneous systems to maximize the overall processing speed, in terms of processed images per second, and energy efficiency. Experimental results on image classification training benchmarks show up to 3.1x improvement in performance and 2.45x reduction in energy consumption when using Stannis plus CSD, compared to generic systems.
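The core idea of tuning for heterogeneous nodes can be illustrated with a minimal sketch: assign each node a per-step batch size proportional to its measured throughput, so faster processors do more work between synchronizations and no node stalls the others. The function name and the throughput numbers below are illustrative assumptions, not taken from the paper.

```python
def balance_batch_sizes(throughputs, global_batch):
    """Split a global batch across nodes proportionally to each node's
    measured throughput (images/sec), as a stand-in for the kind of
    dynamic per-node tuning the abstract describes."""
    total = sum(throughputs)
    sizes = [max(1, round(global_batch * t / total)) for t in throughputs]
    # Correct rounding drift so the per-node sizes sum to the global batch.
    drift = global_batch - sum(sizes)
    sizes[sizes.index(max(sizes))] += drift
    return sizes

# Example: a host processor at 120 img/s and two CSD engines at 40 img/s each.
print(balance_batch_sizes([120, 40, 40], global_batch=64))  # → [38, 13, 13]
```

In a running system these throughputs would be re-measured periodically, so a node slowed by a competing workload automatically receives a smaller share of the batch.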


research
02/17/2020

STANNIS: Low-Power Acceleration of Deep Neural Network Training Using Computational Storage

This paper proposes a framework for distributed, in-storage training of ...
research
06/11/2021

Federated Learning with Spiking Neural Networks

As neural networks get widespread adoption in resource-constrained embed...
research
10/09/2017

Energy-aware Web Browsing on Heterogeneous Mobile Platforms

Web browsing is an activity that billions of mobile users perform on a d...
research
06/08/2021

Federated Hyperparameter Tuning: Challenges, Baselines, and Connections to Weight-Sharing

Tuning hyperparameters is a crucial but arduous part of the machine lear...
research
03/04/2021

Efficient Training Convolutional Neural Networks on Edge Devices with Gradient-pruned Sign-symmetric Feedback Alignment

With the prosperity of mobile devices, the distributed learning approach...
research
12/23/2021

In-storage Processing of I/O Intensive Applications on Computational Storage Drives

Computational storage drives (CSD) are solid-state drives (SSD) empowere...
