Partitioning sparse deep neural networks for scalable training and inference

04/23/2021
by Gunduz Vehbi Demirci, et al.

State-of-the-art deep neural networks (DNNs) have significant computational and data-management requirements, and the sizes of both training data and models continue to grow. Sparsification and pruning methods have been shown to be effective at removing a large fraction of the connections in DNNs. The resulting sparse networks present unique challenges for further improving the computational efficiency of training and inference in deep learning. Both the feedforward (inference) and backpropagation steps of the stochastic gradient descent (SGD) algorithm for training sparse DNNs involve consecutive sparse matrix-vector multiplications (SpMVs). We first introduce a distributed-memory parallel SpMV-based solution for the SGD algorithm to improve its scalability. The parallelization approach is based on row-wise partitioning of the weight matrices that represent neuron connections between consecutive layers. We then propose a novel hypergraph model for partitioning the weight matrices that reduces the total communication volume and ensures computational load balance among processors. Experiments performed on sparse DNNs demonstrate that the proposed solution is highly efficient and scalable, and that the proposed matrix-partitioning scheme further improves its performance significantly.
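Since the core observation is that each layer's feedforward step reduces to an SpMV over a sparse weight matrix, a small sketch may help make the row-wise partitioning concrete. The Python below is a reader's minimal illustration, not the authors' implementation: the layer dimensions, 1% density, ReLU activation, and contiguous four-way row split are all assumptions made for the example.

```python
# Minimal sketch of the SpMV view of sparse-DNN inference: the feedforward
# step a_next = f(W a + b) is a sparse matrix-vector multiply when the
# weight matrix W is sparse. Row-wise partitioning assigns row blocks of W
# to processors, each of which computes its slice of the output activations.
# All sizes, the density, and the activation are illustrative assumptions.

import numpy as np
from scipy.sparse import random as sparse_random

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

# One sparse layer with 1% nonzeros, as in a heavily pruned network.
n_out, n_in = 1024, 1024
W = sparse_random(n_out, n_in, density=0.01, format="csr", random_state=0)
b = np.zeros(n_out)
a = rng.random(n_in)

# Serial feedforward step: one SpMV per layer.
a_next = relu(W @ a + b)

# Row-wise partitioning: processor p owns a block of rows of W and
# computes the corresponding entries of the output activation vector.
n_procs = 4
row_blocks = np.array_split(np.arange(n_out), n_procs)
partial = [relu(W[rows, :] @ a + b[rows]) for rows in row_blocks]

# Concatenating the per-processor slices reproduces the serial result;
# a real distributed run would communicate only the activation entries
# each processor's row block actually references.
assert np.allclose(np.concatenate(partial), a_next)
```

In an actual distributed-memory run, each processor would own its row block of every weight matrix and exchange only the activation entries its block needs before each layer; the hypergraph model described in the abstract chooses the (not necessarily contiguous) row-to-processor assignment so that this communication volume is reduced while the per-processor computational loads stay balanced, whereas the contiguous split above is purely for illustration.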

