Ensemble-Compression: A New Method for Parallel Training of Deep Neural Networks

06/02/2016
by   Shizhao Sun, et al.
0

Parallelization framework has become a necessity to speed up the training of deep neural networks (DNN) recently. Such framework typically employs the Model Average approach, denoted as MA-DNN, in which parallel workers conduct respective training based on their own local data while the parameters of local models are periodically communicated and averaged to obtain a global model which serves as the new start of local models. However, since DNN is a highly non-convex model, averaging parameters cannot ensure that such global model can perform better than those local models. To tackle this problem, we introduce a new parallel training framework called Ensemble-Compression, denoted as EC-DNN. In this framework, we propose to aggregate the local models by ensemble, i.e., averaging the outputs of local models instead of the parameters. As most of prevalent loss functions are convex to the output of DNN, the performance of ensemble-based global model is guaranteed to be at least as good as the average performance of local models. However, a big challenge lies in the explosion of model size since each round of ensemble can give rise to multiple times size increment. Thus, we carry out model compression after each ensemble, specialized by a distillation based method in this paper, to reduce the size of the global model to be the same as the local ones. Our experimental results demonstrate the prominent advantage of EC-DNN over MA-DNN in terms of both accuracy and speedup.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
07/05/2015

Experiments on Parallel Training of Deep Neural Network using Model Averaging

In this work we apply model averaging to parallel training of deep neura...
research
11/28/2017

Homomorphic Parameter Compression for Distributed Deep Learning Training

Distributed training of deep neural networks has received significant re...
research
08/28/2021

DKM: Differentiable K-Means Clustering Layer for Neural Network Compression

Deep neural network (DNN) model compression for efficient on-device infe...
research
01/18/2023

HCE: Improving Performance and Efficiency with Heterogeneously Compressed Neural Network Ensemble

Ensemble learning has gain attention in resent deep learning research as...
research
01/13/2022

Parallel Neural Local Lossless Compression

The recently proposed Neural Local Lossless Compression (NeLLoC), which ...
research
07/13/2023

Layerwise Linear Mode Connectivity

In the federated setup one performs an aggregation of separate local mod...
research
02/13/2023

A Domain Decomposition-Based CNN-DNN Architecture for Model Parallel Training Applied to Image Recognition Problems

Deep neural networks (DNNs) and, in particular, convolutional neural net...

Please sign up or login with your details

Forgot password? Click here to reset