Training Simplification and Model Simplification for Deep Learning: A Minimal Effort Back Propagation Method

11/17/2017
by Xu Sun, et al.

We propose a simple yet effective technique to simplify the training and the resulting model of neural networks. In back propagation, only a small subset of the full gradient is computed to update the model parameters. The gradient vectors are sparsified in such a way that only the top-k elements (in terms of magnitude) are kept. As a result, only k rows or columns (depending on the layout) of the weight matrix are modified, leading to a linear reduction in the computational cost. Based on the sparsified gradients, we further simplify the model by eliminating the rows or columns that are seldom updated, which reduces the computational cost in both training and decoding, and can potentially accelerate decoding in real-world applications. Surprisingly, experimental results demonstrate that most of the time we only need to update fewer than 5% of the weights at each back propagation pass. More interestingly, the accuracy of the resulting models is actually improved rather than degraded, and a detailed analysis is given. The model simplification results show that we can adaptively simplify the model, which can often be reduced by around 9x, without any loss of accuracy or even with improved accuracy.
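For illustration, the following is a minimal sketch (not the authors' released implementation) of the top-k sparsified backward pass described above, written as a custom PyTorch autograd function. The class name TopKGradLinear, the toy tensor shapes, and the choice k = 8 are assumptions made for this example.

import torch

class TopKGradLinear(torch.autograd.Function):
    # Linear layer whose backward pass keeps only the top-k elements
    # (by magnitude) of the gradient w.r.t. the layer output.

    @staticmethod
    def forward(ctx, x, weight, k):
        ctx.save_for_backward(x, weight)
        ctx.k = k
        return x @ weight.t()

    @staticmethod
    def backward(ctx, grad_out):
        x, weight = ctx.saved_tensors
        # Zero out everything except the k largest-magnitude entries of the
        # output gradient for each example in the batch.
        _, topk_idx = grad_out.abs().topk(ctx.k, dim=1)
        mask = torch.zeros_like(grad_out)
        mask.scatter_(1, topk_idx, 1.0)
        sparse_grad = grad_out * mask
        # With at most k nonzero columns per example, only the corresponding
        # rows of the weight gradient are nonzero (linear cost reduction).
        grad_x = sparse_grad @ weight
        grad_w = sparse_grad.t() @ x
        return grad_x, grad_w, None

# Toy usage: only a few rows of w receive nonzero gradients.
x = torch.randn(32, 128, requires_grad=True)
w = torch.randn(64, 128, requires_grad=True)
y = TopKGradLinear.apply(x, w, 8)   # keep the top-8 output gradients per example
y.sum().backward()
print((w.grad.abs().sum(dim=1) > 0).sum().item())  # rows of w actually updated

Note that the top-k selection in this sketch is per example, so the union of selected indices across a minibatch can touch more than k weight rows; tracking how rarely each row is selected is what drives the model simplification step described in the abstract.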


