1 Introduction
With the development of machine learning technologies, Deep Neural Networks (DNNs) have shown extraordinary performance thanks to their high accuracy and excellent scalability (Krizhevsky et al., 2012). However, DNNs suffer from both intensive computation and huge storage requirements. A number of prior works have focused on developing model compression techniques for DNNs. These techniques, applied during the training phase of the DNN, aim to simultaneously reduce the model size and accelerate inference, with negligible classification accuracy loss. Indeed, the accuracy of a DNN inference engine after model compression is typically higher than that of a shallow neural network with no compression (Han et al., 2015; Wen et al., 2016). One of the most important categories of DNN model compression techniques is weight quantization.
Weight quantization of DNNs has been investigated in many recent works (Leng et al., 2017; Park et al., 2017; Zhou et al., 2017; Lin et al., 2016; Wu et al., 2016; Rastegari et al., 2016; Hubara et al., 2016; Courbariaux et al., 2015). In these works, both storage and computational requirements of DNNs are greatly reduced with tolerable accuracy loss. Notably, multiplication operations, which are costly, can be eliminated entirely by applying binary, ternary, or power-of-2 weight quantization (Rastegari et al., 2016; Hubara et al., 2016; Courbariaux et al., 2015).
To overcome the highly heuristic nature of prior work, a recent work (Leng et al., 2017) developed a systematic framework for DNN weight quantization using the advanced optimization technique ADMM (Boyd et al., 2011; Hong et al., 2016). Through ADMM, the original weight quantization problem is decomposed into two subproblems: one is solved effectively using stochastic gradient descent, as in original DNN training, while the other is solved optimally and analytically via Euclidean projection. This method achieves state-of-the-art weight quantization results. However, the direct application of ADMM lacks a guarantee of solution feasibility due to the non-convex nature of the objective (loss) function, and there remains a margin for improvement in solution quality.
In this work, we make the following extensions to the ADMM-based weight compression framework (Zhang et al., 2018): (i) we develop an integrated framework of dynamic ADMM regularization and quantized weight projection, thereby guaranteeing solution feasibility and providing high solution quality; (ii) we incorporate a technique of updating the ADMM penalty factor multiple times during training, for faster and better convergence.
Extensive experimental results demonstrate that the proposed progressive framework consistently outperforms prior work. Some highlights: we derive the first lossless, fully binarized (all layers) LeNet-5 for MNIST, and we derive a fully binarized (all layers) VGG-16 model for CIFAR-10 and ResNet models for ImageNet with reasonable accuracy loss.
2 DNN Model Compression
In this section, we give a detailed description of how progressive ADMM achieves a good quantization result for DNNs.
2.1 Framework Design
The ADMM-based weight quantization is performed multiple times, each run serving as one step in the progressive framework. Figure 1 illustrates one step of the proposed progressive DNN weight quantization framework. The quantization result from the previous step is compared against the current quantization result, and serves as the intermediate result and starting point for the subsequent step if it is better. The motivation for a progressive model compression framework is that the multi-step procedure reduces the search space for weight quantization within each step.
Through extensive investigation, we conclude that this progressive comparison is in general sufficient for weight quantization, with each step requiring approximately the same number of training epochs. During the process, we adjust the penalty factor of ADMM to speed up convergence and achieve a better quantized result. The specific ADMM optimization process is introduced in the following subsection.
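The step-selection logic above can be sketched in a few lines. This is an illustrative skeleton under our own naming; `quantize_step`, `evaluate`, and the toy model are hypothetical stand-ins, not the paper's implementation:

```python
def progressive_quantize(model, num_steps, quantize_step, evaluate):
    """Sketch of the progressive framework: each step runs one
    ADMM-based quantization pass, and the better of the previous
    and current results seeds the subsequent step."""
    best_model, best_acc = model, evaluate(model)
    for _ in range(num_steps):
        candidate = quantize_step(best_model)  # one ADMM quantization step
        acc = evaluate(candidate)
        if acc >= best_acc:                    # keep the better intermediate result
            best_model, best_acc = candidate, acc
    return best_model, best_acc
```

In an actual run, `quantize_step` would retrain the network with ADMM regularization and `evaluate` would measure validation accuracy.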
2.2 ADMM-based Weight Quantization
ADMM (Boyd et al., 2011) is an advanced optimization technique that decomposes an original problem into subproblems that can be solved separately and iteratively. By adopting ADMM-regularized optimization, the framework can provide high solution quality with no obvious accuracy degradation.
First, the progressive DNN weight quantization starts from a pre-trained, full-size DNN model without compression. Consider an $N$-layer DNN, where the weights of the $i$-th (CONV or FC) layer are denoted by $\mathbf{W}_i$, and the loss function associated with the DNN is denoted by $f(\{\mathbf{W}_i\}_{i=1}^{N})$. In this paper, $\{\mathbf{W}_i\}_{i=1}^{N}$ characterizes the set of weights from layer $1$ to layer $N$. The overall weight quantization problem is defined by

(1)  $\min_{\{\mathbf{W}_i\}} \; f(\{\mathbf{W}_i\}_{i=1}^{N})$

subject to $\mathbf{W}_i \in \mathcal{S}_i, \quad i = 1, \ldots, N.$
For weight quantization, elements of $\mathcal{S}_i$ are the admissible quantized values. Assume $\mathcal{S}_i = \{q_{i,1}, q_{i,2}, \ldots, q_{i,M_i}\}$ is the set of available quantized values, where $M_i$ denotes the number of available quantization levels in layer $i$. Suppose $q_{i,j}$ indicates the $j$-th quantization level in layer $i$, which gives

(2)  $q_{i,j} = \alpha_i \, Q_{i,j},$

where $Q_{i,j}$ is the $j$-th level of a fixed codebook (e.g., $Q_{i,j} \in \{-1, +1\}$ for binary quantization), and $\alpha_i$ is the scaling factor, initialized as the average of the absolute weight values in layer $i$.
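For binary quantization these definitions reduce to a two-element set $\{-\alpha_i, +\alpha_i\}$ per layer, and the Euclidean projection onto it has a closed form. A small NumPy sketch under that assumption (the function names are ours, not the paper's):

```python
import numpy as np

def init_scaling_factor(W):
    """Initialize the layer's scaling factor alpha_i as the
    average magnitude of its weights (Sec. 2.2)."""
    return np.mean(np.abs(W))

def project_to_quantized(W, alpha):
    """Euclidean projection of W onto the binary set {-alpha, +alpha}:
    each weight snaps to the nearer of the two levels."""
    return np.where(W >= 0.0, alpha, -alpha)
```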
Then the original problem (1) can be equivalently rewritten as

(3)  $\min_{\{\mathbf{W}_i\}, \{\mathbf{Z}_i\}} \; f(\{\mathbf{W}_i\}_{i=1}^{N}) + \sum_{i=1}^{N} g_i(\mathbf{Z}_i)$

subject to $\mathbf{W}_i = \mathbf{Z}_i, \quad i = 1, \ldots, N,$

where $g_i(\mathbf{Z}_i) = 0$ if $\mathbf{Z}_i \in \mathcal{S}_i$ and $g_i(\mathbf{Z}_i) = +\infty$ otherwise.
We incorporate auxiliary variables $\mathbf{Z}_i$ and (scaled) dual variables $\mathbf{U}_i$, then apply ADMM to decompose problem (3) into simpler subproblems, which are solved iteratively until convergence. The augmented Lagrangian of problem (3), in scaled form and dropping a constant, is

(4)  $L_{\rho}(\{\mathbf{W}_i\}, \{\mathbf{Z}_i\}, \{\mathbf{U}_i\}) = f(\{\mathbf{W}_i\}_{i=1}^{N}) + \sum_{i=1}^{N} \frac{\rho_i}{2} \big\| \mathbf{W}_i - \mathbf{Z}_i + \mathbf{U}_i \big\|_F^2 + \sum_{i=1}^{N} g_i(\mathbf{Z}_i),$

where $\rho_i$ is the penalty factor for layer $i$. With $\{\mathbf{Z}_i\}$ and $\{\mathbf{U}_i\}$ fixed, the first term of (4) is the differentiable loss function of the DNN, and the second term is a quadratic regularization of the weights, which is differentiable and convex. As a result, this subproblem can be solved by stochastic gradient descent (Kingma & Ba, 2014), as in original DNN training.
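Concretely, the gradient of this subproblem is the usual backpropagated gradient plus a linear ADMM correction term. A minimal NumPy sketch, where `loss_grad` is a hypothetical stand-in for backpropagation and all names are our own:

```python
import numpy as np

def regularized_grad(W, Z, U, loss_grad, rho):
    """Gradient of f(W) + (rho/2) * ||W - Z + U||_F^2 with respect to W:
    the backprop gradient plus the linear ADMM correction."""
    return loss_grad(W) + rho * (W - Z + U)

def sgd_step(W, Z, U, loss_grad, rho, lr):
    """One (stochastic) gradient descent step on the regularized objective."""
    return W - lr * regularized_grad(W, Z, U, loss_grad, rho)
```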
The standard ADMM algorithm (Boyd et al., 2011) proceeds by repeating, for $k = 0, 1, \ldots$, the following subproblem iterations:

(5)  $\{\mathbf{W}_i^{k+1}\} := \arg\min_{\{\mathbf{W}_i\}} \; f(\{\mathbf{W}_i\}_{i=1}^{N}) + \sum_{i=1}^{N} \frac{\rho_i}{2} \big\| \mathbf{W}_i - \mathbf{Z}_i^{k} + \mathbf{U}_i^{k} \big\|_F^2$

(6)  $\mathbf{Z}_i^{k+1} := \Pi_{\mathcal{S}_i}\big(\mathbf{W}_i^{k+1} + \mathbf{U}_i^{k}\big)$

(7)  $\mathbf{U}_i^{k+1} := \mathbf{U}_i^{k} + \mathbf{W}_i^{k+1} - \mathbf{Z}_i^{k+1}$

where $\Pi_{\mathcal{S}_i}(\cdot)$ denotes the Euclidean projection onto $\mathcal{S}_i$, which solves step (6) optimally and analytically.
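Putting iterations (5)-(7) together for a single layer with binary levels {-alpha, +alpha}, a toy NumPy sketch looks as follows. The quadratic loss and the inexact inner gradient loop are our simplifications for illustration, not the paper's training setup:

```python
import numpy as np

def admm_quantize(W0, alpha, loss_grad, rho=1e-2, lr=0.1, steps=200, inner=10):
    """Toy ADMM iterations (5)-(7) for one layer with levels {-alpha, +alpha}.
    loss_grad(W) returns the gradient of the network loss w.r.t. W; the
    W-subproblem (5) is solved inexactly by a few gradient steps."""
    W = W0.copy()
    Z = np.where(W >= 0.0, alpha, -alpha)         # initial projection onto S
    U = np.zeros_like(W)                          # scaled dual variable
    for _ in range(steps):
        for _ in range(inner):                    # (5): gradient steps on f(W) + (rho/2)||W - Z + U||^2
            W = W - lr * (loss_grad(W) + rho * (W - Z + U))
        Z = np.where(W + U >= 0.0, alpha, -alpha) # (6): Euclidean projection of W + U onto S
        U = U + W - Z                             # (7): dual update
    return W, Z
```

At convergence, W approaches the quantized Z, i.e., the iterate becomes feasible with respect to the quantization constraint.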
3 Experimental Results
Binary Weight Quantization Results on LeNet-5: To the best of the authors' knowledge, we achieve the first lossless, fully binarized LeNet-5 model, in which the weights of all layers are binarized. The accuracy remains 99.21%, lossless compared with the baseline. By contrast, a recent work (Cheng et al., 2018) incurs 2.3% accuracy degradation on MNIST for full binarization, with a baseline accuracy of 98.66%.
Table 1: Binary weight quantization results on LeNet-5 (MNIST).

| Method | Accuracy | Num. of bits |
| Baseline (Cheng et al., 2018) | 98.66% | 32 |
| Binary (Cheng et al., 2018) | 96.34% | 1 |
| Our binary | 99.21% | 1 |
Weight Quantization on CIFAR-10: We also achieve a fully binarized VGG-16 for CIFAR-10 with negligible accuracy loss, in which the weights of all layers (including the first and the last) are binarized. The accuracy is 93.58%. We would like to point out that fully ternarized quantization results in 94.02% accuracy. Table 2 shows our results and comparisons.
Table 2: Weight quantization results on VGG-16 (CIFAR-10).

| Method | Accuracy | Num. of bits |
| Baseline (Cheng et al., 2018) | 84.80% | 32 |
| 8-bit (Cheng et al., 2018) | 84.07% | 8 |
| Binary (Cheng et al., 2018) | 81.56% | 1 |
| Our baseline | 94.70% | 32 |
| Our ternary | 94.02% | 2 |
| Our binary | 93.58% | 1 |
Binary Weight Quantization Results on ResNet for the ImageNet Dataset: The binarization of ResNet models on the ImageNet dataset is widely acknowledged as a very challenging task. As a result, there are very limited prior works (e.g., the one-shot ADMM (Leng et al., 2017)) with binarization results on ResNet models. As (Leng et al., 2017) targets ResNet-18 (which is even more challenging than ResNet-50 or larger models), we make a fair comparison on the same model. Table 3 shows the comparison results (Top-5 accuracy loss). In prior work, the first and last layers are by default left unquantized (or quantized to 8 bits), as these layers have a significant effect on overall accuracy. When leaving the first and last layers unquantized, our framework is not progressive but an extended one-shot ADMM-based framework; under this setting (first and last layers unquantized, the rest binarized), we observe higher accuracy than the prior method. The Top-1 accuracy shows a similar result: 3.8% degradation for our extended one-shot method versus 4.3% for (Leng et al., 2017).
Table 3: Binary weight quantization results on ResNet-18 (ImageNet).

| Method | Relative Top-5 acc. loss | Num. of bits |
| Uncompressed | 0.0% | 32 |
| One-shot ADMM quantization (Leng et al., 2017) | 2.9% | 1 (32 for the first and last) |
| Our method | 2.5% | 1 (32 for the first and last) |
| Our method | 5.8% | 1 |
Using the progressive framework, we can derive a fully binarized ResNet-18, in which the weights of all layers are binarized. The accuracy degradation is 5.8%, which is noticeable and shows that full binarization of ResNet remains challenging even under the progressive framework. We did not find prior work to compare against on this result.
4 Conclusion and Ongoing Work
In this work, we extended the prior ADMM-based framework into a multi-step, progressive DNN weight quantization framework, which achieves better weight quantization results and a better convergence rate.
Given the good performance of our method, we plan to test more networks on the ImageNet dataset, and we are working on applying our method to further applications and datasets.
References
 Boyd et al. (2011) Boyd, S., Parikh, N., Chu, E., Peleato, B., Eckstein, J., et al. Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends® in Machine learning, 3(1):1–122, 2011.
 Cheng et al. (2018) Cheng, H.-P., Huang, Y., Guo, X., Huang, Y., Yan, F., Li, H., and Chen, Y. Differentiable fine-grained quantization for deep neural network compression. arXiv preprint arXiv:1810.10351, 2018.
 Courbariaux et al. (2015) Courbariaux, M., Bengio, Y., and David, J.-P. BinaryConnect: Training deep neural networks with binary weights during propagations. In Advances in neural information processing systems, pp. 3123–3131, 2015.
 Han et al. (2015) Han, S., Pool, J., Tran, J., and Dally, W. Learning both weights and connections for efficient neural network. In Advances in neural information processing systems, pp. 1135–1143, 2015.
 Hong et al. (2016) Hong, M., Luo, Z.Q., and Razaviyayn, M. Convergence analysis of alternating direction method of multipliers for a family of nonconvex problems. SIAM Journal on Optimization, 26(1):337–364, 2016.
 Hubara et al. (2016) Hubara, I., Courbariaux, M., Soudry, D., ElYaniv, R., and Bengio, Y. Binarized neural networks. In Advances in neural information processing systems, pp. 4107–4115, 2016.
 Kingma & Ba (2014) Kingma, D. P. and Ba, J. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
 Krizhevsky et al. (2012) Krizhevsky, A., Sutskever, I., and Hinton, G. E. ImageNet classification with deep convolutional neural networks. In Advances in neural information processing systems, pp. 1097–1105, 2012.
 Leng et al. (2017) Leng, C., Li, H., Zhu, S., and Jin, R. Extremely low bit neural network: Squeeze the last bit out with ADMM. arXiv preprint arXiv:1707.09870, 2017.
 Lin et al. (2016) Lin, D., Talathi, S., and Annapureddy, S. Fixed point quantization of deep convolutional networks. In International Conference on Machine Learning, pp. 2849–2858, 2016.
 Park et al. (2017) Park, E., Ahn, J., and Yoo, S. Weighted-entropy-based quantization for deep neural networks. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
 Rastegari et al. (2016) Rastegari, M., Ordonez, V., Redmon, J., and Farhadi, A. XNOR-Net: ImageNet classification using binary convolutional neural networks. In European Conference on Computer Vision, pp. 525–542. Springer, 2016.
 Wen et al. (2016) Wen, W., Wu, C., Wang, Y., Chen, Y., and Li, H. Learning structured sparsity in deep neural networks. In Advances in Neural Information Processing Systems, pp. 2074–2082, 2016.
 Wu et al. (2016) Wu, J., Leng, C., Wang, Y., Hu, Q., and Cheng, J. Quantized convolutional neural networks for mobile devices. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4820–4828, 2016.
 Zhang et al. (2018) Zhang, T., Ye, S., Zhang, K., Tang, J., Wen, W., Fardad, M., and Wang, Y. A systematic DNN weight pruning framework using alternating direction method of multipliers. In European Conference on Computer Vision (ECCV), 2018.
 Zhou et al. (2017) Zhou, A., Yao, A., Guo, Y., Xu, L., and Chen, Y. Incremental network quantization: Towards lossless CNNs with low-precision weights. arXiv preprint arXiv:1702.03044, 2017.