Learning Sparse Neural Networks via ℓ_0 and Tℓ_1 by a Relaxed Variable Splitting Method with Application to Multi-scale Curve Classification
We study sparsification of convolutional neural networks (CNNs) by a relaxed variable splitting method with ℓ_0 and transformed-ℓ_1 (Tℓ_1) penalties, with application to complex curves such as text written in different fonts and words written with trembling hands, simulating the handwriting of Parkinson's disease patients. The CNN contains 3 convolutional layers, each followed by a max pooling layer, and a final fully connected layer that contains the largest number of network weights. With the ℓ_0 penalty, we achieved over 99% test accuracy in distinguishing shaky from regular fonts or handwriting, with more than 86% of the weights in the fully connected layer being zero. Comparable sparsity and test accuracy are also reached with a proper choice of the Tℓ_1 penalty.
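To make the two penalties and the splitting idea concrete, here is a minimal NumPy sketch. It is not taken from the paper: it assumes the commonly used definitions Tℓ_1(w) = Σ (a+1)|w_i| / (a+|w_i|) and a relaxed objective of the form f(w) + λ‖u‖_0 + (β/2)‖w − u‖², for which the u-update is hard thresholding at √(2λ/β). The function names, the parameters `a`, `lam`, `beta`, `lr`, and the caller-supplied `grad_f` are all hypothetical choices for illustration.

```python
import numpy as np

def tl1_penalty(w, a=1.0):
    """Transformed-l1 penalty (assumed form): sum of (a+1)|w_i| / (a + |w_i|)."""
    abs_w = np.abs(w)
    return float(np.sum((a + 1.0) * abs_w / (a + abs_w)))

def l0_prox(w, lam, beta):
    """Minimize lam*||u||_0 + (beta/2)*||w - u||^2 over u (hard thresholding)."""
    threshold = np.sqrt(2.0 * lam / beta)
    # Entries at or below the threshold in magnitude are set exactly to zero.
    return np.where(np.abs(w) > threshold, w, 0.0)

def rvsm_step(w, grad_f, lam, beta, lr):
    """One alternating update (illustrative): threshold to get u, then descend on w."""
    u = l0_prox(w, lam, beta)
    # Gradient of f(w) + (beta/2)*||w - u||^2 with respect to w.
    w_new = w - lr * (grad_f(w) + beta * (w - u))
    return w_new, u

def sparsity(u):
    """Fraction of entries that are exactly zero."""
    return float(np.mean(u == 0))

if __name__ == "__main__":
    w = np.array([0.05, -0.5, 0.0, 2.0])
    # Zero gradient stands in for a real loss gradient in this toy example.
    w_new, u = rvsm_step(w, lambda x: np.zeros_like(x), lam=0.02, beta=1.0, lr=0.1)
    print("u =", u, "sparsity =", sparsity(u))
    print("Tl1(w) =", tl1_penalty(w, a=1.0))
```

In this sketch the sparse weights are read off from the auxiliary variable `u` rather than from `w`, which is what lets a percentage of exactly-zero weights be reported. In the assumed Tℓ_1 form, a small `a` makes the penalty behave like a count of nonzeros, while a large `a` makes it approach the ℓ_1 norm.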