Pruning CNN's with linear filter ensembles
Despite the promising results of convolutional neural networks (CNNs), deploying them on resource-limited devices remains a challenge, mainly because of their large memory and computation requirements. Pruning addresses these problems by reducing the network size and the number of floating point operations (FLOPs). Norm-based filter pruning assumes that the smaller a filter's norm, the less important the associated component. In contrast, we develop a novel filter importance norm that incorporates the loss caused by eliminating a component from the CNN. To estimate the importance of a set of architectural components, we measure the CNN's performance as different components are removed. The result is a collection of filter ensembles (filter masks) and their associated performance values. We rank the filters using a linear, additive model and remove the least important ones so that the drop in network accuracy is minimal. We evaluate our method on a fully connected network, as well as on the ResNet architecture trained on the CIFAR-10 dataset. Using our pruning method, we removed 60% of the parameters and 64% of the FLOPs from the ResNet with an accuracy drop of less than 0.6%.
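The ranking step can be illustrated with a minimal sketch. It assumes the linear, additive model is fitted by ordinary least squares, which the abstract does not specify. The function names (`estimate_filter_importance`, `filters_to_prune`) are hypothetical, and the performance values are synthetic stand-ins for accuracies measured on a masked CNN.

```python
import numpy as np


def estimate_filter_importance(masks, scores):
    """Fit scores ~ bias + masks @ w by least squares (an assumed fitting method).

    masks:  (n_ensembles, n_filters) binary array, 1 = filter kept, 0 = removed.
    scores: (n_ensembles,) performance measured with each mask applied.
    Returns w, where w[i] is the estimated additive contribution of filter i.
    """
    masks = np.asarray(masks, dtype=float)
    scores = np.asarray(scores, dtype=float)
    # Append a column of ones so the model can learn a constant offset.
    design = np.hstack([masks, np.ones((masks.shape[0], 1))])
    coef, *_ = np.linalg.lstsq(design, scores, rcond=None)
    return coef[:-1]  # drop the bias term


def filters_to_prune(importance, fraction):
    """Return the indices of the least important filters."""
    n_prune = int(len(importance) * fraction)
    return np.argsort(importance)[:n_prune]


# Synthetic stand-in for "measure performance as different components are removed".
rng = np.random.default_rng(0)
n_filters, n_ensembles = 8, 200
true_contrib = np.array([0.30, 0.01, 0.20, 0.02, 0.15, 0.00, 0.25, 0.03])
masks = rng.integers(0, 2, size=(n_ensembles, n_filters))
scores = 0.1 + masks @ true_contrib + rng.normal(0.0, 0.005, n_ensembles)

importance = estimate_filter_importance(masks, scores)
pruned = filters_to_prune(importance, fraction=0.5)
print(sorted(pruned.tolist()))
```

In a real setting, each entry of `scores` would come from evaluating the network with the corresponding mask applied to its filters, and the pruning fraction would be chosen against an accuracy budget.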