Utilizing Explainable AI for Quantization and Pruning of Deep Neural Networks

08/20/2020
by   Muhammad Sabih, et al.

For many applications, utilizing DNNs (Deep Neural Networks) requires implementing them on a target architecture in a manner optimized for energy consumption, memory requirement, throughput, etc. DNN compression is used to reduce the memory footprint and complexity of a DNN before its deployment on hardware. Recent efforts to understand and explain AI (Artificial Intelligence) methods have led to a new research area, termed explainable AI. Explainable AI methods allow us to better understand the inner workings of DNNs, such as the importance of different neurons and features. The concepts from explainable AI provide an opportunity to improve DNN compression methods such as quantization and pruning in several ways that have not been sufficiently explored so far. In this paper, we utilize explainable AI methods, chiefly the DeepLIFT method, for (1) pruning of DNNs, including structured and unstructured pruning of CNN filters as well as pruning weights of fully connected layers, (2) non-uniform quantization of DNN weights using a clustering algorithm, also referred to as weight sharing, and (3) integer-based mixed-precision quantization, where each layer of a DNN may use a different number of integer bits. We evaluate on typical image classification datasets with common deep learning image classification models. In all three cases, we demonstrate significant improvements as well as new insights and opportunities from the use of explainable AI in DNN compression.
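The three compression techniques named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation; the importance scores are simply taken as given (in the paper they would come from an explainability method such as DeepLIFT), and the k-means routine and quantizer are deliberately simplified for illustration.

```python
import numpy as np

def prune_filters_by_importance(weights, scores, keep_ratio=0.5):
    """Structured pruning: zero out the output filters with the lowest
    importance scores. `scores` (one per filter) stands in for
    explainability-derived relevance values."""
    n_filters = weights.shape[0]
    n_keep = max(1, int(round(n_filters * keep_ratio)))
    keep = np.argsort(scores)[-n_keep:]   # indices of the most important filters
    mask = np.zeros(n_filters, dtype=bool)
    mask[keep] = True
    pruned = weights.copy()
    pruned[~mask] = 0.0
    return pruned, mask

def kmeans_weight_sharing(weights, n_clusters=4, n_iters=20):
    """Non-uniform quantization (weight sharing): cluster the weights
    and replace each one by its cluster centroid. A tiny 1-D k-means,
    initialized with centroids spread over the weight range."""
    flat = weights.ravel()
    centroids = np.linspace(flat.min(), flat.max(), n_clusters)
    for _ in range(n_iters):
        assign = np.argmin(np.abs(flat[:, None] - centroids[None, :]), axis=1)
        for k in range(n_clusters):
            if np.any(assign == k):
                centroids[k] = flat[assign == k].mean()
    return centroids[assign].reshape(weights.shape)

def integer_quantize(weights, n_bits):
    """Symmetric integer quantization of one layer's weights. In a
    mixed-precision scheme, each layer would receive its own n_bits."""
    qmax = 2 ** (n_bits - 1) - 1
    scale = np.max(np.abs(weights)) / qmax
    q = np.clip(np.round(weights / scale), -qmax, qmax).astype(np.int32)
    return q, scale
```

For example, pruning an 8-filter convolution layer with `keep_ratio=0.5` keeps the 4 highest-scoring filters and zeroes the rest, while `kmeans_weight_sharing` leaves at most `n_clusters` distinct weight values in the layer.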


