
-
Distilling Optimal Neural Networks: Rapid Search in Diverse Spaces
This work presents DONNA (Distilling Optimal Neural Network Architecture...
-
Differentiable Joint Pruning and Quantization for Hardware Efficiency
We present a differentiable joint pruning and quantization (DJPQ) scheme...
-
Bayesian Bits: Unifying Quantization and Pruning
We introduce Bayesian Bits, a practical method for joint mixed precision...
-
Up or Down? Adaptive Rounding for Post-Training Quantization
When quantizing neural networks, assigning each floating-point weight to...
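To make the "up or down" question concrete, here is a minimal PyTorch-style sketch of an adaptive rounding step: instead of always rounding each weight to the nearest grid point, a continuous per-weight variable learns whether to round down or up by minimizing a layer-wise reconstruction loss. The function names, the rectified-sigmoid constants, and the regularizer weights are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn.functional as F

def rectified_sigmoid(v, zeta=1.1, gamma=-0.1):
    # Relax the binary up/down decision into a continuous value in [0, 1].
    return torch.clamp(torch.sigmoid(v) * (zeta - gamma) + gamma, 0.0, 1.0)

def adaround_quantize(w, v, scale, n_bits=8):
    # Round down, then add a learned 0/1 offset instead of rounding to nearest.
    w_floor = torch.floor(w / scale)
    w_int = w_floor + rectified_sigmoid(v)          # soft "up or down" choice
    w_int = torch.clamp(w_int, 0, 2 ** n_bits - 1)  # assuming an unsigned grid
    return w_int * scale

def reconstruction_loss(x, w, v, scale, lam=0.01, beta=2.0):
    # Layer-wise objective: match the full-precision layer output and push the
    # relaxed rounding variables toward hard 0/1 decisions.
    y_fp = x @ w.t()
    y_q = x @ adaround_quantize(w, v, scale).t()
    round_reg = (1 - (2 * rectified_sigmoid(v) - 1).abs().pow(beta)).sum()
    return F.mse_loss(y_q, y_fp) + lam * round_reg
```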
-
LSQ+: Improving low-bit quantization through learnable offsets and better initialization
Unlike ReLU, newer activation functions (like Swish, H-swish, Mish) that...
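As a rough illustration of the learnable-offset idea, the sketch below implements an asymmetric fake-quantizer whose scale and offset are trained through a straight-through estimator, so that activations with negative outputs (Swish, H-swish, Mish) remain representable. The class and parameter names are assumptions; LSQ+'s gradient scaling and initialization scheme are omitted.

```python
import torch
import torch.nn as nn

def round_ste(x):
    # Straight-through estimator: round in the forward pass,
    # identity gradient in the backward pass.
    return x + (torch.round(x) - x).detach()

class LearnableAsymmetricQuant(nn.Module):
    # Fake-quantizer with a learnable scale and offset (zero point).
    def __init__(self, n_bits=4, init_scale=0.1, init_offset=0.0):
        super().__init__()
        self.qmin, self.qmax = 0, 2 ** n_bits - 1
        self.scale = nn.Parameter(torch.tensor(init_scale))
        self.offset = nn.Parameter(torch.tensor(init_offset))

    def forward(self, x):
        # The learnable offset shifts the grid so that negative activation
        # values stay representable on a low-bit unsigned grid.
        x_int = torch.clamp(round_ste((x - self.offset) / self.scale),
                            self.qmin, self.qmax)
        return x_int * self.scale + self.offset
```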
-
Conditional Channel Gated Networks for Task-Aware Continual Learning
Convolutional Neural Networks experience catastrophic forgetting when op...
-
Learned Threshold Pruning
This paper presents a novel differentiable method for unstructured weigh...
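A minimal sketch of the underlying idea, assuming a soft, sigmoid-based gate with a learnable per-layer threshold so that the threshold itself can be trained by backpropagation; the temperature and the form of the sparsity penalty (added to the task loss with some coefficient) are illustrative choices rather than the paper's exact formulation.

```python
import torch
import torch.nn as nn

class SoftThresholdPruning(nn.Module):
    # Unstructured pruning: weights whose squared magnitude falls below a
    # learnable per-layer threshold are softly gated toward zero.
    def __init__(self, init_threshold=1e-3, temperature=1e-4):
        super().__init__()
        self.threshold = nn.Parameter(torch.tensor(init_threshold))
        self.temperature = temperature

    def soft_mask(self, w):
        return torch.sigmoid((w ** 2 - self.threshold) / self.temperature)

    def forward(self, w):
        return w * self.soft_mask(w)

    def sparsity_penalty(self, w):
        # Differentiable surrogate for the number of surviving weights.
        return self.soft_mask(w).sum()
```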
-
Gradient ℓ1 Regularization for Quantization Robustness
We analyze the effect of quantizing weights and activations of neural ne...
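The following sketch shows one way such a regularizer can be implemented in PyTorch: the ℓ1 norm of the gradient of the task loss with respect to the weights is added to the loss via double backpropagation, which (to first order) makes the loss less sensitive to small quantization perturbations of the weights. The helper name and the regularization strength are assumptions for illustration.

```python
import torch

def gradient_l1_regularized_loss(model, loss_fn, x, y, lam=1e-4):
    # Task loss plus the l1 norm of its gradient w.r.t. the weights.
    # create_graph=True enables double backprop so the regularizer itself
    # is differentiable and can be minimized by the optimizer.
    loss = loss_fn(model(x), y)
    params = [p for p in model.parameters() if p.requires_grad]
    grads = torch.autograd.grad(loss, params, create_graph=True)
    grad_l1 = sum(g.abs().sum() for g in grads)
    return loss + lam * grad_l1
```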
-
Taxonomy and Evaluation of Structured Compression of Convolutional Neural Networks
The success of deep neural networks in many real-world applications is l...
-
Batch-Shaped Channel Gated Networks
We present a method for gating deep-learning architectures on a fine-gra...
-
Data-Free Quantization through Weight Equalization and Bias Correction
We introduce a data-free quantization method for deep neural networks th...
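A minimal sketch of the cross-layer weight (range) equalization step for two consecutive fully connected layers with a ReLU in between, exploiting the fact that ReLU(x / s) * s = ReLU(x) for s > 0; bias correction and the batch-normalization handling described in the paper are omitted, and the helper name is hypothetical.

```python
import torch

@torch.no_grad()
def cross_layer_equalization(w1, b1, w2):
    # w1: [out1, in1] weights of the first layer, b1 its bias,
    # w2: [out2, out1] weights of the second layer; ReLU assumed in between.
    r1 = w1.abs().amax(dim=1)                    # per-output-channel range, layer 1
    r2 = w2.abs().amax(dim=0).clamp(min=1e-8)    # per-input-channel range, layer 2
    s = torch.sqrt(r1 * r2) / r2                 # equalizes r1 / s and r2 * s
    w1_eq = w1 / s[:, None]
    b1_eq = b1 / s
    w2_eq = w2 * s[None, :]
    return w1_eq, b1_eq, w2_eq, s
```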
-
Relaxed Quantization for Discretized Neural Networks
Neural network quantization has become an important research area due to...
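As a loose illustration of relaxing the discrete rounding step, the sketch below softly assigns each value to points on a fixed quantization grid using a Gumbel-softmax; the squared-distance logits, the unsigned grid, and the temperature are illustrative assumptions, not the paper's exact logistic-noise formulation.

```python
import torch
import torch.nn.functional as F

def relaxed_quantize(x, scale, n_levels=16, temperature=0.5):
    # Soft, stochastic assignment of each value to a fixed grid of levels.
    # Logits favor nearby grid points; Gumbel-softmax makes the (otherwise
    # hard) rounding step differentiable during training.
    grid = torch.arange(n_levels, dtype=x.dtype, device=x.device) * scale
    logits = -((x.unsqueeze(-1) - grid) ** 2) / scale
    probs = F.gumbel_softmax(logits, tau=temperature, hard=False, dim=-1)
    return (probs * grid).sum(dim=-1)
```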