SYMOG: learning symmetric mixture of Gaussian modes for improved fixed-point quantization

by   Lukas Enderich, et al.

Deep neural networks (DNNs) have been proven to outperform classical methods on several machine learning benchmarks. However, they have high computational complexity and require powerful processing units. Especially when deployed on embedded systems, model size and inference time must be significantly reduced. We propose SYMOG (symmetric mixture of Gaussian modes), which significantly decreases the complexity of DNNs through low-bit fixed-point quantization. SYMOG is a novel soft quantization method such that the learning task and the quantization are solved simultaneously. During training the weight distribution changes from an unimodal Gaussian distribution to a symmetric mixture of Gaussians, where each mean value belongs to a particular fixed-point mode. We evaluate our approach with different architectures (LeNet5, VGG7, VGG11, DenseNet) on common benchmark data sets (MNIST, CIFAR-10, CIFAR-100) and we compare with state-of-the-art quantization approaches. We achieve excellent results and outperform 2-bit state-of-the-art performance with an error rate of only 5.71



There are no comments yet.


page 1

page 2

page 3

page 4


Learning Multimodal Fixed-Point Weights using Gradient Descent

Due to their high computational complexity, deep neural networks are sti...

F8Net: Fixed-Point 8-bit Only Multiplication for Network Quantization

Neural network quantization is a promising compression technique to redu...

Data-Free Quantization through Weight Equalization and Bias Correction

We introduce a data-free quantization method for deep neural networks th...

Per-Tensor Fixed-Point Quantization of the Back-Propagation Algorithm

The high computational and parameter complexity of neural networks makes...

Mix and Match: A Novel FPGA-Centric Deep Neural Network Quantization Framework

Deep Neural Networks (DNNs) have achieved extraordinary performance in v...

Accelerating Neural Network Inference by Overflow Aware Quantization

The inherent heavy computation of deep neural networks prevents their wi...

FIXAR: A Fixed-Point Deep Reinforcement Learning Platform with Quantization-Aware Training and Adaptive Parallelism

In this paper, we present a deep reinforcement learning platform named F...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.