1 Introduction
Deep neural networks have shown state-of-the-art performance in many real-world computer vision tasks, such as image classification
[18, 20], object detection [42, 31, 29], semantic segmentation [7] and pose estimation [51, 6]. However, the deployment of deep neural networks on edge devices is still considered a challenging task due to limitations on available memory, computational power and power consumption. Quantization [15] is a common approach to tackle this challenge with minimal performance loss, by reducing the bit-width of network weights and activations. Quantization methods can be roughly divided into two categories: quantization-aware training (QAT) and post-training quantization (PTQ). QAT methods [24, 25, 8, 16] retrain the network in order to recover the accuracy degradation caused by quantization and usually achieve better results than PTQ methods. PTQ methods [2, 5, 35, 13] are simpler: they quantize a given network model without any training process. These methods typically rely on a representative unlabeled dataset for selecting the quantization parameters.
Recently, several works [25, 17, 48] have focused on hardware-friendly quantization schemes, namely, schemes whose quantizers are uniform, symmetric and have power-of-two thresholds. Such quantizers reduce computational costs since they allow integer arithmetic without cross-terms due to zero-points and without floating-point scaling [25].
In this work, we introduce a hardware-friendly post-training quantization (HPTQ) method. To the best of our knowledge, current hardware-friendly quantization methods are based on quantization-aware training (QAT). This might be due to the difficulty of using power-of-two thresholds, as stated in [37]. HPTQ offers a post-training quantization flow that adapts and synergistically combines several known techniques, namely, threshold selection, shift negative correction, channel equalization, per-channel quantization and bias correction.
We extensively examine the performance of our method using 8-bit quantization. We evaluate HPTQ on different network architectures over a variety of tasks, including classification, object detection, semantic segmentation and pose estimation. Additionally, we provide an ablation study demonstrating the effect of each technique on network performance. To summarize, our contributions are:

Introducing HPTQ, a method for hardware-friendly post-training quantization.

A large-scale study of post-training quantization on a variety of tasks: classification, object detection, semantic segmentation and pose estimation.

Demonstrating that competitive results can be obtained under the hardware-friendly constraints of uniform, symmetric 8-bit quantization with power-of-two thresholds.
2 Background and Basic Notions
In this section we give a short overview of uniform quantization and the hardware-friendly constraints that will be applied in this work, namely, symmetric quantization with power-of-two thresholds.
Uniform Affine Quantization.
A quantizer $Q$ can be formalized as a right-to-left composition $Q = D \circ F$ of an integer-valued function $F$ and a recovering affine operation $D$ (known as dequantization). The discrete range of $Q$ is called a quantization grid and, if it is uniformly spaced, then $Q$ is said to be a uniform quantizer.
The constant gap $s$ between two adjacent points in the quantization grid of a uniform quantizer is called its step size and the affine shift $z$ is called the zero-point. Using these parameters, a uniform quantizer can be formalized as:

$$Q(x) = s \cdot \left(F(x) - z\right) \quad (1)$$

where $s\left(\mathbb{Z} - z\right)$ is the image of $Q$ and $F(x)$ is called the quantized integer value of $x$.
Practically, $F$ is defined by a clipping range $[a, b]$ of real values and the number of bits $n$ for representing the quantized integer values:

$$F(x) = \mathrm{clip}\left(\left\lfloor \frac{x}{s} \right\rceil + z;\; 0,\; 2^{n} - 1\right) \quad (2)$$

where $s = \frac{b - a}{2^{n} - 1}$ is the step size, and $\left\lfloor \cdot \right\rceil$ is the rounding function to the nearest integer. The zero-point is then defined as $z = -\left\lfloor \frac{a}{s} \right\rceil$ and a uniform quantizer can be formalized as:

$$Q(x) = s \cdot \left(\mathrm{clip}\left(\left\lfloor \frac{x}{s} \right\rceil + z;\; 0,\; 2^{n} - 1\right) - z\right) \quad (3)$$
Note that usually the clipping boundaries are selected so that the real value 0.0 is a point on the quantization grid.
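The affine quantizer of Eqs. 2 and 3 can be illustrated in a few lines of NumPy (a minimal sketch; the function name and array-based interface are our own, not from the paper):

```python
import numpy as np

def uniform_affine_quantize(x, a, b, n_bits):
    """Uniform affine quantizer for a clipping range [a, b] and n_bits bits."""
    s = (b - a) / (2 ** n_bits - 1)            # step size
    z = -np.round(a / s)                       # zero-point (an integer)
    x_int = np.clip(np.round(x / s) + z, 0, 2 ** n_bits - 1)
    return s * (x_int - z)                     # dequantized real value
```

With this choice of zero-point, the real value 0.0 maps exactly to itself, matching the remark above, and the quantization error of any in-range value is at most half a step size.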
Symmetric Quantization.
Symmetric quantization is a simplified case of a uniform quantizer that restricts the zero-point to $z = 0$. This eliminates the need for the zero-point shift in Eq. 1 and thus enables an efficient hardware implementation of integer arithmetic without any cross-terms [25].
The zero-point restriction to 0 requires the selection of either a signed or unsigned quantization grid. Let $t$ be a clipping threshold of the quantization range. A signed quantizer is then formalized as:

$$Q(x) = s \cdot \mathrm{clip}\left(\left\lfloor \frac{x}{s} \right\rceil;\; -2^{n-1},\; 2^{n-1} - 1\right) \quad (4)$$

where $s = \frac{t}{2^{n-1}}$ is the step size. Similarly, an unsigned quantizer is formalized as:

$$Q(x) = s \cdot \mathrm{clip}\left(\left\lfloor \frac{x}{s} \right\rceil;\; 0,\; 2^{n} - 1\right) \quad (5)$$

where $s = \frac{t}{2^{n}}$ is the step size.
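The signed and unsigned symmetric quantizers of Eqs. 4 and 5 can be sketched as follows (a minimal NumPy sketch; the function names are our own):

```python
import numpy as np

def quantize_signed(x, t, n_bits):
    """Signed symmetric quantizer (Eq. 4) with clipping threshold t."""
    s = t / 2 ** (n_bits - 1)
    return s * np.clip(np.round(x / s), -2 ** (n_bits - 1), 2 ** (n_bits - 1) - 1)

def quantize_unsigned(x, t, n_bits):
    """Unsigned symmetric quantizer (Eq. 5) with clipping threshold t."""
    s = t / 2 ** n_bits
    return s * np.clip(np.round(x / s), 0, 2 ** n_bits - 1)
```

Note that for the same threshold $t$, the unsigned grid has twice the resolution of the signed one, which is the property exploited by the shift negative correction described later.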
PowerofTwo Thresholds.
A uniform, symmetric quantizer (either signed or unsigned) with a power-of-two threshold is said to be a hardware-friendly quantizer [17]. Restricting the threshold of a symmetric quantizer to powers of two (i.e. $t = 2^M$, where $M \in \mathbb{Z}$) enables an efficient hardware implementation that uses integer arithmetic without floating-point scaling [25].
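A small numeric illustration of why this is hardware-friendly (illustrative only; the constants are our own):

```python
import math

# For an n-bit signed symmetric quantizer with a power-of-two threshold
# t = 2**m, the step size s = t / 2**(n-1) = 2**(m-n+1) is itself a power
# of two, so dequantization (q * s) is an exact exponent shift rather
# than a general floating-point multiply.
n_bits, m = 8, 3
t = 2.0 ** m
s = t / 2 ** (n_bits - 1)
q = 100                                # an arbitrary quantized integer
assert s == 2.0 ** (m - n_bits + 1)    # s = 2**-4
assert q * s == math.ldexp(q, m - n_bits + 1)   # pure exponent shift
```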
Figure 1 illustrates uniform, symmetric and hardware-friendly 4-bit quantization grids for the same range of real numbers [0.3, 4.2] to be quantized. Specifically, the figure demonstrates how the symmetry and power-of-two threshold constraints imply sub-optimal clipping ranges compared to the general uniform quantizer. These clipping ranges lead to a loss of representation bins and thus increase the potential rounding noise.
3 Method
Given a trained floating point network and a representative dataset of independent and identically distributed samples, our aim is to quantize the network post-training with hardware-friendly quantizers, namely quantizers that are uniform, symmetric and have power-of-two thresholds. Hardware Friendly Post Training Quantization (HPTQ) is a three-tier method for addressing this goal. HPTQ consists of a pre-processing stage followed by activation quantization and weight quantization (see Fig. 2). In the resulting network, activations are quantized per tensor and weights are quantized per channel.
3.1 Pre-Processing
The pre-processing stage consists of folding batch-normalization layers into their preceding convolution layers [24], collecting activation statistics using the representative dataset and finally removing outliers from the collected statistics.
Batch-Normalization Folding.
A common technique to reduce model size and computational complexity is batch-normalization folding [24] (also known as batch-normalization fusing), in which batch-normalization layers are folded into the weights of their preceding convolution layers.
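Folding can be sketched as follows for a layer whose weights have the output channel as the first axis (a hedged sketch; the function name and layout convention are assumptions, not from the paper):

```python
import numpy as np

def fold_bn(W, b, gamma, beta, mu, var, eps=1e-5):
    """Fold a batch-normalization layer (scale gamma, shift beta, running
    mean mu, running variance var) into the preceding layer's weights W
    (output channels first) and bias b."""
    scale = gamma / np.sqrt(var + eps)                   # per output channel
    W_folded = W * scale.reshape(-1, *([1] * (W.ndim - 1)))
    b_folded = (b - mu) * scale + beta
    return W_folded, b_folded
```

The folded layer computes exactly the same function as the original layer followed by batch normalization, so the folding is lossless in floating point.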
Statistics Collection.
In this stage we infer all of the samples in the representative dataset and collect activation statistics of each layer. Specifically, for each layer $l$, denote the collection of its activations over the representative dataset by $X^{(l)}$. Based on $X^{(l)}$ we collect histograms for each tensor as well as the minimum, maximum and mean values per channel. In the rest of this work we assume that activation tensors have three dimensions $H \times W \times C$, where $H$, $W$ and $C$ are the height, width and number of channels, respectively.
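This collection step can be sketched as follows (a simplified sketch; the exact histogram binning used by HPTQ is not specified here, so `n_bins` is an assumption):

```python
import numpy as np

def collect_stats(activations, n_bins=16):
    """Collect a histogram per tensor plus per-channel min/max/mean over a
    list of activation tensors shaped (H, W, C) from the representative
    dataset."""
    flat = np.concatenate([a.reshape(-1, a.shape[-1]) for a in activations])
    hist, bin_edges = np.histogram(flat, bins=n_bins)
    return {
        "hist": hist,
        "bin_edges": bin_edges,
        "min": flat.min(axis=0),      # per channel
        "max": flat.max(axis=0),
        "mean": flat.mean(axis=0),
    }
```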
Outlier Removal.
In this step we filter out outliers in the activation histograms using the z-score approach described in [1]. Specifically, we remove histogram bins whose absolute z-score value is larger than a predefined threshold. This restricts the range of each histogram to a predefined number of standard deviations from its activation mean value; see Figure 3 for an example. Note that since this step updates only the histograms, it affects only the Threshold Selection step (see Figure 2).
3.2 Activation Quantization
This stage consists of three steps: threshold selection, shift negative correction (SNC) and activation equalization. In the threshold selection step, we set power-of-two thresholds per tensor. The SNC step improves the quantization of signed activation functions with a small negative range [4]. In the activation equalization step, we equalize the expected dynamic ranges of activation channels by applying a modified version of a technique that appears in [36].
Threshold Selection.
Given a fixed bit-width $n$, our aim is to find a power-of-two threshold $t$ that minimizes the noise caused by the quantization of each layer in the network. Formally, for each layer $l$ in the network, our objective is to find a threshold that minimizes

$$\min_{t} \frac{1}{N} \sum_{i=1}^{N} \mathrm{Err}\left(Q_t\left(X_i^{(l)}\right),\; X_i^{(l)}\right) \quad (6)$$

where $N$ is the size of the representative dataset, $\left\{X_i^{(l)}\right\}_{i=1}^{N}$ is the collection of activation tensors in the $l$-th layer and $\mathrm{Err}(\cdot,\cdot)$ is some error measurement.
In an ablation study we examine the effect of several possible quantization error measurements on the actual task accuracy, including norm-based measures [38] and Kullback–Leibler (KL) divergence [34]. Our results show that Mean Square Error (MSE) [38] achieves the best performance (see Table 7). Thus, the objective of the threshold selection is to minimize

$$\min_{t} \frac{1}{N} \sum_{i=1}^{N} \left\| Q_t\left(X_i^{(l)}\right) - X_i^{(l)} \right\|_{2}^{2} \quad (7)$$
In practice, we approximate a solution to this minimization problem by estimating the noise based on the histogram of layer $l$ collected in the Statistics Collection step above. The restriction of the threshold to power-of-two values implies that the search space is discrete. Let $x_{\max}$ be the maximal absolute value of an activation in $X^{(l)}$ over the representative dataset, collected in the Statistics Collection step above, and define the no-clipping threshold:

$$t_{nc} = 2^{\left\lceil \log_2 x_{\max} \right\rceil} \quad (8)$$

Note that the clipping noise induced by the threshold $t_{nc}$ is zero and that for any power-of-two threshold larger than $t_{nc}$ the rounding noise increases. Thresholds smaller than $t_{nc}$ may reduce the rounding noise, albeit at the cost of added clipping noise. Therefore, we search for a threshold minimizing the quantization error, starting with $t_{nc}$ and iteratively decreasing it (see Algorithm 1).
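The search can be sketched as follows (a sketch of the idea behind Algorithm 1 under an MSE criterion; the fixed iteration count `n_iters` and the direct use of raw activation values instead of histograms are our simplifications):

```python
import numpy as np

def quantize_signed(x, t, n_bits=8):
    """Signed symmetric quantizer (Eq. 4) with threshold t."""
    s = t / 2 ** (n_bits - 1)
    return s * np.clip(np.round(x / s), -2 ** (n_bits - 1), 2 ** (n_bits - 1) - 1)

def select_pot_threshold(x, n_bits=8, n_iters=4):
    """Search power-of-two thresholds, starting at the no-clipping
    threshold (Eq. 8) and iteratively halving it, for the one that
    minimizes the MSE of Eq. 7."""
    t = 2.0 ** np.ceil(np.log2(np.max(np.abs(x))))   # no-clipping threshold
    best_t, best_err = t, np.mean((quantize_signed(x, t, n_bits) - x) ** 2)
    for _ in range(n_iters):
        t /= 2
        err = np.mean((quantize_signed(x, t, n_bits) - x) ** 2)
        if err < best_err:
            best_t, best_err = t, err
    return best_t
```

On a tensor whose values mostly lie well below the no-clipping threshold, the search trades a little clipping noise on the tail for a finer grid on the bulk of the distribution.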
Shift Negative Correction (SNC).
Recent works have shown benefits in using signed, non-linear activation functions, such as Swish [40], PReLU and HSwish [21]. However, a signed symmetric quantization of these functions can be inefficient due to differences between their negative and positive dynamic ranges. The main idea of SNC is to reduce the quantization noise of a signed activation function with a small negative range (relative to its positive range). This is done by adding a positive constant to the activation values (shifting them) and using an unsigned quantizer with the same threshold, which effectively doubles the quantization grid resolution. Note that shifting the values may add clipping noise on the one hand but reduces rounding noise on the other.
This step can be viewed as an adaptation to PTQ of a technique that appears in [4], where activations are shifted and scaled in order to match a given dynamic range of a quantizer. Here, we do not add scaling due to its implied added complexity. Specifically, let $f$ be the activation function of some layer in the network, let $t$ be its threshold, calculated in the Threshold Selection step above, and let $m < 0$ be its minimal (negative) activation value over the representative dataset, collected in the Statistics Collection step above. If $|m| \le \xi \cdot t$ for a hyper-parameter $\xi$, then we replace $f$ with its shifted version $f + |m|$ and replace the signed quantizer with an unsigned quantizer followed by another shift operation as follows:

$$Q_s\left(f(x)\right) \approx Q_u\left(f(x) + |m|\right) - |m| \quad (9)$$

where $Q_s$ is the signed quantizer and $Q_u$ is the unsigned quantizer, both with threshold $t$ and bit-width $n$. In practice, the last subtraction of $|m|$ is folded into the following operation in the network.
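A sketch of this correction (the hyper-parameter value `xi` and the function names are assumptions, not values from the paper):

```python
import numpy as np

def quantize_signed(x, t, n_bits=8):
    s = t / 2 ** (n_bits - 1)
    return s * np.clip(np.round(x / s), -2 ** (n_bits - 1), 2 ** (n_bits - 1) - 1)

def quantize_unsigned(x, t, n_bits=8):
    s = t / 2 ** n_bits
    return s * np.clip(np.round(x / s), 0, 2 ** n_bits - 1)

def snc_quantize(a, t, min_val, n_bits=8, xi=0.05):
    """If the negative range |min_val| is small relative to the threshold t,
    shift the activations positive, quantize with the unsigned quantizer
    (twice the grid resolution), then shift back (Eq. 9). Otherwise fall
    back to the plain signed quantizer."""
    shift = -min_val
    if shift <= xi * t:
        return quantize_unsigned(a + shift, t, n_bits) - shift
    return quantize_signed(a, t, n_bits)
```

For activations that are almost entirely positive, the shifted unsigned grid halves the step size and therefore reduces the rounding noise.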
Activation Equalization.
In this step, we equalize activation ranges per channel, similarly to the methods presented in [36, 33]. Here, we set the per-channel scale factor according to the value of the threshold that is selected per tensor. The motivation for this scaling factor is to use the maximal range of the quantization bins for each channel (see Figure 4).
The authors of [36, 33] suggest performing channel equalization by exploiting the positive scale equivariance property of activation functions. In its relaxed form it holds for any piecewise linear activation function: $f(S x) = S \hat{f}(x)$, where $f$ is a piecewise linear function, $\hat{f}$ is its modified version that satisfies this requirement and $S = \mathrm{diag}\left(s_1, \ldots, s_C\right)$ is a diagonal matrix with $s_c$ denoting the scale factor for channel $c$.
The positive scaling equivariance can be applied to the following set of consecutive layers: a linear operation, a piecewise linear function and an additional linear operation. This is demonstrated in the following equation:

$$W_2\, f\left(W_1 x + b_1\right) + b_2 = W_2 S\, \hat{f}\left(S^{-1} W_1 x + S^{-1} b_1\right) + b_2 \quad (10)$$

where $W_1$ and $b_1$ are the first layer's weights and bias, and $W_2$ and $b_2$ are the second layer's weights and bias. Although Eq. 10 demonstrates the case of fully-connected layers, it can also be extended to CNNs, where the scaling is performed per channel.
We present a use case of channel equalization named Max Channel Equalization, which can be applied in any quantization scheme. We assume that $f$ is one of the following non-linear functions: ReLU, ReLU8 or PReLU. Given the quantization threshold $t$ of a non-linear function, as well as the maximal activation value $\max_c$ of each channel $c$ of the layer's activation tensor $X^{(l)}$, we set:

$$s_c = \frac{t}{\max_c} \quad (11)$$

so that the maximal value of each channel in the tensor will be the threshold value (see Figure 4).
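Max Channel Equalization can be sketched for a pair of fully-connected layers as follows (a sketch; per Eq. 10, the inverse scale is folded into the next layer's weights, and a ReLU-like positively homogeneous non-linearity is assumed):

```python
import numpy as np

def max_channel_equalization(W1, b1, W2, t, channel_max):
    """Scale each channel so its maximal activation reaches the threshold t
    (Eq. 11), folding the inverse scale into the next layer."""
    s = t / channel_max                 # per-channel scale factors (Eq. 11)
    W1_eq = W1 * s[:, None]             # scale rows = output channels
    b1_eq = b1 * s
    W2_eq = W2 / s[None, :]             # undo the scale in the next layer
    return W1_eq, b1_eq, W2_eq
```

Because the scale factors are positive and the non-linearity is positively homogeneous, the equalized pair of layers computes the same floating-point function as the original pair.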
3.3 Weight Quantization
In the Weight Quantization stage we quantize the network's weights. It was shown in [26, 41] that weight quantization with per-channel scaling improves accuracy. Moreover, an efficient dot product and convolution implementation supporting per-channel quantization is presented in these works. Our Weight Quantization stage consists of per-channel threshold selection and bias correction [36].
Threshold Selection.
As noted above, weight quantization is performed per channel. Its thresholds are selected similarly to activation thresholds (see Algorithm 1). However, a key difference is that here the search is performed directly on the weight values, as opposed to the statistical values that are used for activations. More precisely, given the weights $w$ of some channel in the network, the initial no-clipping threshold is

$$t_{nc} = 2^{\left\lceil \log_2 \max_i \left| w_i \right| \right\rceil} \quad (12)$$

where $w_i$ are the entries of $w$. Additionally, the error induced by a threshold $t$ is

$$\mathrm{Err}(t) = \left\| Q_t(w) - w \right\|_{2}^{2} \quad (13)$$
Note that as with activations, MSE is selected as an error measurement since it yields the best performance (see Table 10).
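The per-channel search can be sketched as follows (a sketch mirroring Algorithm 1 with the error of Eq. 13 averaged per entry; the fixed iteration count `n_iters` is an assumption):

```python
import numpy as np

def quantize_signed(x, t, n_bits=8):
    """Signed symmetric quantizer (Eq. 4) with threshold t."""
    s = t / 2 ** (n_bits - 1)
    return s * np.clip(np.round(x / s), -2 ** (n_bits - 1), 2 ** (n_bits - 1) - 1)

def per_channel_thresholds(W, n_bits=8, n_iters=4):
    """MSE-minimizing power-of-two threshold per output channel, searched
    directly on the weight values (W shaped (channels, ...))."""
    thresholds = []
    for w in W:
        t = 2.0 ** np.ceil(np.log2(np.max(np.abs(w))))   # Eq. 12
        best_t, best_err = t, np.mean((quantize_signed(w, t, n_bits) - w) ** 2)
        for _ in range(n_iters):
            t /= 2
            err = np.mean((quantize_signed(w, t, n_bits) - w) ** 2)
            if err < best_err:
                best_t, best_err = t, err
        thresholds.append(best_t)
    return np.array(thresholds)
```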
Bias Correction.
Quantization of weights induces bias shifts in activation means that may lead to detrimental behaviour in the following layers [36, 14]. Explicitly, let $y = W x + b$ be the floating point output of a fully connected layer, where $x$, $W$ and $b$ are the floating-point input activation, weight and bias, respectively. Denote the quantized weights of the layer by $\widetilde{W}$ and the corresponding output by $\widetilde{y} = \widetilde{W} x + b$. The induced bias shift can be expressed as follows:

$$\mathbb{E}[y] - \mathbb{E}[\widetilde{y}] = \left(W - \widetilde{W}\right) \mathbb{E}[x] \quad (14)$$
Several works propose approaches to correct the quantization induced bias. These include using batchnormalization statistics [36], micro training [14] and applying scale and shift per channel [3].
We adopt the solution of [36], in which the bias shift is corrected by modifying the layer's bias vector:

$$\widetilde{b} = b - \left(\widetilde{W} - W\right) \mu \quad (15)$$

where $\mu$ is the per-channel empirical mean obtained in the Statistics Collection stage above. Note that although the above is written for a fully connected layer, it applies to convolutional layers as well, as shown in [36].
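The correction of Eq. 15 is a one-liner; with it, the mean output of the quantized layer matches the floating-point mean exactly (a sketch for the fully-connected case):

```python
import numpy as np

def bias_correction(b, W_float, W_quant, mu):
    """Correct the bias for the mean shift induced by weight quantization
    (Eq. 15): b_corrected = b - (W_quant - W_float) @ mu, where mu is the
    per-channel empirical mean of the input activations."""
    return b - (W_quant - W_float) @ mu
```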
4 Experimental Results
In this section we evaluate the performance of HPTQ with 8-bit quantization over different tasks and a variety of network architectures. The experiments are divided into two parts. The first part presents an overall performance comparison to the floating point baselines as well as to state-of-the-art quantization approaches. The second part presents an ablation study that analyzes the influence of each technique in HPTQ separately.
4.1 Overall Performance Evaluation
We evaluate the performance of HPTQ on four different tasks: image classification, object detection, semantic segmentation and pose estimation. For each task, we present a comparison between the performance of models quantized by HPTQ and their floating point baselines. Furthermore, for classification and segmentation we provide a comprehensive performance comparison of HPTQ with both PTQ and QAT state-of-the-art quantization methods.
We use the same set of hyper-parameters for all our experiments. Specifically, the number of image samples in the representative dataset is 500. The z-score threshold in the outlier removal step is . The SNC threshold is . Last, for both activations and weights, the number of iterations performed in Algorithm 1 in the threshold selection search is set to . One should note that fine-tuning the hyper-parameters per network may lead to further improvement. In all of the tables below, Δ is the difference between the performance of the floating point model and the quantized model, PC indicates the use of per-channel weight quantization and PoT indicates power-of-two thresholds.
Classification.
We evaluate HPTQ on the ImageNet classification task [10] using the MobileNetV1 [20], MobileNetV2 [43] and ResNet50 [18] architectures (https://www.tensorflow.org/api_docs/python/tf/keras/applications). Tables 1, 2 and 3 present comparisons of HPTQ with other quantization methods, both PTQ and QAT, for the three architectures. The results show that HPTQ achieves competitive performance despite the hardware-friendly constraints. In the tables below, FAcc is the floating point accuracy and QAcc is the accuracy of the quantized model.
Type  Method  PC  PoT  FAcc  QAcc  Δ

QAT 
QT [24]  ✗  ✗  70.9  70.0  0.9 
TQT [25]  ✗  ✓  71.1  71.1  0.0  
PTQ 
SSBD [33]  ✗  ✗  70.9  69.95  0.95 
Krishnamoorthi [26]  ✓  ✗  70.9  70.3  0.6  
Wu et al [49]  ✓  ✗  71.88  70.39  1.49  
Lee et al [27]  ✗  ✗  69.5  68.84  0.66  
HPTQ (Ours)  ✓  ✓  70.55  70.41  0.14
Type  Method  PC  PoT  FAcc  QAcc  Δ
QAT 
QT [24]  ✗  ✗  71.9  70.9  1.0 
RVQuant [39]  ✗  ✗  70.10  70.29  0.19  
TQT [25]  ✗  ✓  71.7  71.8  0.10  
PTQ 
AdaQuant [23]  ✗  ✗  73.03  73.03  0.0 
ZeroQ [5]  ✗  ✗  73.03  72.91  0.12  
SSBD [33]  ✗  ✗  71.9  71.29  0.61  
Wu et al [49]  ✓  ✗  71.88  71.14  0.74  
Krishnamoorthi [26]  ✓  ✗  71.9  69.7  2.2  
Nagel et al [37]  ✗  ✗  71.72  70.99  0.73  
✓  ✗  71.16  0.56  
DFQ [36]  ✗  ✗  71.72  70.92  0.8  
Lee et al [27]  ✗  ✗  71.23  69.5  1.73  
HPTQ (Ours)  ✓  ✓  71.812  71.46  0.352
Type  Method  PC  PoT  FAcc  QAcc  Δ
QAT 
QT [24]  ✗  ✗  76.4  74.9  1.5 
RVQuant [39]  ✗  ✗  75.92  75.67  0.25  
HAWQV3 [50]  ✓  ✗  77.72  77.58  0.14  
LSQ [11]  ✗  ✗  76.9  76.8  0.1  
TQT [25]  ✗  ✓  76.9  76.5  0.4  
FAQ [32]  ✗  ✗  75.4  75.4  0.0  
PTQ 
ZeroQ [5]  ✗  ✗  77.72  77.67  0.05 
OCS [52]  ✗  ✗  76.1  75.9  0.2  
SSBD [33]  ✗  ✗  75.2  74.95  0.25  
He et al [19]  ✗  ✗  75.3  75.03  0.27  
Wu et al [49]  ✓  ✗  76.16  76.05  0.11  
Nagel et al [37]  ✗  ✗  76.07  75.87  0.2  
✓  ✗  75.88  0.19  
Krishnamoorthi [26]  ✗  ✗  75.2  75.00  0.20  
✓  ✗  75.1  0.1  
HPTQ (Ours)  ✓  ✓  75.106  75.018  0.088
Semantic Segmentation.
We evaluate HPTQ on Pascal VOC [12] using DeepLab V3 [7] (https://github.com/tensorflow/models/blob/master/research/deeplab/g3doc/model_zoo.md) with MobileNetV2 [43] as a backbone. Table 4 shows that HPTQ achieves competitive results compared to other PTQ methods.
Type  Method  PC  PoT  FmIoU  QmIoU  Δ

PTQ 
DFQ [36]  ✗  ✗  72.45  72.33  0.12 
Nagel et al [37]  ✗  ✗  72.94  72.44  0.50  
✓  ✗  72.27  0.67  
HPTQ (Ours)  ✓  ✓  75.57  75.38  0.19
Object Detection.
We evaluate HPTQ on COCO [30] using the SSD detector [31] with several backbones (https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2_detection_zoo.md). HPTQ achieves Mean Average Precision (mAP) similar to the floating point baseline, as demonstrated in Table 5.
Model  FmAP  QmAP

SSD MobileNetV2 [43] 320x320  20.2  20.21 
SSD MobileNetV2 [43] FPN Lite 320x320  22.2  21.93 
SSD ResNet50 [18] V1 FPN 640x640  34.3  34.3 
Pose Estimation.
We evaluate HPTQ on the single-person pose estimation task using the LPN network [51] on the LIP (Look Into Person) dataset [28]. We use the PCKh metric [28] for evaluation, which is the head-normalized probability of correct keypoints. HPTQ achieves performance similar to the floating point baseline, with only a slight degradation from 81.65 to 81.53 PCKh.
4.2 Ablation Study
We provide an ablation study of HPTQ's performance on the ImageNet classification task [10] using eleven networks (https://www.tensorflow.org/api_docs/python/tf/keras/applications). The study is divided into two parts, analyzing activation quantization and weight quantization.
Table 6 compares the performance of HPTQ in four cases: full floating point, activation quantization only, weight quantization only and joint quantization of both. The comparison shows that activation quantization causes a larger degradation in performance than weight quantization, especially for EfficientNet with Swish activation functions. This might be due to the fact that activation equalization is not applied to these activations.
Network  FAcc  QAcc (activations only)  QAcc (weights only)  QAcc (both)
MobileNetV1 [20]  70.558  70.48  70.394  70.418  
MobileNetV2 [43]  71.812  71.616  71.668  71.46  
NasnetMobile [53]  74.376  74.068  74.352  73.888  
VGG16 [44]  70.956  70.834  70.946  70.81  
InceptionV3 [46]  77.908  77.872  77.844  77.85  
InceptionResNetV2 [45]  80.284  80.154  80.32  80.14  
ResNet50 [18]  75.106  75.072  75.06  75.018  
EfficientNetB0 [47]  77.2  74.3  77.012  74.216  
EfficientNetB0 ReLU  77.65  77.1  77.568  77.092  
DenseNet121 [22]  74.848  73.252  74.784  73.356  
Xception [9]  79.05  79.048  79.062  78.972 
Activation Quantization Analysis.
In this analysis we evaluate the influence of the different methods used for quantizing the activations (without quantizing the weights). The analysis is performed with eleven different network architectures on the ImageNet classification task [10] (EfficientNetB0 ReLU is a trained version of EfficientNetB0 with the ReLU activation function instead of Swish; models are taken from https://keras.io/api/applications/). Table 7 shows an accuracy comparison using four different threshold selection methods, without applying any of the other activation quantization steps. NC indicates using the no-clipping threshold. Mean Square Error (MSE), Mean Absolute Error (MAE) and Kullback–Leibler (KL) divergence are three different error measurements in Equation 6.
Network  NC  MSE  MAE  KL 

MobileNetV1 [20]  70.406  70.434  60.218  70.418 
MobileNetV2 [43]  71.25  71.458  65.918  71.482 
VGG16 [44]  70.8  70.764  58.37  65.096 
ResNet50 [18]  74.612  74.996  67.896  59.556 
Table 8 shows the incremental accuracy influence, on ImageNet classification [10], of the methods used by HPTQ for activation quantization (without quantizing weights). Note that SNC is applied in all of the experiments in the table and its influence is studied separately below. The table shows that all of the methods result in an improvement. Note that fine-tuning the z-score threshold per network may lead to further improvement.
Network Name  Baseline  +Eq.  +MSE Th.  +z-score

MobileNetV1 [20]  70.406  70.418  70.48  70.48 
MobileNetV2 [43]  71.25  71.34  71.528  71.616 
NasnetMobile [53]  18.572  18.484  73.486  74.068 
VGG16 [44]  70.8  70.696  70.888  70.834 
InceptionV3 [46]  77.658  77.646  77.832  77.872 
InceptionResNetV2 [45]  49.132  49.238  80.014  80.154 
ResNet50 [18]  74.612  74.654  75.086  75.072 
EfficientNetB0 [47]  13.562  13.736  74.096  74.3 
EfficientNetB0 ReLU  74.298  76.298  76.956  77.1 
DenseNet121 [22]  56.08  55.916  73.28  73.252 
Xception [9]  48.718  48.784  78.87  79.048 
Table 9 shows the accuracy improvement achieved by applying Shift Negative Correction (SNC). Specifically, the table compares the performance of several versions of MobileNetV1, each with a different non-linear function, under the full activation quantization flow.
Weight Quantization Analysis.
In this analysis we evaluate the influence of the different methods used for quantizing weights (without quantizing activations). The analysis is performed with eleven different network architectures on the ImageNet classification task [10] (EfficientNetB0 ReLU is a trained version of EfficientNetB0 with the ReLU activation function instead of Swish; models are taken from https://keras.io/api/applications/).
Table 10 shows an accuracy comparison of each quantized network using four different threshold selection methods (without applying bias correction). NC indicates using the no-clipping threshold. Mean Square Error (MSE), Mean Absolute Error (MAE) and Kullback–Leibler (KL) divergence are three different error measurements in Equation 6. Similarly to the results for activation quantization in Table 7, the MSE error measurement achieves the best results.
Network  NC  MSE  MAE  KL 

MobileNetV1 [20]  68.75  68.756  64.242  64.968 
MobileNetV2 [43]  69.562  69.758  67.57  62.394 
NasnetMobile [53]  74.188  74.232  72.79  73.358 
VGG16 [44]  70.944  70.94  67.486  70.472 
InceptionV3 [46]  77.768  77.82  70.91  74.28 
InceptionResNetV2 [45]  80.244  80.276  78.676  77.112 
ResNet50 [18]  75.068  75.11  72.352  73.418 
EfficientNetB0 [47]  76.822  76.822  75.86  75.554 
EfficientNetB0 ReLU  77.078  77.218  76.916  76.674 
DenseNet121 [22]  74.734  74.736  72.102  60.17 
Xception [9]  79.006  79.006  77.47  75.374 
Table 11 shows the incremental accuracy influence of the two methods (per-channel quantization and bias correction) used in HPTQ for weight quantization (without quantizing activations) on the ImageNet classification task [10]. The table shows that both methods result in an improvement.
Network  Baseline  Per ch.  +Bias corr. 

MobileNetV1 [20]  0.966  68.756  70.394 
MobileNetV2 [43]  0.398  69.758  71.668 
NasnetMobile [53]  73.494  74.232  74.352 
VGG16 [44]  70.814  70.94  70.946 
InceptionV3 [46]  76.42  77.82  77.844 
InceptionResNetV2 [45]  80.066  80.276  80.32 
ResNet50 [18]  74.718  75.11  75.06 
EfficientNetB0 [47]  2.524  76.822  77.012 
EfficientNetB0 ReLU  0.682  77.218  77.568 
DenseNet121 [22]  72.986  74.736  74.784 
Xception [9]  78.786  79.006  79.062 
5 Conclusions
In this work we propose HPTQ, a method for hardware-friendly post-training quantization. HPTQ offers a flow that adapts and synergistically combines several known quantization techniques for both weights and activations. We extensively evaluated the performance of HPTQ on four tasks: classification, object detection, semantic segmentation and pose estimation. Notably, for all of the tasks we demonstrated that competitive results can be obtained under our hardware-friendly constraints of uniform, symmetric quantization with power-of-two thresholds. In addition, we performed an ablation study in which we presented the contribution of each of the methods used by HPTQ.
References
 [1] (2015) Outlier analysis. In Data mining, pp. 237–263. Cited by: §3.1.
 [2] (2018) Post-training 4-bit quantization of convolution networks for rapid-deployment. arXiv preprint arXiv:1810.05723. Cited by: §1.
 [3] (2019) Post training 4-bit quantization of convolutional networks for rapid-deployment. In Advances in Neural Information Processing Systems, Vol. 32, pp. 7950–7958. External Links: Link Cited by: §3.3.
 [4] (2020) LSQ+: improving low-bit quantization through learnable offsets and better initialization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 696–697. Cited by: §3.2, §3.2.
 [5] (2020) ZeroQ: a novel zero shot quantization framework. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13169–13178. Cited by: §1, Table 2, Table 3.
 [6] (2019) OpenPose: realtime multi-person 2D pose estimation using part affinity fields. IEEE transactions on pattern analysis and machine intelligence 43 (1), pp. 172–186. Cited by: §1.
 [7] (2017) Rethinking atrous convolution for semantic image segmentation. arXiv preprint arXiv:1706.05587. Cited by: §1, §4.1.
 [8] (2018) Pact: parameterized clipping activation for quantized neural networks. arXiv preprint arXiv:1805.06085. Cited by: §1.
 [9] (2017) Xception: deep learning with depthwise separable convolutions. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 1251–1258. Cited by: Table 10, Table 11, Table 6, Table 8.
 [10] (2009) ImageNet: a large-scale hierarchical image database. In 2009 IEEE conference on computer vision and pattern recognition, pp. 248–255. Cited by: §4.1, §4.2, §4.2, §4.2, §4.2, §4.2, Table 1, Table 10, Table 11, Table 2, Table 3, Table 6, Table 7, Table 8, Table 9.
 [11] (2019) Learned step size quantization. arXiv preprint arXiv:1902.08153. Cited by: Table 3.
 [12] (2010) The pascal visual object classes (voc) challenge. International journal of computer vision 88 (2), pp. 303–338. Cited by: §4.1, Table 4.
 [13] (2020) Post-training piecewise linear quantization for deep neural networks. In European Conference on Computer Vision, pp. 69–86. Cited by: §1.
 [14] (2019) Fighting quantization bias with bias. arXiv preprint arXiv:1906.03193. Cited by: §3.3.
 [15] (2021) A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630. Cited by: §1.
 [16] (2019) Differentiable soft quantization: bridging full-precision and low-bit neural networks. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 4852–4861. Cited by: §1.
 [17] (2020) HMQ: hardware friendly mixed precision quantization block for CNNs. In Computer Vision – ECCV 2020, A. Vedaldi, H. Bischof, T. Brox, and J. Frahm (Eds.), Cham, pp. 448–463. External Links: ISBN 978-3-030-58574-7. Cited by: §1, §2.
 [18] (2016) Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 770–778. Cited by: §1, §4.1, Table 10, Table 11, Table 3, Table 5, Table 6, Table 7, Table 8.
 [19] (2018) Learning compression from limited unlabeled data. In Proceedings of the European Conference on Computer Vision (ECCV), pp. 752–769. Cited by: Table 3.
 [20] (2017) MobileNets: efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861. Cited by: §1, §4.1, Table 1, Table 10, Table 11, Table 6, Table 7, Table 8, Table 9.
 [21] (2019) Searching for MobileNetV3. In Proceedings of the IEEE International Conference on Computer Vision, pp. 1314–1324. Cited by: §3.2.
 [22] (2017) Densely connected convolutional networks. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 4700–4708. Cited by: Table 10, Table 11, Table 6, Table 8.
 [23] (2020) Improving post training neural quantization: layerwise calibration and integer programming. arXiv preprint arXiv:2006.10518. Cited by: Table 2.
 [24] (2018) Quantization and training of neural networks for efficient integer-arithmetic-only inference. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2704–2713. Cited by: §1, §3.1, §3.1, Table 1, Table 2, Table 3.
 [25] (2019) Trained quantization thresholds for accurate and efficient fixed-point inference of deep neural networks. arXiv preprint arXiv:1903.08066. Cited by: §1, §1, §2, §2, Table 1, Table 2, Table 3.
 [26] (2018) Quantizing deep convolutional networks for efficient inference: a whitepaper. arXiv preprint arXiv:1806.08342. Cited by: §3.3, Table 1, Table 2, Table 3.
 [27] (2018) Quantization for rapid deployment of deep neural networks. arXiv preprint arXiv:1810.05488. Cited by: Table 1, Table 2.
 [28] (2018) Look into person: joint body parsing & pose estimation network and a new benchmark. IEEE transactions on pattern analysis and machine intelligence 41 (4), pp. 871–885. Cited by: §4.1.
 [29] (2017) Feature pyramid networks for object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 2117–2125. Cited by: §1.
 [30] (2014) Microsoft coco: common objects in context. In European conference on computer vision, pp. 740–755. Cited by: §4.1, Table 5.
 [31] (2016) Ssd: single shot multibox detector. In European conference on computer vision, pp. 21–37. Cited by: §1, §4.1.
 [32] (2019) Discovering low-precision networks close to full-precision networks for efficient inference. In 2019 Fifth Workshop on Energy Efficient Machine Learning and Cognitive Computing – NeurIPS Edition (EMC2-NIPS), pp. 6–9. Cited by: Table 3.
 [33] (2019) Same, same but different – recovering neural network quantization error through weight factorization. arXiv preprint arXiv:1902.01917. Cited by: §3.2, §3.2, Table 1, Table 2, Table 3.
 [34] (2017) 8-bit inference with TensorRT. External Links: Link Cited by: §3.2.
 [35] (2020) Up or down? adaptive rounding for post-training quantization. In International Conference on Machine Learning, pp. 7197–7206. Cited by: §1.
 [36] (2019) Data-free quantization through weight equalization and bias correction. In Proceedings of the IEEE International Conference on Computer Vision, pp. 1325–1334. Cited by: §3.2, §3.2, §3.2, §3.3, §3.3, §3.3, Table 2, Table 4.
 [37] (2021) A white paper on neural network quantization. arXiv preprint arXiv:2106.08295. Cited by: §1, Table 2, Table 3, Table 4.
 [38] (2019) Loss aware posttraining quantization. arXiv preprint arXiv:1911.07190. Cited by: §3.2.
 [39] (2018) Valueaware quantization for training and inference of neural networks. In Proceedings of the European Conference on Computer Vision (ECCV), pp. 580–595. Cited by: Table 2, Table 3.
 [40] (2017) Searching for activation functions. arXiv preprint arXiv:1710.05941. Cited by: §3.2.
 [41] (2016) XNOR-Net: ImageNet classification using binary convolutional neural networks. In European conference on computer vision, pp. 525–542. Cited by: §3.3.
 [42] (2015) Faster R-CNN: towards real-time object detection with region proposal networks. arXiv preprint arXiv:1506.01497. Cited by: §1.
 [43] (2018) Mobilenetv2: inverted residuals and linear bottlenecks. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 4510–4520. Cited by: Figure 4, §4.1, §4.1, Table 10, Table 11, Table 2, Table 4, Table 5, Table 6, Table 7, Table 8.
 [44] (2014) Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556. Cited by: Table 10, Table 11, Table 6, Table 7, Table 8.
 [45] (2017) Inception-v4, Inception-ResNet and the impact of residual connections on learning. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 31. Cited by: Table 10, Table 11, Table 6, Table 8.
 [46] (2016) Rethinking the inception architecture for computer vision. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 2818–2826. Cited by: Table 10, Table 11, Table 6, Table 8.
 [47] (2019) Efficientnet: rethinking model scaling for convolutional neural networks. In International Conference on Machine Learning, pp. 6105–6114. Cited by: Table 10, Table 11, Table 6, Table 8.
 [48] (2019) Mixed precision dnns: all you need is a good parametrization. arXiv preprint arXiv:1905.11452. Cited by: §1.
 [49] (2020) Integer quantization for deep learning inference: principles and empirical evaluation. arXiv preprint arXiv:2004.09602. Cited by: Table 1, Table 2, Table 3.
 [50] (2021) HAWQv3: dyadic neural network quantization. In International Conference on Machine Learning, pp. 11875–11886. Cited by: Table 3.
 [51] (2019) Simple and lightweight human pose estimation. arXiv preprint arXiv:1911.10346. Cited by: §1, §4.1.
 [52] (2019) Improving neural network quantization without retraining using outlier channel splitting. In International conference on machine learning, pp. 7543–7552. Cited by: Table 3.
 [53] (2018) Learning transferable architectures for scalable image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 8697–8710. Cited by: Table 10, Table 11, Table 6, Table 8.