I Introduction
Deep Learning Networks (DLNs) have exhibited better-than-human performance in several vision-related tasks [1]. However, they have recently been shown to be vulnerable to adversarial attacks [2, 3, 4]: slight changes of input pixel intensities that fool a DLN into misclassifying an input with high confidence (Fig. 1). What is more worrying is that such small changes (that craft adversaries) are visually imperceptible to humans, yet mislead a DLN. This vulnerability severely limits the potential safe use and deployment of DLNs in real-world scenarios. For instance, an attacker can fool a DLN deployed on a self-driving car into mispredicting a STOP sign as a GO signal and cause fatal accidents.
Subsequently, there have been several theories pertaining to the adversarial susceptibility of DLNs [3]. The most common one suggests that the presence of adversaries is an outcome of the excessive linearity of a DLN (a property of high-dimensional dot-products). While one can argue that ReLU-type activations impose nonlinearity on a model, linear operations such as convolution and pooling outnumber the nonlinear ReLU operations. Further, ReLU is linear in the positive-input regime, and hence leaves a DLN largely linear. This linearity causes a model to extrapolate its behavior to points in the hyperspace (of data and model parameters) that lie outside the training/test data manifold. Adversarial inputs, essentially, are images synthesized such that they 'lie far' from the typical data manifold and hence get misclassified. Fig. 2 (a) demonstrates this data-manifold intuition and adversarial input creation with a cartoon. Since DLNs are discriminative models, they partition a very high-dimensional input space into different classes by learning appropriate decision boundaries. The class-specific decision boundaries simply divide the space into hypervolumes. Interestingly, these hypervolumes encompass the training data examples as well as large areas of unpopulated space that are arbitrary and untrained. This extrapolation of decision boundaries beyond the training data space is a result of 'linearity', which in turn gives rise to generalization ability. The fact that a model trained only on training data is able to predict well on unseen test data (termed generalization) is a favorable outcome of this extrapolation property. Unfortunately, this property also exposes a model to adversarial attacks. Adversarial data are created by simply adding small perturbations to an input data point that shift it from its manifold (or hypervolume) to a different hypervolume (which the model has not been trained upon and where it shows extrapolated behavior), causing misclassification.
From the above intuition, one can deduce that adding regularization features to a DLN's training will improve its generalization ability and, in turn, its adversarial robustness. In fact, the most effective form of adversarial defense so far is training a model with adversarial data augmentation (called adversarial training) [5]. It is evident that explicit training on adversarial data will increase the model's capability to generalize and hence predict correctly on unseen adversarial data. However, the above discussion on excessive linearity and hyperspace dimensionality points to an alternate and unexplored regularization possibility: discretizing or constraining the data manifold to achieve adversarial robustness. For instance, discretizing the input data (say, from 256 pixel-intensity levels (8-bit) to 4 levels (2-bit)) reduces the regions into which data can be perturbed. In other words, the minimum perturbation required to shift a particular data point from one hypervolume to another will increase in a discretized space (Fig. 2 (b)). This in turn will intrinsically improve the resistance of a DLN. Similarly, discretizing the parameter space (as in binarized neural networks (BNNs) [6]) will introduce discontinuities and quantization in the manifold (which is nonlinear by nature). This will further decrease the extent of hypervolume space that is arbitrary/untrained and thus reduce adversarial susceptibility (Fig. 2 (b)). It is evident that such discretization methods have the added advantage of computational efficiency. In fact, low-precision neural networks (BNNs and the related XNOR-Nets [7]) were introduced with the key motive of reducing memory and power consumption for hardware deployment of DLNs. In this paper, we demonstrate that discretization, besides offering obvious efficiency improvements, has far-reaching implications for a model's adversarial resistance. We particularly emphasize three different discretization themes and illustrate their suitability for improving a DLN's adversarial robustness, as follows:

Discretization of input space: We reduce the input dimensionality by quantizing the 8-bit RGB pixel intensities to lower bit-widths. We show that, for minimal loss in clean accuracy, the adversarial robustness of a model substantially improves (small accuracy difference between clean test and adversarial test data), even without any adversarial training. Furthermore, we show that combining adversarial training with 2-bit input discretization makes a model substantially more robust (than adversarial training with full 8-bit input precision) for large perturbation ranges.

Discretization of parameter space:
We show that models trained with low-precision weights and activations, such as BNNs, are intrinsically more robust to adversarial perturbations than full-precision networks. Furthermore, we find that training BNNs with adversarial data augmentation is difficult. However, increasing the capacity of the BNN (with more neurons and weights) minimizes the adversarial training difficulty. For sufficient model capacity, adversarially trained BNNs yield higher adversarial robustness than their full-precision counterparts.

Discretization of both input & parameter space: We demonstrate that combining input discretization with binarized weight/activation training greatly improves a model's robustness. In fact, training a BNN with input discretization (say, 2-bit input) yields similar or better adversarial accuracy than an adversarially trained full-precision model. Thus, the combined discretization scheme can be seen as an efficient alternative for achieving adversarial robustness without the expensive data augmentation procedure.
II Related Work
Based on the intuition demonstrated in Fig. 2, the robustness of DLNs can be attributed to two factors: a property of the input and a property of the model. Consequently, there have been many recent proposals [8, 9, 10, 11] that exploit the input-dependent factor and try to remove adversarial perturbations by applying input preprocessing or transformations. Due to the simplicity of this approach, these methods are attractive for practical implementations as they do not incur large computational overhead (as with adversarial training) and do not interfere with the learning process. Our paper complements the results of the prior works while presenting a novel result on the effectiveness of combined parameter and input discretization for adversarial robustness.
One of the first works on input discretization, by Xu et al. [11], proposes a color-depth squeezing technique wherein the degrees of freedom available to an adversary are reduced by 'removing' unnecessary features. Our pixel discretization scheme is based on their color-depth reduction technique. However, the key idea in [11] is to compare the model's prediction on the original input with its prediction on the squeezed input during testing. Xu et al. train the model with regular inputs and, during inference, use pixel discretization to detect adversarial inputs. That is, if the original and squeezed inputs produce predictions with a large difference (greater than a user-defined threshold), the model deems the input adversarial and rejects it. Ultimately, the model outputs predictions only for legitimate or non-adversarial inputs. In contrast, the key novel aspect of our work is to train a model with discretized pixel data such that the model sees a reduced input subspace during training, which decreases or constrains its ability to overly generalize in the high-dimensional subspace. Similarly, the thermometer encoding technique and input transformation technique proposed in [9], [8] are guided by the same intuition of reducing adversarial subspace dimensions. Guo et al. [8] trained networks with images transformed in various ways and observed improved adversarial resistance. However, they measured robustness in controlled gray-box attack settings (where the model parameters are known to the attacker but the input transformations are unknown). Our paper's results on white-box attacks reflect a stronger notion of robustness, as we assume all parameters as well as the input discretization are known to the attacker. Thus, our results, while supporting the claims of [8], are more substantial and generalizable. In Buckman et al. [9], the authors propose a thermometer encoding technique that maps input pixels to a binary vector in order to make pixel discretization a more meaningful change without losing information from the original image. While the authors show good adversarial robustness results on small tasks, such as MNIST [12], they have been shown to achieve poor performance on more complex datasets (like CIFAR-10 [13]) [10]. In order to address the limitation of the prior works based solely on input space modifications, we investigate the effect of combining model discretization with input transformation, thereby leveraging both criteria that contribute to adversarial dimensions. To the best of our knowledge, we are the first to formally evaluate and analyze the impact of input and parameter space discretization for DLNs (across simple and complex datasets, including the ImageNet 2012 dataset) on robustness. A recent work [14] demonstrated the effectiveness of BNNs against adversarial attacks and observed a similar difficulty in adversarial training with BNNs. However, they did not consider input space discretization and its impact on robustness. While complementing their results, we show that quantizing the input pixels of a BNN during training greatly improves its robustness, even waiving the need for expensive and time-consuming adversarial training for certain perturbation ranges.
III Background on Adversarial Attacks
Generating Adversaries: Adversarial examples are created using a trained DLN's parameters and gradients. As shown in Fig. 1, the adversarial perturbation is not just random noise, but is carefully designed to bias the network's prediction on a given input towards a wrong class. Goodfellow et al. [3] proposed a simple method called the Fast Gradient Sign Method (FGSM) to craft adversarial examples by linearizing a trained model's loss function J (say, cross-entropy) with respect to the input X:

    X_adv = X + ε · sign(∇_X J(θ, X, y_true))    (1)

Here, y_true is the true class label for the input X, θ denotes the model parameters (weights, biases, etc.) and ε quantifies the magnitude of distortion. The net perturbation added to the input (X_adv − X) is thus regulated by ε. Distorting the input image in the direction of the steepest gradient has the maximal effect on the loss function during prediction. Intuitively, referring to Fig. 2, this distortion shifts the data point from the trained region or hypervolume to an arbitrary region, thereby fooling the model.
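As a concrete illustration of Eq. (1), the sketch below applies FGSM to a toy logistic-regression model (a hypothetical stand-in, not the DLNs evaluated in this paper), chosen because the input gradient of its cross-entropy loss has the closed form (p − y_true)·w:

```python
import numpy as np

def fgsm(x, y_true, w, b, eps):
    """Craft an FGSM adversary X_adv = X + eps * sign(dJ/dX) for a
    logistic-regression stand-in model with parameters (w, b)."""
    p = 1.0 / (1.0 + np.exp(-(np.dot(w, x) + b)))  # sigmoid prediction
    grad_x = (p - y_true) * w                      # closed-form dJ/dx
    return np.clip(x + eps * np.sign(grad_x), 0.0, 1.0)

x = np.array([0.5, 0.5])
w = np.array([1.0, -1.0])
x_adv = fgsm(x, y_true=1.0, w=w, b=0.0, eps=0.1)
print(x_adv)  # -> [0.4 0.6]
```

Each pixel moves by exactly ±eps (before clipping), which is what makes the attack an L∞-bounded perturbation.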
Types of Attacks: In the machine learning literature, two kinds of attacks are used to study adversarial robustness: Black-Box (BB) and White-Box (WB) [15]. WB adversaries are created using the target model's parameters; that is, the attacker has full knowledge of the target model's training information. BB attacks refer to the case where the attacker has no knowledge of the target model's parameters. In this case, adversaries are created using a different source model trained on the same classification task as the target model. Since BB attacks are transferred onto the target model, they are weaker than WB attacks. Security against WB attacks is a stronger notion, and robustness against WB attacks guarantees robustness against BB attacks for a similar perturbation (ε) range.
Adversarial Training: Adversarial training simply injects adversarial examples into the training dataset of a model [5]. For each training sample in the dataset, an adversary is created using FGSM [3]. There are several forms of adversarial training. For instance, instead of using the same ε for all training examples, [16, 15] propose to sample a unique ε (from a random normal distribution) for each training example. This increases the variation in the adversaries created, thereby increasing the robustness of a network to a larger range of ε values. The authors in [17] use WB adversaries created using a multi-step variant of FGSM to guarantee a strong defense against both BB and WB attacks. Note, the common theme across all adversarial training methods is data augmentation. In this work, we focus on adversarial attacks created using FGSM and evaluate the robustness of models against WB adversaries. We evaluate a model's robustness/report adversarial accuracy on the adversarial dataset created using the test data for a given task.
IV Experiments
We conduct a series of experiments for each discretization theme, primarily using MNIST [12] (Fully Connected Network, FCN) and CIFAR-10 [13] (AlexNet [18] architecture), detailing the advantages and limitations of each approach. We compare the adversarial robustness of each discretization approach with its full-precision counterpart (with and without adversarial training), using ε values reported in recent works [15, 17]. For adversarial training, we employ Random-step FGSM (R-FGSM), proposed in [15], to create a variety of training-set adversaries. R-FGSM perturbs the input with a small random step (sampled from a normal distribution) before adding the loss gradient to the input: X' = X + α · sign(𝒩(0, 1)), X_adv = X' + (ε − α) · sign(∇_X' J(θ, X', y_true)), where α < ε. We use WB adversarial training to confer strong robustness toward all forms of attacks. Note, different ε values are used for MNIST and CIFAR-10 during adversarial training. For evaluating the robustness of parameter space discretization, we use BNNs [6, 18] on the CIFAR-10 and MNIST datasets. We also evaluate the robustness of discretization methods on larger-scale datasets, CIFAR-100 (ResNet-20 architecture [19]) and ImageNet [20] (AlexNet architecture), using XNOR networks [7, 21]. Please note, for MNIST we use two different FCN architectures: FCN1, a 4-hidden-layer network with 6144 neurons per layer (784-6144(x4)-10), and FCN2, a 4-hidden-layer network with 600 neurons per layer (784-600(x4)-10). We imported GitHub models [21, 18] for implementing our experiments. We used the same hyperparameters (such as weight decay value, learning rate, etc.) as used in [21, 18] to train our models. It is worth mentioning that our paper is the only other work besides [14] demonstrating the effectiveness of a discretized/binarized parameter space against adversarial attacks. While [14] conducted experiments with various forms of attacks (primarily on MNIST), we restrict ourselves to the WB attack scenario and extend our analysis to larger datasets.
IV-A Discretization of Input Space
With input space discretization, we convert the raw integer pixel intensities (x), which are typically 8-bit values (x ∈ [0, 255]), to a low-precision range of b bits (x_q ∈ [0, 2^b − 1]) as:

    x_q = x ÷ 2^(8−b)    (2)

where ÷ denotes integer division. Such quantization reduces the number of possible data points (for a grayscale input image with N pixels) in the manifold from 256^N to (2^b)^N. This can be broadly interpreted as reducing the redundancy in the input data. Fig. 3 illustrates sample CIFAR-10 images discretized to varying bit-widths b. The corresponding accuracy (trained on AlexNet for 20 epochs) is shown in Table I. There is a natural tradeoff between input discretization and overall accuracy of a network. Yet, the test accuracy loss from full precision (8-bit) down to 4-bit is small (~1.8%, Table I). This verifies the presence of unnecessary and redundant input dimensions that do not substantially contribute to the classification task or accuracy. 2-bit discretization decreases the accuracy by a larger margin (~7%). Note, this accuracy loss can be minimized by training for more epochs. However, for iso-comparison, we fix the number of epochs across all experiments for a given dataset.

Table I: CIFAR-10 clean test accuracy for varying input precision.
Input bits | Accuracy (%)
2b | 82
3b | 86.64
4b | 87.1
8b | 88.9
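Eq. (2)'s integer-division quantization is a one-liner; a small NumPy sketch (the array values and dtype are illustrative):

```python
import numpy as np

def discretize(img, bits):
    """Reduce 8-bit pixel intensities (0-255) to 2**bits levels by
    integer division, as in Eq. (2)."""
    step = 2 ** (8 - bits)      # e.g. bits=2 -> step=64 -> 4 levels
    return img // step

x = np.array([0, 63, 64, 128, 255], dtype=np.uint8)
print(discretize(x, 2))         # -> [0 0 1 2 3]
```

With bits=8 the step is 1 and the input passes through unchanged, so the same function covers the full-precision baseline.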
A remarkable outcome of this discretization method is the substantial improvement in a model's adversarial accuracy. Fig. 4 illustrates the evolution of adversarial accuracy of the CIFAR-10 models (from Table I) with increasing levels of perturbation ε. Note, ε = 0 corresponds to clean test accuracy. It is clear that clamping the input dimensionality to lower values increases the resistance of the model to larger magnitudes of distortion. We can thus deduce that removing the redundant dimensions in the input reduces the overall hypervolume space, thereby leaving 'less' space for shifting or adversarially perturbing a data point (referring to the Fig. 2 intuition). The 8-bit input model shows a decline in accuracy even for a small value of ε. This further confirms that the presence of redundant input dimensions allows even small perturbations to shift a data point. In contrast, increasing discretization increases the minimum ε that affects a model's accuracy catastrophically. What is surprising is that for 2-bit input, a model's adversarial accuracy for large ε is almost similar to its clean accuracy. For even larger ε, the accuracy of all models declines, except that of the 2-bit model. This is a very interesting result since we have not employed any adversarial training, and still achieve substantial adversarial resistance over a large ε range.
Fig. 4 (b) shows the adversarial accuracy results for MNIST (trained on FCN2 for 10 epochs). We observe a similar trend of increasing adversarial resistance with increasing discretization for larger ε. Since MNIST is a simple dataset with a predominantly black background, input discretization does not contribute much to adversarial resistance until we go to extremely low precision. In fact, 2-bit discretization yields adversarial accuracy similar to the clean test accuracy (over a range of ε), exhibiting the effectiveness of this technique even for simple datasets.
Next, we trained the input-discretized CIFAR-10 and MNIST models with adversarial training to observe the improvement in adversarial accuracy compared to 8-bit input adversarial training (Table II). Compared to the results in Fig. 4 (a, b), adversarial training substantially improves the robustness of a model with full 8-bit input for larger ε values. Input discretization greatly furthers this robustness, with accuracy gains across different perturbation ranges on both MNIST and CIFAR-10. It is worth mentioning that the CIFAR-10 accuracy for 2-bit input without adversarial training (79%) is nearly as good as the accuracy with adversarial training (83%). This shows that input discretization is a good regularization scheme that improves the generalization capability of a network on adversarial data. Note, for CIFAR-10 the accuracy saturates at larger ε since the adversarial training was conducted with adversaries created using a smaller ε. Including larger-perturbation adversaries during adversarial training would yield further accuracy gains.
Table II: Adversarial accuracy (%) with adversarial training for 2-bit vs. 8-bit input precision (ε1 < ε2 < ε3 denote increasing perturbation strengths).
Data | Model | Clean | ε1 | ε2 | ε3
CIFAR-10 | 2b | 83.1 | 82.7 | 82.7 | 43.9
CIFAR-10 | 8b | 84.3 | 62.2 | 53.6 | 45.5
MNIST | 2b | 98.5 | 98.5 | 84.7 | 85.4
MNIST | 8b | 98 | 84.8 | 74.5 | 65.9
IV-B Discretization of Parameter Space
Since input discretization gave us such promising results, we were naturally inclined to analyze a binarized neural network's (BNN) behavior against adversarial attacks. Here, the weights and activations (or parameters) are discretized to extremely low-precision values [6]. The discretization constraints are imposed on a BNN during training, wherein the parameters are clamped to [−1, 1] after every backpropagation step. One can view this discretization as an implicit form of regularization. In fact, it is this extreme form of regularization that makes a BNN difficult to train (the clean test accuracy observed with BNNs is typically lower than that of full-precision networks). As suggested in [14], the difficulty in training a BNN translates to difficulty in attacking the BNN as well. Referring to the data manifold intuition (Fig. 2), we can deduce that constraining the parameter space during a model's training will introduce discontinuities and non-smoothness in its decision boundary. Since adversaries are created using the gradients of a model (which are a property of the model's decision boundary), generating gradients (and hence adversaries) for non-smooth functions will be difficult. This in turn will make a BNN less susceptible to adversaries. Note, the input image to a BNN is full precision. Fig. 5 (a) compares the adversarial accuracy obtained for varying ε values for a CIFAR-10 BNN (AlexNet architecture) against a similar-architecture full-precision network. We trained the networks for 40 epochs since BNNs require more training iterations to attain accuracy comparable to that of a full-precision network. Here, we do not incorporate input discretization in our analyses; all networks are fed full-precision 8-bit inputs. In Fig. 5 (a), for small ε, the BNN shows better adversarial resistance (i.e., its adversarial accuracy is closer to clean accuracy). However, the BNN's accuracy declines steeply as we move toward larger perturbation ranges. We note a similar trend for MNIST (trained for 10 epochs on the FCN2 architecture), wherein the full-precision network yields better robustness than the BNN for larger ε. These results contradict our intuition that increased discretization in BNNs should result in less adversarial susceptibility.
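The binarize-and-clamp mechanics described above can be sketched in a few lines (the straight-through estimator used for the backward pass is omitted, so this is a forward-pass sketch only):

```python
import numpy as np

def binarize(w):
    """Forward-pass binarization: map latent real-valued weights to
    +/-1 by their sign (zero treated as +1)."""
    return np.where(w >= 0.0, 1.0, -1.0)

def clamp(w):
    """Post-update clamping of the latent weights to [-1, 1], applied
    after every backpropagation step as described above."""
    return np.clip(w, -1.0, 1.0)

w_latent = np.array([-1.7, -0.3, 0.0, 0.4, 2.0])
w_bin = binarize(clamp(w_latent))   # weights actually used in the matmul
```

Because inference uses only w_bin, every weight has magnitude 1, which is the source of the larger activation ranges discussed next.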
To understand this, we calculated the L1 norm of the first hidden layer activations of the FCN2 network in response to clean input images. We found that BNNs generally have a larger variance and range of values than the full-precision network. Since a BNN uses weight values (±1) that are typically of greater magnitude than the small weight values of a full-precision network, we observe a larger range in the former case. Interestingly, we find that the L1 norm of the BNN activations (in response to adversarial images perturbed with lower ε values) lies approximately within the same range as that of the clean input case. In contrast, the L1 norm for higher-ε adversaries has a much higher range. For a full-precision network, the L1-norm ranges of the different adversaries and the clean data typically intersect with each other owing to the lower weight values (Fig. 6). We believe that the extreme quantization of weight values in BNNs to higher magnitudes causes adversarial susceptibility for larger-range perturbations. While the L1-norm analysis is not very substantial from a mathematical standpoint, it hinted at increasing the capacity (more neurons and weights) of the network. The motive here is that increasing the capacity would increase the overall range of activation values, which might accommodate larger-range perturbations. Expanding the network capacity for MNIST (FCN1 architecture) yielded a sizable improvement in adversarial resistance with the BNN compared to its corresponding full-precision counterpart (Fig. 5 (b)). This is a crucial detail of our analysis: while BNNs are intrinsically robust to adversaries (for small ε), only models with sufficient capacity can withstand large ε values.

Table III: Adversarial accuracy (%) with adversarial training for binarized vs. full-precision parameters (ε1 < ε2 < ε3 denote increasing perturbation strengths).
Data | Model | Clean | ε1 | ε2 | ε3
CIFAR-10 | BNN | 79.7 | 53.1 | 43.6 | 35.3
CIFAR-10 | Full | 82.7 | 72.2 | 63.6 | 55.5
MNIST (FCN1) | BNN | 96.9 | 89.1 | 74.5 | 65.8
MNIST (FCN1) | Full | 98 | 84.8 | 71 | 61.7
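The L1-norm activation comparison described above can be reproduced in miniature; the layer shape and the random "image" below are hypothetical stand-ins for the FCN2 setup:

```python
import numpy as np

def first_layer_l1(x, w, binary=False):
    """L1 norm of the first hidden layer's ReLU activations. With
    binary=True, weights are sign-binarized to +/-1 (BNN-style); their
    larger magnitude inflates the activation range, as observed above."""
    w_eff = np.where(w >= 0, 1.0, -1.0) if binary else w
    h = np.maximum(x @ w_eff, 0.0)   # ReLU activations
    return float(np.abs(h).sum())

rng = np.random.default_rng(0)
x = rng.random(784)                           # stand-in input image
w = rng.normal(scale=0.05, size=(784, 600))   # small full-precision weights
```

Comparing `first_layer_l1(x, w, binary=True)` against `binary=False` shows the binarized layer producing a much larger L1 norm, consistent with the range argument made above.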
Even with adversarial training, we observed the same trend: binarized networks of insufficient capacity do not yield adversarial robustness as good as that of a full-precision network (Table III). For CIFAR-10, the full-precision network is the clear winner, while for MNIST (with the excess-capacity FCN1 architecture) the BNN yields better robustness. A noteworthy observation here is that adversarial training substantially improves the robustness of a full-precision network (see the CIFAR-10 results in Fig. 5 (a) and Table III), while BNNs do not benefit much from it. In fact, we find that BNNs are difficult to train with adversarial training. The learning rate and other hyperparameters need to be tuned carefully to ensure that the BNN model converges to lower error values during adversarial training. [14] also observed a similar trend and explained that binarized weights are not as 'malleable' as full-precision weights and hence cannot easily adjust to all possible variations of the adversarial data augmented to the training dataset. We think that increasing the capacity of the network compensates for the 'non-malleability' of the constrained parameters to a certain extent. As a result, we see improved accuracy for MNIST in Table III with the FCN1 architecture.
IV-C Discretization of Input and Parameter Space
Next, we combined both discretization strategies and analyzed the adversarial robustness of BNNs with varying image-level discretization. We compare the adversarial accuracy of BNNs to that of a full-precision network for iso-input-discretization scenarios, as shown in Table IV for CIFAR-10 (AlexNet architecture trained for 40 epochs). In Table IV, BNN-2b (Full-2b) refers to a binarized (full-precision) model with 2-bit input precision. While input discretization for a full-precision network incurs a sizeable accuracy loss, the BNN's accuracy fluctuation is marginal (a maximum change of ~1%). This is expected since BNNs (owing to ±1 binarized parameters) do not have as many dimensions (as a full-precision network) to fit the extra information in the input data. Thus, BNNs fit 2-bit and 8-bit data alike, yielding similar generalization error. As opposed to the results seen earlier with full-precision inputs, BNNs with lower input precision (2-bit) have significantly higher adversarial resistance than their full-precision counterparts, even for large ε values. Model capacity does not restrict the adversarial resistance in this case. This is an artefact of the two-step quantization that increases the minimum allowable perturbation to shift a data point.
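Combining both schemes amounts to quantizing the input before a binarized forward pass; a compact sketch (layer sizes and the random image are illustrative assumptions):

```python
import numpy as np

def discretize(img, bits):
    """Input-space discretization (Eq. 2): 8-bit pixels -> 2**bits levels."""
    return img // (2 ** (8 - bits))

def bnn_layer(x, w_real):
    """One binarized layer: sign-binarize the latent weights before the
    matmul (BNN/XNOR-style inference), followed by a ReLU."""
    w_bin = np.where(w_real >= 0, 1.0, -1.0)
    return np.maximum(x @ w_bin, 0.0)

rng = np.random.default_rng(0)
pixels = rng.integers(0, 256, size=(1, 784))        # stand-in image
x = discretize(pixels, bits=2).astype(np.float64)   # values in {0, 1, 2, 3}
h = bnn_layer(x, rng.normal(size=(784, 128)))       # binarized hidden layer
```

The model thus sees only 4 input levels and ±1 weights, which is the combined input-and-parameter discretization the tables below evaluate.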
We can also draw an alternate insight from this result: the constrained parameter space of BNNs restricts their overall exploration of the data manifold during training. Referring to Fig. 2 (b), this increases the probability of untrained or arbitrary hypervolumes (for BNNs), thereby increasing their adversarial susceptibility. Increasing the capacity enables a BNN to explore the manifold better during training. By discretizing the input, we restrict the overall data-manifold space. This allows a model, even one with lower capacity, to explore the manifold well, thereby decreasing the extent of arbitrary hypervolumes. Table V illustrates the accuracy results for MNIST (FCN1 architecture trained for 20 epochs).

Table IV: Adversarial accuracy (%) on CIFAR-10 for combined input and parameter discretization (ε1 < ε2 < ε3 denote increasing perturbation strengths).
Model | Clean | ε1 | ε2 | ε3
BNN-2b | 81 | 80 | 80 | 36.7
Full-2b | 82 | 79.1 | 79.3 | 39.6
BNN-4b | 81.9 | 58.3 | 52.3 | 36.8
Full-4b | 81.1 | 53.8 | 45.1 | 37.3
BNN-8b | 81.5 | 55.1 | 45.9 | 38.6
Full-8b | 86.5 | 61.1 | 55.2 | 47.1
Table V: Adversarial accuracy (%) on MNIST (FCN1) for combined input and parameter discretization (ε1 < ε2 < ε3 denote increasing perturbation strengths).
Model | Clean | ε1 | ε2 | ε3
BNN-2b | 96.4 | 96.4 | 60.7 | 62.3
Full-2b | 97.8 | 97.4 | 35.4 | 35.3
BNN-4b | 96.4 | 88.9 | 76.7 | 58.7
Full-4b | 98.1 | 71.1 | 50.9 | 33.7
BNN-8b | 97.1 | 89.4 | 56.1 | 33.6
Full-8b | 98.2 | 75.9 | 38.5 | 26.4
We conducted adversarial training with input-discretized BNNs to find out if it helps build adversarial robustness. The results are shown in Table VI. Compared to the 8-bit input BNN adversarial training results in Table III, we observe a substantial gain in adversarial accuracy. However, contrasting the BNN results against Table II (input-discretized full-precision networks), we observe similar performance gains. In fact, the accuracy gains for the 2-bit input CIFAR-10 BNN with and without adversarial training (Table IV/Table VI) are nearly the same. Earlier, we saw that the accuracy (for low ε values) of a full-precision network working on 2-bit input data without adversarial training is similar to that of an adversarially trained network on 8-bit inputs (Table II, Fig. 4). Combining the adversarial training results so far, we can deduce the following: 1) In the low input-precision (2-bit) regime, adversarial training does not compound the adversarial resistance of a network (irrespective of binarized or full-precision parameters) for lower ε values. Adversarial training helps when the input has higher precision. 2) Input discretization, in general, offers a very strong adversarial defense for lower ε values. Discretizing the input as well as the parameter space furthers adversarial robustness. Adversarial training in a discretized input and parameter space does not add much benefit and hence can be waived. However, for stronger multi-step attack scenarios and to gain robustness against larger perturbations (such as in CIFAR-10 and MNIST), the network needs to be adversarially trained with correspondingly large ε values.
Table VI: Adversarial accuracy (%) of adversarially trained, input-discretized BNNs (ε1 < ε2 < ε3 denote increasing perturbation strengths).
Data | Model | Clean | ε1 | ε2 | ε3
CIFAR-10 | BNN-2b | 78.4 | 78.1 | 78.1 | 30.5
MNIST | BNN-2b | 95.7 | 95.7 | 89.3 | 88.6
IV-D Analysis on CIFAR-100 and ImageNet
Scaling up the discretization analysis to larger datasets yielded results similar to those observed with CIFAR-10 and MNIST. Fig. 7 demonstrates the adversarial accuracy evolution for CIFAR-100 (trained on a ResNet-20 architecture for 164 epochs) for binarized XNOR networks (1-bit weights and activations) and corresponding full-precision models. Note, XNOR networks are similar to BNNs (1-bit weights/activations) with additional scaling factors to achieve higher accuracy on complex datasets. It is evident that input discretization is the most beneficial for obtaining adversarial robustness. The 2-bit input-discretized models in both cases yield adversarial accuracy close to the clean accuracy for a large range of perturbations. The accuracy loss between clean and adversarial data for the 2-bit input XNOR network is slightly better than for the 2-bit input full-precision model. This can be attributed to the intrinsic robustness offered by discretizing the parameter space of XNOR networks. Furthermore, the fact that the XNOR network yields higher adversarial accuracy for iso-perturbation ε values than the full-precision model at the same input precision further demonstrates the ability of binarized networks to counter adversarial attacks. A noteworthy observation here is that the loss in clean accuracy between the 8-bit and 2-bit input-discretized full-precision networks is small compared to the large loss observed earlier with CIFAR-10 (Table I). As we scale up the complexity of the dataset, the redundancies in the input dimensions increase. Discretizing the input for a complex dataset thus eliminates more redundant dimensions that do not contribute to the accuracy. In contrast, smaller datasets have fewer redundant dimensions and are thus at risk of suffering a larger accuracy drop with input discretization.
Fig. 8 shows the accuracy results for ImageNet (trained on AlexNet); we show only the top-5 adversarial accuracy. We see trends similar to CIFAR-100. Note, the XNOR models are trained for 50 epochs, while the full-precision models are trained for 90 epochs; as a result, we see a lower baseline accuracy in the former case. As with CIFAR-100, the loss in clean test accuracy between 8-bit and 2-bit input discretization is minimal for each model. Also, the accuracy difference between clean and adversarial data for the XNOR models is much lower than for the full-precision models. This highlights the intrinsic robustness capability of binarized networks even for large-scale datasets.
V Conclusion
Low-precision models and quantization techniques have, so far, been explored to reduce the resource utilization of DLNs for energy-efficient deployment on edge devices. We have demonstrated that discretization also warrants security against adversarial attacks, thereby offering the key added benefit of robustness in hardware implementations. In summary, the main findings/recommendations from this work are:

Input discretization is a major contributor to adversarial robustness for both binarized and full-precision models. 2-bit input-discretized models (without adversarial training) yield adversarial accuracy similar to adversarially trained 8-bit input models for lower ε values. Robustness against higher ε and multi-step attacks requires adversarial training.

Binarized (low-precision weights/activations) models are intrinsically, though only slightly, more robust than full-precision models. Adversarial training must be performed carefully on binarized networks of sufficient capacity to attain adversarial robustness comparable to that of full-precision models.

Combining input and parameter discretization is an efficient way of obtaining adversarial robustness over a moderate range of perturbation values without resorting to iterative adversarial training.
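For reference, the parameter discretization used in the XNOR networks above binarizes weights while keeping a real-valued scaling factor [7]. The sketch below computes one factor per tensor for brevity; XNOR-Net proper computes one scaling factor per output filter.

```python
import numpy as np

def xnor_binarize(w):
    """XNOR-Net-style weight binarization [7]: sign(w) scaled by the mean
    absolute weight. Per-tensor sketch; XNOR-Net uses a per-filter factor."""
    alpha = np.mean(np.abs(w))
    return alpha * np.sign(w)

w = np.array([0.4, -0.2, 0.1, -0.3])
wb = xnor_binarize(w)   # every weight collapses to +/- alpha, alpha = 0.25
```

Collapsing the weight space to two magnitudes per filter is the parameter-space discretization that, combined with input discretization, yields the robustness gains summarized above.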
Our work unravels a simple idea: hardware-optimization techniques can potentially resolve or resist software vulnerabilities (specifically, adversarial attacks). While we focus on discretization, there is ample scope to explore other efficiency-driven techniques (such as stochasticity and model pruning) and gauge their implications for adversarial robustness.
Acknowledgment
This work was supported in part by C-BRIC (Center for Brain-inspired Computing), a JUMP center sponsored by DARPA and SRC; by the Semiconductor Research Corporation; the National Science Foundation; Intel Corporation; and the Vannevar Bush Faculty Fellowship.
References
 [1] Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, no. 7553, pp. 436–444, 2015.
 [2] A. Kurakin, I. Goodfellow, and S. Bengio, “Adversarial examples in the physical world,” arXiv preprint arXiv:1607.02533, 2016.
 [3] I. J. Goodfellow, J. Shlens, and C. Szegedy, “Explaining and harnessing adversarial examples,” ICLR, 2015.
 [4] P. Panda and K. Roy, “Explainable learning: Implicit generative modelling during training for adversarial robustness,” arXiv preprint arXiv:1807.02188, 2018.
 [5] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus, “Intriguing properties of neural networks,” arXiv preprint arXiv:1312.6199, 2013.
 [6] I. Hubara, M. Courbariaux, D. Soudry, R. El-Yaniv, and Y. Bengio, “Binarized neural networks,” in Advances in Neural Information Processing Systems, 2016, pp. 4107–4115.

 [7] M. Rastegari, V. Ordonez, J. Redmon, and A. Farhadi, “XNOR-Net: ImageNet classification using binary convolutional neural networks,” in European Conference on Computer Vision. Springer, 2016, pp. 525–542.
 [8] C. Guo, M. Rana, M. Cisse, and L. van der Maaten, “Countering adversarial images using input transformations,” ICLR, 2018.
 [9] J. Buckman, A. Roy, C. Raffel, and I. Goodfellow, “Thermometer encoding: One hot way to resist adversarial examples,” 2018.
 [10] J. Chen, X. Wu, Y. Liang, and S. Jha, “Improving adversarial robustness by data-specific discretization,” arXiv preprint arXiv:1805.07816, 2018.
 [11] W. Xu, D. Evans, and Y. Qi, “Feature squeezing: Detecting adversarial examples in deep neural networks,” arXiv preprint arXiv:1704.01155, 2017.

 [12] Y. LeCun, “The MNIST database of handwritten digits,” http://yann.lecun.com/exdb/mnist/, 1998.
 [13] A. Krizhevsky, V. Nair, and G. Hinton, “CIFAR-10 and CIFAR-100 datasets,” https://www.cs.toronto.edu/~kriz/cifar.html, 2009.
 [14] A. Galloway, G. W. Taylor, and M. Moussa, “Attacking binarized neural networks,” ICLR, 2018.
 [15] F. Tramèr, A. Kurakin, N. Papernot, I. Goodfellow, D. Boneh, and P. McDaniel, “Ensemble adversarial training: Attacks and defenses,” ICLR, 2018.

 [16] A. Kurakin, I. J. Goodfellow, and S. Bengio, “Adversarial machine learning at scale,” ICLR, 2017.
 [17] A. Madry, A. Makelov, L. Schmidt, D. Tsipras, and A. Vladu, “Towards deep learning models resistant to adversarial attacks,” ICLR, 2018.
 [18] https://github.com/itayhubara/BinaryNet.pytorch.

 [19] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778.
 [20] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, “ImageNet: A large-scale hierarchical image database,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2009, pp. 248–255.
 [21] https://github.com/jiecaoyu/XNOR-Net-PyTorch.