Certification of Semantic Perturbations via Randomized Smoothing

02/27/2020
by   Marc Fischer, et al.

We introduce a novel certification method for parameterized perturbations by generalizing randomized smoothing. Using this method, we construct a provable classifier that can establish state-of-the-art robustness against semantic perturbations including geometric transformations (e.g., rotation, translation), for different types of interpolation, and, for the first time, volume changes on audio data. Our experimental results indicate that the method is practically effective: for ResNet-50 on ImageNet, it achieves rotational robustness provable up to ±30° for 28% of images.



1 Introduction

Deep neural networks are vulnerable to adversarial examples (SzegedyAdvExamples) – semantics-preserving changes such as ℓ_p-noise, geometric perturbations (e.g., rotations and translations) (EngstromMadryRotation), and Wasserstein perturbations (WongWasserstein) which can affect the output of the network in undesirable ways. This is especially problematic when these models are used in safety-critical tasks such as medical diagnosis (AmatoMedical) or autonomous driving (BojarskiAutonomous).

As a result, recent work (e.g., AI2; Weng2018) started investigating robustness certification methods which guarantee the absence of adversarial examples. However, even with training methods tailored to produce networks amenable to ℓ_p-certification (WongK18; DiffAI), current verification techniques still cannot scale to realistic models and datasets. Recently, a promising approach called randomized smoothing was proposed by (CohenRK19) – it works by constructing a probabilistic classifier with probabilistic certificates and produces state-of-the-art results for ℓ_2-norm-bounded noise on ImageNet.

This work

In this work we generalize randomized smoothing to parameterized semantic perturbations (beyond ℓ_p-noise). For example, our method enables probabilistic certification of geometric perturbations (e.g., rotations, translations), which is challenging due to the need for interpolation and rounding. Prior work on this topic is limited in either expressivity or scalability: PeiCYJ17 is restricted to nearest-neighbor interpolation and exhaustive enumeration, while DeepPoly; DeepG allow more complex interpolations (e.g., bilinear, bicubic) but handle only smaller networks. Our generalization of randomized smoothing overcomes these limitations: it enables certification of geometric perturbations on large networks and with complex interpolations. We illustrate the idea in Fig. 1, where we sample different angles and, by our theorem, obtain a robustness certificate for rotations. Crucially, to be sound, this certificate takes the interpolation error into account, which we handle by incorporating an ℓ_2-certified classifier.

We remark that to model a realistic attacker, our method also considers quantization errors from limited precision and does not rely on continuous pixel values. This is important as it means the method is sound for pixel values that are integers or floats (which is how images are actually represented).

Figure 1: A network classifies an image correctly, but fails to classify the same image rotated by a small angle. Our method creates and certifies a smoothed classifier by sampling rotations.

Main contributions

Our key contributions are:

  • A generalization of randomized smoothing to parameterized semantic perturbations.

  • The first scalable and sound certification method for semantic perturbations, such as rotations and translations, that can be applied to ImageNet images. The method is general and works with any standard interpolation (e.g., bicubic, bilinear) and types of pixel values (e.g., integers, floats).

  • A thorough evaluation of the proposed method on image and audio datasets, establishing state-of-the-art results in both domains.

2 Related Work

We now survey the most closely related work in exact and probabilistic certification and defenses, as well as perturbations and certification beyond ℓ_p-norm-based noise.

ℓ_p-norm-based certification and defenses

The discovery of adversarial examples (SzegedyAdvExamples; BiggioAdvExamples) triggered interest in training robust neural networks. Empirical defenses are a common way to harden a model against an attacker, by adversarially attacking images during training (AdversarialTraining; MadryTraining). However, while adversarially trained networks may be robust to adversaries, the robustness usually cannot be formally verified with current verification methods. This is because complete methods (Ehlers17planet; Reluplex; bunel18nips) do not scale, and incomplete methods relying on over-approximation lose too much precision and cannot prove true properties (AI2; WangSafety; Weng2018; RaghunathanSL18a; DeepPoly; SalmanBarrier).

To address this issue, provable training methods have been developed, aimed at producing networks that are amenable to certification (DiffAI; RaghunathanSL18b; WangSafety; WongK18; IBP; Bridging). Currently, these methods do not scale to training large networks with state-of-the-art accuracy (e.g., on ImageNet).

Recently, randomized smoothing was introduced, which could, for the first time, certify a (smoothed) classifier against substantial ℓ_2-norm-bounded noise on ImageNet (Lecuyer2018CertifiedRT; LiSampling; CohenRK19; Salman; Macer), by relaxing exact certificates to high-confidence probabilistic ones. Smoothing has the advantage that it scales to large models; however, it adds overhead at inference time and is currently limited to ℓ_p-norm-based perturbations.

Semantic perturbations

Transformations such as translations and rotations can produce adversarial examples (EngstromMadryRotation; KanbakMF18). PeiCYJ17 were the first to certify against such semantics-preserving operations on images, by enumeration. They reduce the search space by only considering nearest-neighbor interpolation. However, enumeration does not scale to continuous interpolations or fine-grained encodings, such as volume changes of 16-bit audio data. DeepPoly were the first to support certification of rotations with bilinear interpolation, which was significantly improved upon by (DeepG). Both methods generate linear relaxations and propagate them through the network. However, these methods do not yet scale to large networks (e.g., ResNet-50) or complex datasets (e.g., ImageNet). We remark that this work is a continuation of our prior work (fischer2020statistical).

3 Background

We now discuss the necessary background on both randomized smoothing and interpolation.

Randomized smoothing

A smoothed classifier g can be constructed out of an ordinary classifier f mapping points in R^m to labels in Y, by calculating the most probable result of f(x + ε), where ε ∼ N(0, σ^2 I):

g(x) := argmax_{c ∈ Y} P_{ε ∼ N(0, σ^2 I)}(f(x + ε) = c).

In practice, it is intractable to calculate the probabilities analytically; hence we estimate the integral up to a chosen confidence by sampling. One then obtains the following robustness guarantee:

Algorithm 1: Certify — certify the robustness of g around x.

function Certify(f, σ, x, n0, n, α)
    counts0 ← SampleUnderNoise(f, σ, x, n0)
    ĉ_A ← top index in counts0
    counts ← SampleUnderNoise(f, σ, x, n)
    p_A ← LBound(counts[ĉ_A], n, 1 − α)
    if p_A > 1/2 return ĉ_A and radius R = σ Φ^{-1}(p_A)
    else return ABSTAIN

Figure 2: Rotations with interpolation do not compose.
Theorem 3.1.

Suppose c_A ∈ Y and p_A, p_B ∈ [0, 1]. If

P_{ε ∼ N(0, σ^2 I)}(f(x + ε) = c_A) ≥ p_A ≥ p_B ≥ max_{c ≠ c_A} P_{ε ∼ N(0, σ^2 I)}(f(x + ε) = c),

then g(x + δ) = c_A for all δ satisfying

‖δ‖_2 < (σ/2)(Φ^{-1}(p_A) − Φ^{-1}(p_B)).
In practice we use Algorithm 1 with p_B := 1 − p_A in order to obtain the above guarantee. We say we “smooth” over a variable or a classifier when we apply Certify.
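To make Certify concrete, the following is a minimal Python sketch, assuming a user-supplied sample_under_noise that evaluates the base classifier on noisy copies of x and returns per-class counts; the names are illustrative rather than the authors' reference implementation.

import numpy as np
from scipy.stats import beta, norm

def clopper_pearson_lower(k, n, alpha):
    # One-sided (1 - alpha) Clopper-Pearson lower bound on a binomial proportion k/n.
    return beta.ppf(alpha, k, n - k + 1) if k > 0 else 0.0

def certify(sample_under_noise, x, sigma, n0, n, alpha):
    counts0 = sample_under_noise(x, n0, sigma)    # selection run
    c_hat = int(np.argmax(counts0))               # candidate top class
    counts = sample_under_noise(x, n, sigma)      # estimation run
    p_a = clopper_pearson_lower(counts[c_hat], n, alpha)
    if p_a > 0.5:
        return c_hat, sigma * norm.ppf(p_a)       # class and certified radius R
    return None                                   # ABSTAIN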

Interpolation and rounding

Applying a geometric transformation (e.g., rotation) results in a transformed pixel grid which does not align with the original one. Thus, to obtain the pixel values of the transformed image, interpolation is needed. Typical interpolation algorithms for images include nearest-neighbor interpolation, bilinear interpolation and bicubic interpolation (see Appendix A for details).

We denote the rotation of an image x by an angle γ with subsequent interpolation by R_γ(x). The interpolation step consists of resampling (the actual interpolation, e.g., bilinear) and rounding the pixel values back to the underlying data type (e.g., integers in {0, …, 255}). It is important to note that rotations with interpolation do not compose, that is, R_β(R_γ(x)) ≠ R_{β+γ}(x) in general. This is because rotation and interpolation with rounding do not commute (Fig. 2).
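The non-composition is easy to observe numerically. The sketch below, assuming scipy's ndimage.rotate (with bilinear interpolation, order=1) as a stand-in for R, compares one rotation by β + γ against two consecutive rotations:

import numpy as np
from scipy import ndimage

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64)).astype(np.float64)

once = ndimage.rotate(img, 40, reshape=False, order=1)
twice = ndimage.rotate(ndimage.rotate(img, 20, reshape=False, order=1),
                       20, reshape=False, order=1)
# Round back to the integer pixel grid, matching the setting described above.
print(np.linalg.norm(np.round(once) - np.round(twice)))  # nonzero: the results differ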

Similarly to images, when transforming a 16-bit audio signal, the result can be in floating point space. Thus, it needs to be rounded to be expressible in 16-bit integers again, which introduces rounding errors.

Figure 3: Outline of our approach. We rotate the input x by sampled angles γ_i, apply preprocessing, and classify the results. This allows us to certify robustness against an attacker who can rotate the input up to a bounded angle.
Figure 4: In the outline (Fig. 3) we use a certified classifier to handle interpolation noise; here randomized smoothing.

4 Generalization of Smoothing

We now generalize randomized smoothing (CohenRK19) to parameterized transformations. We consider composable transformations ψ_α, that is, ψ_β(ψ_γ(x)) = ψ_{β+γ}(x) for all β, γ. We will show in Section 5 how to handle non-composable transformations.

Definition 4.1.

Given a base classifier f and a transformation ψ_α, we define a smoothed classifier g by

g(x) := argmax_{c ∈ Y} P_{α ∼ N(0, σ^2 I)}(f(ψ_α(x)) = c).

We now obtain the following robustness guarantee:

Theorem 4.2.

Let c_A ∈ Y, f be a classifier and ψ_α be a composable transformation as above. If

P_{α ∼ N(0, σ^2 I)}(f(ψ_α(x)) = c_A) ≥ p_A ≥ p_B ≥ max_{c ≠ c_A} P_{α ∼ N(0, σ^2 I)}(f(ψ_α(x)) = c),

then g(ψ_β(x)) = c_A for all β satisfying

‖β‖_2 < (σ/2)(Φ^{-1}(p_A) − Φ^{-1}(p_B)).

The proof is similar to the one by CohenRK19 and is given in Appendix B. The key difference to CohenRK19 is that we allow parameterized transformations ψ_α, while CohenRK19 only allows additive noise x + ε.

Lemma 4.3.

If we replace f with a classifier f′, behaving with probability 1 − ρ the same as f and with probability ρ differently than f, and if

P_{α ∼ N(0, σ^2 I)}(f′(ψ_α(x)) = c_A) ≥ p_A ≥ p_B ≥ max_{c ≠ c_A} P_{α ∼ N(0, σ^2 I)}(f′(ψ_α(x)) = c),

then g(ψ_β(x)) = c_A for all β satisfying

‖β‖_2 < (σ/2)(Φ^{-1}(p_A − ρ) − Φ^{-1}(p_B + ρ)).

Proof.

By applying the union bound we can relate the output probability of f′ for a class c with the output probability of f:

P(f′(ψ_α(x)) = c) − ρ ≤ P(f(ψ_α(x)) = c) ≤ P(f′(ψ_α(x)) = c) + ρ.

Thus we can obtain new bounds p_A − ρ and p_B + ρ from p_A and p_B measured on f′. Plugging these bounds into Theorem 4.2 yields the result. ∎

This lemma allows us to smooth over erroneous classifiers, such as already smoothed classifiers.

Similar to (CohenRK19), both Theorem 4.2 and Lemma 4.3 can be instantiated with p_B := 1 − p_A to obtain the radii R = σ Φ^{-1}(p_A) and R = σ Φ^{-1}(p_A − ρ), respectively.
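As a small illustration, the sketch below computes these two radii with scipy; the discount of p_A by ρ follows the union bound of Lemma 4.3, and the exact instantiation is our reading of the statements above.

from scipy.stats import norm

def radius(p_a, sigma, rho=0.0):
    p = p_a - rho                        # Lemma 4.3: discount by the error mass rho
    return sigma * norm.ppf(p) if p > 0.5 else 0.0  # no certificate otherwise

print(radius(0.9, sigma=30.0))              # exact base classifier
print(radius(0.9, sigma=30.0, rho=0.01))    # erroneous base classifier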

In practice, both statements hold with a certain probability, as we only have a finite number of samples to estimate a lower bound p_A of the top-class probability and an upper bound p_B of the runner-up probability. Algorithm 1 shows the Certify procedure, which can be used to perform this in practice. The LBound method uses Clopper-Pearson bounds to estimate p_A with confidence 1 − α. The algorithm returns both a class and a radius if p_A > 1/2, and otherwise abstains from classification. To perform inference with g, it suffices to pick fewer samples and perform a statistical test with confidence 1 − α on whether more samples of class ĉ_A than of class ĉ_B were selected.

5 Semantic Perturbations

We now discuss several practical semantic perturbations ψ_α, first in an idealized setting, that is, without interpolation or rounding, after which we explain how to handle these in the realistic case.

5.1 Idealized Setting

Rotation

Rotations by an angle compose in the idealized setting; writing R°_γ for the mathematical rotation by γ, we have R°_β(R°_γ(x)) = R°_{β+γ}(x).

Many other geometric transformations, such as translations and scaling, also compose.

Volume

The volume of an audio signal can be changed by multiplying the signal with a constant. In order to change the signal by α (measured in decibels, dB), we multiply by 10^{α/20}. Thus the transformation is ψ_α(x) = 10^{α/20} · x, which also composes: ψ_β(ψ_γ(x)) = 10^{β/20} · 10^{γ/20} · x = ψ_{β+γ}(x).
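A minimal numerical check of this composition property (ignoring rounding back to 16-bit integers):

import numpy as np

def change_volume(x, alpha_db):
    # A change of alpha_db decibels scales the signal by 10 ** (alpha_db / 20).
    return 10 ** (alpha_db / 20.0) * x

x = np.array([1000.0, -2000.0, 500.0])  # toy samples in 16-bit range
a = change_volume(change_volume(x, 3.0), -1.0)
b = change_volume(x, 2.0)
print(np.allclose(a, b))  # True: the transformation composes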

5.2 Realistic Setting

We illustrate the difficulties introduced by interpolation using the example of rotation with interpolation. Our proposed method works with all interpolations; rounding errors can be handled analogously.

Interpolation and rounding error

Recall that rotation with interpolation does not compose (Fig. 2). Thus, smoothing with realistic rotations is not enough, as this would not compose with an attacker performing a realistic rotation by an angle β. To address this, we regard the difference between R_γ(R_β(x)) and the mathematical rotation R°_{γ+β}(x) as noise:

R_γ(R_β(x)) = R°_{γ+β}(x) + δ_{γ,β},

with ‖δ_{γ,β}‖_2 ≤ E for some E (the computation of E is discussed in Section 6).

The key idea is to smooth out (with rotations) a classifier h (Fig. 4) that is certifiably robust against ℓ_2 noise of magnitude E (Fig. 3). For mathematical rotations R°_γ (without interpolation) and base classifier h, we can apply Theorem 4.2 to

g(x) := argmax_{c ∈ Y} P_{γ ∼ N(0, σ^2)}(h(R°_γ(x)) = c),

where ψ_γ = R°_γ, and obtain the guarantee that for all (mathematical) rotations R°_β satisfying |β| < R, by using the bounds p_A and p_B,

g(R°_β(x)) = argmax_{c ∈ Y} P_γ(h(R°_{γ+β}(x)) = c) = c_A.

Because h is robust against ℓ_2 noise up to E around R°_{γ+β}(x), we know that adding δ s.t. ‖δ‖_2 ≤ E does not change the class predicted by h. Thus we get

argmax_{c ∈ Y} P_γ(h(R°_{γ+β}(x) + δ) = c) = c_A.

Here, the statement also holds for the specific δ, namely δ_{γ,β}, resulting in

argmax_{c ∈ Y} P_γ(h(R°_{γ+β}(x) + δ_{γ,β}) = c) = c_A,

which can be rewritten using the definition of δ_{γ,β} to obtain the desired safety property

argmax_{c ∈ Y} P_γ(h(R_γ(R_β(x))) = c) = c_A.

6 Certification of Classifiers

We next explain how to apply the discussed techniques in practice, so as to obtain robustness certificates against rotations. Other transformations (e.g., translations) work analogously.

Attacker Model and Calculating E

A key to our proposed approach is to find a good bound E. For arbitrary images, the ℓ_2 norm of the interpolation error can be large, but for realistic images x ∼ D, where D is the data distribution, the norm is typically lower. To exploit this, we give probabilistic bounds on the norm for x ∼ D.

In this work we assume an attacker that applies a rotation by an angle β from a range Γ. We can pick a suitable E and compute a probabilistic upper bound on the failure rate by computing

ρ_E := P_{x ∼ D, β ∼ U(Γ), γ ∼ N(0, σ^2)}(‖δ_{γ,β}(x)‖_2 > E),

where U(Γ) describes the uniform distribution over Γ. Using the Clopper-Pearson interval, we can estimate a lower bound of P(‖δ_{γ,β}(x)‖_2 ≤ E) for a given E with confidence 1 − α_E, i.e., an upper bound on ρ_E.
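The following Monte-Carlo sketch mirrors this estimation, assuming a list of images data and scipy's ndimage.rotate as the interpolated rotation; the helper names are hypothetical.

import numpy as np
from scipy import ndimage
from scipy.stats import beta

def rotate(x, angle):
    # Rotation with bilinear interpolation and rounding back to the pixel grid.
    return np.round(ndimage.rotate(x, angle, reshape=False, order=1))

def interpolation_errors(data, gamma_range, sigma, n, rng):
    errs = []
    for _ in range(n):
        x = data[rng.integers(len(data))]
        b = rng.uniform(-gamma_range, gamma_range)  # attacker angle beta
        g = rng.normal(0.0, sigma)                  # smoothing angle gamma
        errs.append(np.linalg.norm(rotate(rotate(x, b), g) - rotate(x, b + g)))
    return np.array(errs)

def coverage_lower_bound(errs, E, alpha):
    # Clopper-Pearson lower bound on P(error <= E) with confidence 1 - alpha.
    k = int((errs <= E).sum())
    return beta.ppf(alpha, k, len(errs) - k + 1) if k > 0 else 0.0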

Using this E provides a sound certification against an attacker that chooses the rotation angle randomly. To defend against a malicious attacker, one needs to bound

max_{β ∈ Γ} ‖δ_{γ,β}(x)‖_2,

which can be achieved by an approach similar to DeepG (Appendix A.4): the range Γ can be divided into small intervals, which are then used to calculate an interval over-approximation capturing all images obtained through rotations by angles in each interval, using interval analysis. From these over-approximations we can then directly obtain an upper bound for the error, where x and γ can be sampled as before.

We note that these bounds can be obtained through precomputation on the dataset.

6.1 Preprocessing to reduce E

Because the estimates of E at a satisfactory error rate are still larger than the radii most certification methods can certify, we preprocess the images before the classifier classifies them.

Low-pass filter

Commonly, noise is a high-frequency artifact in the data. As interpolation noise behaves similarly, we choose a low-pass filter (blur) as preprocessing, to reduce the norm bound E the classifier needs to handle. The low-pass filter (i) calculates the two-dimensional discrete Fourier transform, (ii) filters the calculated frequencies by discarding the highest frequencies (and thus part of the noise) beyond a threshold, and (iii) calculates the inverse two-dimensional discrete Fourier transform of the filtered frequencies.
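A minimal numpy implementation of steps (i)-(iii), using a centered circular cutoff as one simple choice of threshold:

import numpy as np

def low_pass(img, cutoff):
    f = np.fft.fftshift(np.fft.fft2(img))       # (i) 2D DFT, centered spectrum
    h, w = img.shape
    yy, xx = np.ogrid[:h, :w]
    dist = np.hypot(yy - h / 2, xx - w / 2)
    f[dist > cutoff] = 0                        # (ii) discard high frequencies
    return np.real(np.fft.ifft2(np.fft.ifftshift(f)))  # (iii) inverse 2D DFT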

Figure 5: Difference between one rotation and two consecutive rotations. Without vignetting (upper row) and with vignetting (lower row).

Vignetting

Another issue impacting E is that rotating a digital image induces black corners. Thus rotating twice can lead to very large differences. To sidestep this issue we introduce a second preprocessing step, namely vignetting, which sets the values of all pixels outside a centered circle to 0. This significantly reduces the interpolation error, as can be seen in Fig. 5.
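A sketch of the vignetting step for a grayscale or channels-last image:

import numpy as np

def vignette(img):
    # Zero out all pixels outside a centered circle.
    h, w = img.shape[:2]
    yy, xx = np.ogrid[:h, :w]
    mask = np.hypot(yy - (h - 1) / 2, xx - (w - 1) / 2) <= min(h, w) / 2
    return img * mask if img.ndim == 2 else img * mask[..., None]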

6.2 Smoothing smoothed classifiers

The only certification method currently able to handle noise radii large enough for our setting is randomized smoothing. Thus, we need to smooth (Fig. 3) over an already smoothed classifier (Fig. 4).

Double-smoothing

To do that, we invoke Algorithm 1 twice in order to smooth the smoothed classifier. The Sample procedure of the outer classifier (Fig. 3), which smooths over the angle γ, invokes the Certify procedure of the inner classifier (Fig. 4), which smooths over the interpolation noise δ. To clarify the notation, we add the subscripts γ and δ to the variables of the algorithm.

To obtain the certified radius we need to smooth γ with n_γ samples, and for each of these samples we need to smooth the interpolation noise with n_δ samples. While this can amount to a very large number of samples (n_γ · n_δ), we find that in practice relatively few samples suffice to guarantee robustness with high confidence. The constants n0_γ and n0_δ can be chosen small compared to n_γ and n_δ, and thus we neglect their cost.
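Structurally, double smoothing is a nested sampling loop. The sketch below replaces the inner Certify invocation with a plain majority vote for brevity, so it shows the sampling structure rather than the full certification bookkeeping; base_classify and the noise scales are assumptions.

import numpy as np
from scipy import ndimage

def inner_vote(x, n_delta, sigma_delta, base_classify, rng):
    # Inner smoothing over the interpolation noise delta.
    votes = [base_classify(x + rng.normal(0, sigma_delta, x.shape))
             for _ in range(n_delta)]
    return int(np.bincount(votes).argmax())

def outer_counts(x, n_gamma, sigma_gamma, n_delta, sigma_delta,
                 base_classify, num_classes, rng):
    # Outer smoothing over the rotation angle gamma.
    counts = np.zeros(num_classes, dtype=int)
    for _ in range(n_gamma):
        gamma = rng.normal(0, sigma_gamma)
        rotated = ndimage.rotate(x, gamma, reshape=False, order=1)
        counts[inner_vote(rotated, n_delta, sigma_delta, base_classify, rng)] += 1
    return counts  # n_gamma * n_delta classifier evaluations in total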

Smoothing over probably certified classifiers

Since both the bound E and the underlying classifier are probabilistic in nature, we need to use Lemma 4.3 to consider the cases in which the base classifier can be wrong. This is the case if the guaranteed radius of the classifier is incorrect or if the bound E is incorrect. Thus,

ρ = α_δ + α_E + ρ_E,

where α_δ is the error probability of the base classifier, ρ_E is the probabilistic guarantee for E, and α_E is the confidence with which ρ_E was obtained.

6.3 ℓ_2-smoothing

For some datasets (e.g., CIFAR-10), the bounds for E, even after preprocessing (low-pass filter and vignetting), are not small enough to produce satisfactory results. Interpolation noise is usually not concentrated, similar to ℓ_∞ noise. While it would at first sight make sense to use ℓ_∞ certification methods to certify against this noise, currently no method can directly certify CIFAR-10 against ℓ_∞ noise up to 28/255 to a satisfactory degree for an accurate model. Thus we certify against ℓ_2 noise.

Noise partitioning

To improve this, instead of calculating one ℓ_2 bound on the noise, we calculate two bounds: E_l for the noise on the left half of the image, and E_r for the noise on the right half. First, we split the noise δ into the noise δ_l applied to the left side of the image and the noise δ_r applied to the right side, such that

δ = δ_l + δ_r.

Next, we estimate the upper bounds E_l for ‖δ_l‖_2 and E_r for ‖δ_r‖_2. Lastly, we construct a classifier g_l that certifies ‖δ_l‖_2 ≤ E_l and then use this as a base classifier to construct g_r, which certifies ‖δ_r‖_2 ≤ E_r. Thus g_r is certified for both δ_l and δ_r. Formally, the outer classifier is

g(x) := argmax_{c ∈ Y} P_{γ ∼ N(0, σ^2)}(g_lr(R_γ(x)) = c),

where g_lr is the double-smoothed classifier (by Lemma 4.3) which smooths the left- and right-side noise consecutively,

g_lr(x) := argmax_{c ∈ Y} P_{δ_l}(argmax_{c′ ∈ Y} P_{δ_r}(h(x + δ_l + δ_r) = c′) = c),

where again δ_l and δ_r are Gaussian. Here, h denotes the base classifier we smooth over.

Dataset rows: MNIST (bil., bic.); ImageNet (bil., bic., near.); CIFAR (bil., bic.); GTSRB (bil., bic.).

Table 1: Bound E for the interpolation error of rotations. All errors are estimated by sampling. We used one σ for MNIST and another for all other datasets. * indicates the highest observed error during sampling. On MNIST and ImageNet we calculated E on the whole image, while on CIFAR and GTSRB we calculated the error per color channel. Values for Restricted ImageNet are omitted as they are similar to those for ImageNet.

Benefit of noise partitioning

If the noise is evenly spread, then the ℓ_2 norm of the left (right) half of the noise is E/√2. Thus, we improve the bound by a factor of √2. One can also partition the image differently, e.g., by color channels or quadrants, and smooth over each part in composition. The number of samples we need in the case of rotations is n_γ to smooth γ; for each of these samples we need to smooth δ_l using n_l samples, and again for each of these samples, we need to smooth δ_r using n_r samples. Thus, in total, we need n_γ · n_l · n_r samples.
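The √2 gain follows from the Pythagorean splitting of the squared ℓ_2 norm, checked numerically below:

import numpy as np

rng = np.random.default_rng(0)
delta = rng.normal(size=(32, 32))
delta_l, delta_r = delta[:, :16], delta[:, 16:]
# ||delta||^2 = ||delta_l||^2 + ||delta_r||^2 for a disjoint left/right split.
print(np.isclose(np.linalg.norm(delta) ** 2,
                 np.linalg.norm(delta_l) ** 2 + np.linalg.norm(delta_r) ** 2))
print(np.linalg.norm(delta_l) / np.linalg.norm(delta))  # approx. 1/sqrt(2) for even noise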

Rotation

Dataset | Acc. | Cert. Acc. | R (by percentile) | T [s] | n_γ | n_δ
MNIST bil. 0.99 0.81 86.64 10.15 200 200
MNIST bil. 0.99 0.81 31.00 55.22 10.11 200 200
MNIST bic. 0.99 0.70 95.18 10.13 200 200
MNIST bic. 0.99 0.82 32.36 56.77 153.11 10.13 50 2000
RImageNet bil. 0.84 0.78 15.64 16.01 16.01 128.65 50 2000
RImageNet bil. 0.84 0.68 27.46 160.655 50 2000
RImageNet bic. 0.84 0.82 10.39 10.39 10.39 124.02 50 2000
RImageNet bic. 0.84 0.69 26.29 115.655 50 2000
ImageNet bil. 0.39 0.29 10.81 10.81 10.81 128.95 50 2000
ImageNet bil. 0.39 0.29 18.29 18.29 18.29 720.93 300 2000
ImageNet bil. 0.39 0.28 9.09 16.59 28.60 128.21 50 2000
ImageNet bil. 0.39 0.28 20.22 25.36 753.72 50 2000
ImageNet bic. 0.39 0.29 10.40 10.40 10.40 143.75 50 2000
ImageNet bic. 0.39 0.27 9.33 17.00 28.74 141.59 50 2000
ImageNet near. 0.39 0.29 9.62 9.62 9.62 118.28 50 2000
ImageNet near. 0.39 0.26 7.38 16.63 27.72 118.53 50 2000
GTSRB bil. 0.68 0.14 12.61 12.61 12.61 661.50 20
CIFAR bil. 0.80 0.12 5.19 5.19 10.08 20
Table 2: We evaluated rotation on samples from MNIST, RImageNet and ImageNet, and from CIFAR and GTSRB. * indicates that we could prove a larger radius, but clipped the value due to the attacker model, as otherwise the noise estimate might be incorrect. Missing timings stem from evaluation on a different machine and are similar to GTSRB.

7 Evaluation

We now present a thorough evaluation of our proposed method, showing results for rotations, translations and audio volume changes. All experiments were performed on a machine with two GeForce RTX 2080 Ti GPUs and a 16-core Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz.

7.1 Rotation of Images

We evaluate the robustness on ImageNet (ImageNet), Restricted ImageNet (RImageNet) (TsiprasSETM19), CIFAR-10 (CIFAR) (cifar), the German Traffic Sign Recognition Benchmark (GTSRB) (GTSRB), and MNIST (MNIST). For MNIST, CIFAR and GTSRB we use a ResNet-18 (resnet) and for (R)ImageNet a ResNet-50. To account for interpolation noise, we use an ℓ_2-smoothed classifier trained with SmoothAdvPGD (Salman) and rotations as data augmentation. Details of the models and the training procedure are given in Appendix C.

Attacker Model and Noise Bound

Figure 6: Histogram of the interpolation error for rotation on ImageNet (blue) and on the subset of ImageNet images whose shorter side is naturally large (red; counts scaled up for comparison).

The bound E on the interpolation error can be computed once we fix Γ, the range from which the attacker chooses their angle, σ, the standard deviation of the Gaussian used to smooth over it, and the interpolation algorithm. The computed bounds are given in Table 1. For MNIST, we consider an attacker performing rotations in a different range than for all other datasets. For datasets with small image resolution (MNIST, CIFAR, GTSRB), we only evaluate bilinear and bicubic interpolation, as nearest-neighbor interpolation could be trivially enumerated. While it has been shown that enumeration can also be applied to ImageNet (PeiCYJ17), we show that our approach can be an efficient alternative for larger angles. For all datasets we employ a low-pass filter and vignetting to further reduce E. On all datasets we assume that the image is saved in a lossless integer format by the attacker and account for this in the estimate of E. For CIFAR and GTSRB, we use the smoothing of Section 6.3 and estimate the interpolation error for each color channel.

We observed the interpolation error to increase relative to the size of the image for small images. This is problematic for two reasons: (i) it is harder to certify large noise radii, and (ii) as noted by CohenRK19, smoothing with Gaussian noise performs worse on smaller images. To address this issue, we assume images from CIFAR to be resized to a larger resolution prior to rotation (i.e., this is part of the attacker model); for GTSRB, whose images vary in size, we use the same scheme as for CIFAR, and for (R)ImageNet we resize all images prior to rotation such that their shorter side reaches a fixed length. This simulates a dataset where each image has at least this resolution, which we believe to be reasonable given current hardware. Thus, our attacker model is formally an attacker that applies a rotation of up to the given number of degrees to an image of at least this resolution that follows the data distribution. For error estimation and classification we resize the images back down for GTSRB and CIFAR, and for (R)ImageNet we resize images such that the shorter side has the standard length and take a crop of the center, the common preprocessing for this dataset (AlexNet). Values in Table 1 take this preprocessing into account. Fig. 6 shows the interpolation errors on ImageNet in blue for all images and in red for images whose shorter side is naturally longer than the resize threshold. This indicates that we measure not an effect of the rescaling but rather of the image size.

Entries of Table 1 marked with * denote the maximal error measured for the dataset and the perturbation. However, they lie just outside what we could prove with reasonable effort while retaining accuracy. Thus, in practice we use a slightly smaller E that still covers the vast majority of possible interpolation errors. This approach soundly certifies against an attacker that randomly chooses an angle, as we observed that errors beyond the chosen E are rare and empirically were not concentrated on particular images. Section 6 discusses the estimation of E for a fully sound classifier. First estimates for these values show that they are similar to the values reported in Table 1.

Translation

Dataset | Acc. | Cert. Acc. | R (by percentile) | T [s] | n_γ | n_δ
MNIST bil. 0.99 0.96 10.11 200 1000
MNIST bic. 0.99 0.98 10.12 200 1000
RImageNet bil. 0.80 0.79 66.75 50 2000
RImageNet bic. 0.80 0.79 147.35 50 2000
ImageNet bil. 0.48 0.36 50 2000
ImageNet bic. 0.48 0.36 149.40 50 2000
Table 3: We evaluated translation on 100 samples each. * indicates that we could prove a much larger radius, but clipped the value due to the attacker model, as otherwise the noise estimate might be wrong.

Results

We evaluated our algorithm on the non-starred values in Table 1. The results are shown in Table 2: Acc. and Cert. Acc. denote how many images were successfully classified by the base and the smoothed classifier, respectively. All images classified by the smoothed classifier are certified with a radius R. For ImageNet, RImageNet and MNIST we observe very good results, as these datasets either have very large images or are very simple, and we can sometimes prove radii that are larger than the ranges we assumed possible for the attacker when estimating E. We have clipped these values back to what we assumed in the attacker model and indicate this by *. For GTSRB and CIFAR, the images yield higher interpolation errors while at the same time being less robust to noise. Thus it becomes harder to construct a smoothed classifier for these datasets. This required us to drop the confidence and apply smoothing over each color channel separately. DeepG certify 87.8% on MNIST (35s per image) and again 87.8% on CIFAR-10 (117s per image), without taking rounding into account. PeiCYJ17 need 714s per image and report the failure rate per image, hence their results are not directly comparable. We can certify 81% on MNIST (10s per image) and 69% on RImageNet (231s per image). Parameter choices are discussed in Appendix C.

Table 4: Bound E for the interpolation error of translations, estimated as for rotations. Values for RImageNet are omitted as they are similar to those for ImageNet.
Dataset
MNIST bil. 4 0.3 92.60
MNIST bic. 4 0.3 96.47
ImageNet bil. 50 0.9 97.00
ImageNet bic. 50 1.25 96.36

7.2 Translation

Similarly to rotations, we apply preprocessing and calculate estimates for E, shown in Table 4. As ImageNet images get cropped before classification and MNIST images have a black background, we do not use vignetting. Since the interpolation errors for translation are slightly higher than for rotation, we only consider MNIST and (R)ImageNet; on the other datasets the error becomes infeasibly large. We also do not consider nearest-neighbor interpolation, as it can easily be enumerated here. Results are shown in Table 3. The radius R is given in percent of the image size, so the same R corresponds to different pixel counts on MNIST and ImageNet. Since we obtain an ℓ_2 bound, the radius reads as √(Δx² + Δy²) ≤ R, where Δx and Δy are the translations in the respective directions. DeepG certify 77% on MNIST (263s per image). We significantly improve on this result.

7.3 Volume Change of Audio Signals

To evaluate our method on audio data we use the Speech Commands dataset (gcommands), consisting of recordings of up to 1 second of people saying one of 30 different words, which are to be classified. Our audio experiments are similar to those for images, although here we do not employ any resizing or rescaling. We use a classification pipeline that converts audio waveforms into MFCC spectra (MFCC), treats these as images, and applies normal image classification. We use a ResNet-50 that was trained with Gaussian noise, but not SmoothAdvPGD. We apply the noise before the waveform is converted to the MFCC spectrum.
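A hedged sketch of this pipeline, using librosa for the MFCC computation; classify_image stands in for the ResNet-50 and is not part of the original text.

import numpy as np
import librosa

def classify_audio(waveform, sr, sigma, classify_image, rng):
    # Add the smoothing noise in the waveform domain, before the MFCC conversion.
    noisy = waveform + rng.normal(0.0, sigma, waveform.shape)
    mfcc = librosa.feature.mfcc(y=noisy.astype(np.float32), sr=sr)
    return classify_image(mfcc)  # treat the spectrum as an image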

We estimate E as for the image experiments. On 100 samples, the base classifier was correct on more samples and the smoothed classifier on 51, with R of 0.75, 1.96 and 3.12 at the respective percentiles, corresponding to the according volume changes in dB. We also measured the average certification time for our chosen sample sizes.

7.4 Discussion

Since the success of our approach largely depends on the size of the underlying images, we are in the unusual situation that it is easier for us to prove statements on ImageNet than on CIFAR. The accuracy of our robust classifier is mostly limited by the large estimate of E and the quality of the underlying certification. Thus, any advances in robustness or in estimating lower (sound) E (e.g., by computing it per image) directly translate into improvements of our method. The choice of σ trades accuracy against certification radius, as seen in Table 2. Finally, using more samples and further splits for the smoothing of Section 6.3, the error bound can be pushed further, at the cost of computation time and accuracy.

8 Conclusion

In this work we presented a novel method that extends the Gaussian smoothing framework of CohenRK19 to parameterized semantic perturbations (beyond ℓ_p-balls) in domains such as image (including ImageNet) and audio classification. The framework is general and can be directly applied as-is to standard semantic perturbations with all interpolation schemes, while being sound for different types of pixel values. We believe the generality of the method will trigger further work in this direction.

References

Appendix A Interpolations

In this section, we discuss three common interpolation methods used to compute a pixel value on a unit square lying between 4 pixels of an image. We describe the image around the unit square by a function v, so that v(i, j) is the pixel value at the corner (i, j). We denote the interpolation on the unit square by p(x, y).

Nearest-neighbor interpolation

Nearest-neighbor interpolation assigns a point in the unit square the value of the nearest corner point of the unit square, that is, p(x, y) = v(round(x), round(y)).

Bilinear interpolation

Bilinear interpolation is described by a multivariate polynomial p(x, y) = a_{00} + a_{10} x + a_{01} y + a_{11} xy such that p(i, j) = v(i, j) for all i, j ∈ {0, 1}.

Bicubic interpolation

We know the values v(i, j) for i, j ∈ {−1, 0, 1, 2}. Further, we define the derivatives v_x, v_y and v_{xy} for i, j ∈ {0, 1} by finite differences. We can now fit the multivariate polynomial p(x, y) = Σ_{k,l=0}^{3} a_{kl} x^k y^l such that p(i, j) = v(i, j), p_x(i, j) = v_x(i, j), p_y(i, j) = v_y(i, j) and p_{xy}(i, j) = v_{xy}(i, j) for all i, j ∈ {0, 1}.
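For concreteness, a minimal implementation of nearest-neighbor and bilinear interpolation on the unit square, with v giving the four corner values:

def nearest(v, x, y):
    return v(round(x), round(y))

def bilinear(v, x, y):
    return (v(0, 0) * (1 - x) * (1 - y) + v(1, 0) * x * (1 - y)
            + v(0, 1) * (1 - x) * y + v(1, 1) * x * y)

# Example: corner values 0, 1, 2, 3 interpolated at the center of the square.
vals = {(0, 0): 0.0, (1, 0): 1.0, (0, 1): 2.0, (1, 1): 3.0}
print(bilinear(lambda i, j: vals[(i, j)], 0.5, 0.5))  # 1.5, the corner average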

Appendix B Proof of Theorem 4.2

We present and prove a slightly more general version of Theorem 4.2.

Theorem.

Let c_A ∈ Y, f be a classifier, and ψ_α a composable transformation for α ∼ N(0, Σ) with a symmetric, positive-definite covariance matrix Σ. If

P_{α ∼ N(0, Σ)}(f(ψ_α(x)) = c_A) ≥ p_A ≥ p_B ≥ max_{c ≠ c_A} P_{α ∼ N(0, Σ)}(f(ψ_α(x)) = c),

then g(ψ_β(x)) = c_A for all β satisfying

‖Σ^{-1/2} β‖_2 < (1/2)(Φ^{-1}(p_A) − Φ^{-1}(p_B)).

Proof.

The assumption is

P_{α ∼ N(0, Σ)}(f(ψ_α(x)) = c_A) ≥ p_A ≥ p_B ≥ max_{c ≠ c_A} P_{α ∼ N(0, Σ)}(f(ψ_α(x)) = c).

By the definition of g we need to show that

P_{α ∼ N(0, Σ)}(f(ψ_{α+β}(x)) = c_A) > max_{c ≠ c_A} P_{α ∼ N(0, Σ)}(f(ψ_{α+β}(x)) = c).

We define the half-space A := {α : β^T Σ^{-1} α ≤ ‖Σ^{-1/2} β‖_2 Φ^{-1}(p_A)}, for which P_{α ∼ N(0, Σ)}(α ∈ A) = p_A, and analogously a half-space B with P_{α ∼ N(0, Σ)}(α ∈ B) = p_B. We claim that for α ∼ N(0, Σ), we have

(1) P(α + β ∈ A) = Φ(Φ^{-1}(p_A) − ‖Σ^{-1/2} β‖_2),
(2) P(α + β ∈ B) = Φ(Φ^{-1}(p_B) + ‖Σ^{-1/2} β‖_2).

First, we show that Eq. 1 holds.

Thus Eq. 1 holds. Next we show that Eq. 2 holds. For a random variable X ∼ N(μ, Σ) we write Φ_X(t) for the evaluation of the Gaussian cdf at point t.