1 Introduction
Since the discovery of imperceptible input perturbations that can fool machine learning models, called adversarial examples
(BiggioCMNSLGR13; szegedy2013intriguing), certifying model robustness has been identified as an essential task for enabling their application in safety-critical domains. Various works have discussed the fundamental trade-off between robustness and accuracy in the empirical setting (Raghunathan19AdvCanHurt; TsiprasSETM19; zhang2019theoretically). However, in the setting of deterministically certified robustness, this Pareto frontier has only recently been explored (mueller2021certify). There, due to the poor scaling of deterministic methods to large networks, performance on more challenging tasks is severely limited. In the probabilistic certification setting, recent works aim to jointly increase robustness and accuracy by choosing smoothing parameters per sample (Alfarra20DataDependent), however, often at the cost of statistical soundness (Sukenik21Intriguing).
In this work, we build on ideas from mueller2021certify to construct compositional architectures for probabilistic certification and propose corresponding statistically sound and efficient inference and certification procedures based on randomized smoothing (CohenRK19). More concretely, we propose a smoothed selection-mechanism that adaptively chooses, on a per-sample basis, between a robustified smoothed classifier and a non-robust but highly accurate classifier. We show that the synergy of RS with the proposed compositional architecture allows us to obtain significant robustness at almost no cost in terms of natural accuracy, even on challenging datasets such as ImageNet, while fully exposing the robustness-accuracy trade-off, even after training.
Main Contributions Our key contributions are:
- We are the first to extend compositional architectures to the probabilistic certification setting, combining an arbitrary deep model with a smoothed classifier and a selection-mechanism.
- We investigate two selection-mechanisms for choosing, at inference time and on a per-sample basis, between a robust and an accurate classifier, and derive corresponding statistically sound prediction and certification algorithms.
- We conduct an extensive empirical investigation of our compositional architectures on ImageNet and CIFAR10 and find that they achieve significantly more attractive trade-offs between robustness and accuracy than any current method. On ImageNet, for example, we achieve more natural accuracy at the same ACR or more ACR at the same natural accuracy.
2 Background & Related Work
In this section, we review related work and relevant background.
Adversarial Robustness & Threat Model
Let $f \colon \mathbb{R}^d \to \mathbb{R}^m$ be a classifier computing an $m$-dimensional logit vector, assigning a numerical score to each of the $m$ classes, given a $d$-dimensional input. Additionally, let $F(x) := \argmax_j f(x)_j$ be the function that outputs the class with the largest score. On a given input $x$ with label $y$, we say $F$ is (accurately) adversarially robust if it classifies all inputs in an $\ell_p$-norm ball of radius $\epsilon$ around the sample correctly: $F(x') = y$ for all $x'$ with $\|x' - x\|_p \leq \epsilon$. We distinguish between empirical and certified robustness. Empirical robustness is computed by trying to find a counterexample $x'$ such that $F(x') \neq y$; it constitutes an upper bound to the true robust accuracy. Certified robustness, in contrast, constitutes a sound lower bound. We further distinguish probabilistic and deterministic certification: deterministic methods compute the reachable set for given input specifications (katz2017reluplex; GehrMDTCV18; RaghunathanSL18a; ZhangWCHD18; singh2019abstract) and then reason about the output. While providing state-of-the-art guarantees for such specifications, these methods are computationally expensive and typically limited to small networks. Probabilistic methods (LiCWC19; LecuyerAG0J19; CohenRK19) construct a robustified classifier and obtain probabilistic robustness guarantees by introducing noise into the classification process, allowing the certification of much larger models. In this work, we focus on probabilistic certification and an $\ell_2$-norm based threat model; extensions to other threat models are orthogonal to our approach.
Randomized Smoothing
Randomized Smoothing (RS) (CohenRK19) is one of the most popular probabilistic certification methods. The key idea is to generate many randomly perturbed instances of the same sample and to then conduct majority voting over the predictions on these perturbed samples. More concretely, RS constructs the smoothed classifier $\bar{F}$ from a base classifier $F$ by conducting majority voting under a random noise term $\epsilon \sim \mathcal{N}(0, \sigma^2 I)$:

$$\bar{F}(x) := \argmax_c \; \mathbb{P}_{\epsilon \sim \mathcal{N}(0, \sigma^2 I)}\big(F(x + \epsilon) = c\big). \qquad (1)$$

For this smoothed classifier $\bar{F}$, we obtain the following robustness guarantee:
Theorem 2.1.
(CohenRK19). Let $F \colon \mathbb{R}^d \to \mathbb{R}^m$, $\epsilon \sim \mathcal{N}(0, \sigma^2 I)$, and $\bar{F}$ be as in Eq. 1. If there exist a class $c_A$ and bounds $\underline{p_A}, \overline{p_B} \in [0, 1]$ such that

$$\mathbb{P}_\epsilon\big(F(x + \epsilon) = c_A\big) \geq \underline{p_A} \geq \overline{p_B} \geq \max_{c \neq c_A} \mathbb{P}_\epsilon\big(F(x + \epsilon) = c\big), \qquad (2)$$

then $\bar{F}(x + \delta) = c_A$ for all $\delta$ satisfying $\|\delta\|_2 \leq R$ with $R := \tfrac{\sigma}{2}\big(\Phi^{-1}(\underline{p_A}) - \Phi^{-1}(\overline{p_B})\big)$.
Here, $\Phi^{-1}$ is the inverse Gaussian CDF. The probabilities in Eqs. 1 and 2 are computationally intractable. Hence, CohenRK19 propose to bound them using Monte Carlo sampling and the Clopper-Pearson lemma (clopper34confidence). We denote obtaining a class and radius fulfilling Theorem 2.1 as certification and obtaining just the class as prediction. In practice, both are computed with confidence $1-\alpha$. When this fails, we abstain from making a classification, denoted as $\oslash$. Performance is typically measured as certified accuracy at radius $r$ and as the average certified radius over a sample set (ACR). We focus on their trade-off with natural accuracy (NAC) and provide detailed algorithms and descriptions in App. A.
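As a concrete illustration of how this certification is carried out in practice, the following minimal Python sketch (our own, not the authors' code; it uses the common one-sided variant with $\overline{p_B} = 1 - \underline{p_A}$) turns noisy-sample vote counts into a certified $\ell_2$ radius:

```python
from scipy.stats import beta, norm

def certified_radius(k, n, sigma, alpha):
    """Given k votes for the majority class out of n noisy samples, lower-bound the
    success probability (one-sided Clopper-Pearson) and apply Theorem 2.1."""
    p_a_lower = beta.ppf(alpha, k, n - k + 1) if k > 0 else 0.0
    if p_a_lower <= 0.5:
        return None                       # abstain: majority class not certified
    return sigma * norm.ppf(p_a_lower)    # R = sigma * Phi^{-1}(p_A_lower)

# Example: 990 of 1000 noisy predictions agree, sigma = 0.5, alpha = 0.001
# yields p_A_lower of roughly 0.98 and thus a certified radius of roughly 1.0.
print(certified_radius(990, 1000, sigma=0.5, alpha=0.001))
```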
Trade-Off
For both empirical and certified methods, it has been shown that there is a trade-off between model accuracy and robustness (zhang2019theoretically; XieTGWYL20; Raghunathan19AdvCanHurt; TsiprasSETM19). In the case of RS, the noise level $\sigma$ provides a natural way to trade off certificate strength against natural accuracy (CohenRK19; Mohapatra21HiddenCost).
Compositional Architectures For Deterministic Certification (Ace)
To enable efficient robustness-accuracy trade-offs for deterministic certification, mueller2021certify introduced a compositional architecture. The main idea of their Ace architecture is to use a selection model $F_{\text{Select}}$ to certifiably predict certification-difficulty and, depending on this, to classify using either a certification-network $F_{\text{Cert}}$ with high certified accuracy or a core-network $F_{\text{Core}}$ with high natural accuracy. Overall, the Ace architecture is defined as
$$F_{\text{ACE}}(x) := \begin{cases} F_{\text{Cert}}(x) & \text{if } F_{\text{Select}}(x) = 1, \\ F_{\text{Core}}(x) & \text{otherwise.} \end{cases} \qquad (3)$$
mueller2021certify propose two instantiations of the selection-mechanism $F_{\text{Select}}$: a learned binary classifier and a mechanism selecting $F_{\text{Cert}}$ if and only if the entropy of its output is below a certain threshold. In order to obtain a certificate, both $F_{\text{Select}}$ and $F_{\text{Cert}}$ must be certified.
3 Robustness vs. Accuracy Trade-Off via Randomized Smoothing
Here, we introduce Aces, which instantiates Ace (Eq. 3) with Randomized Smoothing by replacing $F_{\text{Select}}$ and $F_{\text{Cert}}$ with their smoothed counterparts $\bar{F}_{\text{Select}}$ and $\bar{F}_{\text{Cert}}$, respectively:
$$F_{\text{ACES}}(x) := \begin{cases} \bar{F}_{\text{Cert}}(x) & \text{if } \bar{F}_{\text{Select}}(x) = 1, \\ F_{\text{Core}}(x) & \text{otherwise.} \end{cases} \qquad (4)$$
Note that, due to the high cost of certification and inference for smoothed models, instantiating $F_{\text{Core}}$ with a significantly larger model than $\bar{F}_{\text{Select}}$ and $\bar{F}_{\text{Cert}}$ comes at negligible additional computational cost.
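To make the composition concrete, the following minimal PyTorch-style sketch (our own illustration; the callables `smoothed_select`, `smoothed_cert`, and `core_model` are hypothetical stand-ins for $\bar{F}_{\text{Select}}$, $\bar{F}_{\text{Cert}}$, and $F_{\text{Core}}$) shows the per-sample routing of Eq. 4:

```python
import torch

def aces_forward(x, smoothed_select, smoothed_cert, core_model):
    """Per-sample routing of Eq. 4 (illustrative sketch, not the authors' code).

    smoothed_select(x) -> 1 if the sample should be handled by the smoothed model
    smoothed_cert(x)   -> class predicted by the smoothed certification-network
    core_model(x)      -> logits of the accurate (unsmoothed) core-network
    """
    if smoothed_select(x) == 1:               # route "easy-to-certify" samples
        return smoothed_cert(x)               # robust, certifiable prediction
    return int(core_model(x).argmax(dim=-1))  # accurate but uncertified prediction
```

In practice, both smoothed components are of course evaluated via sampling with confidence bounds, as described next.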
Prediction & Certification
Just like other smoothed models (Eq. 1), Aces (Eq. 4) can usually not be evaluated exactly in practice but has to be approximated via sampling and confidence bounds. We thus propose Certify (shown in Algorithm 1) to soundly compute the output class $c$ and its robustness radius $R$. Here, SampleWNoise($F$, $x$, $n$, $\sigma$) evaluates $F$ on $n$ perturbed samples $x + \epsilon$ with $\epsilon \sim \mathcal{N}(0, \sigma^2 I)$, and LowerConfBnd($k$, $n$, $1-\alpha$) computes a lower bound on the success probability, given $k$ successes in $n$ Bernoulli trials, holding with confidence $1-\alpha$. Conceptually, we apply the Certify procedure introduced in CohenRK19 twice, once for $\bar{F}_{\text{Select}}$ and once for $\bar{F}_{\text{Cert}}$. If $\bar{F}_{\text{Select}}$ certifiably selects the certification model, we evaluate $\bar{F}_{\text{Cert}}$ and return its prediction along with the minimum of the certified robustness radii of $\bar{F}_{\text{Select}}$ and $\bar{F}_{\text{Cert}}$. If $\bar{F}_{\text{Select}}$ certifiably selects the core model, we directly return its classification and no certificate ($R = 0$). If $\bar{F}_{\text{Select}}$ does not certifiably select either model, we either return the class that the core and certification model agree on or abstain ($\oslash$). A robustness radius obtained this way holds with confidence $1-\alpha$ (Theorem B.1 in App. B). Note that the individual tests need to be conducted with $\alpha/2$ to account for multiple testing (bonferroni1936teoria). Please see App. B for a further discussion and for Predict, an algorithm computing $c$ but not $R$ at a lower computational cost.
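The following Python sketch illustrates this two-stage procedure under the assumptions above. It is our own illustration, not the authors' implementation: `cohen_certify` is assumed to be a Cohen-style Certify routine returning the predicted class (or ABSTAIN) together with a lower confidence bound on its success probability, and all other names are hypothetical.

```python
from scipy.stats import norm

ABSTAIN = None

def aces_certify(x, cohen_certify, select, cert, core, sigma, n0, n, alpha):
    """Sketch of Aces certification (cf. Algorithm 1)."""
    # Certify the selection and the certification model, each with confidence
    # 1 - alpha/2 (Bonferroni), so the combined statement holds with 1 - alpha.
    sel_class, p_sel = cohen_certify(select, x, sigma, n0, n, alpha / 2)
    cert_class, p_cert = cohen_certify(cert, x, sigma, n0, n, alpha / 2)
    core_class = int(core(x).argmax())

    if sel_class == 1:                        # certifiably use the smoothed model
        if cert_class is ABSTAIN:
            return ABSTAIN, 0.0
        radius = sigma * min(norm.ppf(p_sel), norm.ppf(p_cert))
        return cert_class, radius             # robust prediction + certified radius
    if sel_class == 0:                        # certifiably use the core model
        return core_class, 0.0                # accurate prediction, no certificate
    # Selection not certifiable: return the consensus class if both models agree.
    if cert_class is not ABSTAIN and core_class == cert_class:
        return core_class, 0.0
    return ABSTAIN, 0.0
```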
Selection Model
We can apply RS to any binary classifier to obtain a smoothed selection model $\bar{F}_{\text{Select}}$. Like mueller2021certify, we consider two selection-mechanisms: i) a separate selection-network framing selection as binary classification, and ii) a mechanism based on the entropy of the certification-network's logits, defined as $F_{\text{Select}}(x) := \mathbb{1}[H(\text{softmax}(f_{\text{Cert}}(x))) \leq \theta]$, where $\theta$ denotes the selection threshold. While a separate selection-network performs much better in the deterministic setting (mueller2021certify), we find that in our setting the entropy-based mechanism is even more effective (see Section D.3.2). Thus, we focus our evaluation on the entropy-based selection-mechanism. Using such a selection-mechanism allows us to evaluate Aces for a large range of thresholds $\theta$, thus computing the full Pareto frontier (shown in Fig. 1), without re-evaluating $\bar{F}_{\text{Cert}}$ or $F_{\text{Core}}$. This makes the evaluation of Aces highly computationally efficient. We can even evaluate all component models separately and compute Aces certificates for arbitrary combinations retrospectively, allowing quick evaluations of new component models.
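For illustration, a minimal PyTorch sketch of this entropy-based decision (our own; function and argument names are hypothetical), using the base-$m$ logarithm so that entropies lie in $[0, 1]$ as described in App. C:

```python
import torch
import torch.nn.functional as F

def entropy_select(cert_logits, theta, num_classes):
    """Return 1 (use the certification-network) iff the normalized entropy of the
    certification-network's softmax distribution is at most the threshold theta."""
    probs = F.softmax(cert_logits, dim=-1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)
    entropy = entropy / torch.log(torch.tensor(float(num_classes)))  # normalize to [0, 1]
    return (entropy <= theta).long()
```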
4 Experimental Evaluation
In this section, we evaluate Aces on the ImageNet and CIFAR10 datasets and demonstrate that it yields much higher average certified radii (ACR) and certified accuracies at a wide range of natural accuracies (NAC) than current state-of-the-art methods. Please see App. C for a detailed description of the experimental setup and App. D for significantly extended results, including different training methods and noise levels $\sigma$, showing that the effects discussed here are consistent across a wide range of settings.
θ | NAC | ACR | Certified Accuracy at Radius r | Certified Selection Rate at Radius r |
0.00 | 0.25 | 0.50 | 0.75 | 1.00 | 1.25 | 1.50 | 0.00 | 0.25 | 0.50 | 0.75 | 1.00 | 1.25 | 1.50 | |||
0.0 | 83.4 | 0.000 | 83.4 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
0.1 | 80.0 | 0.530 | 80.0 | 33.6 | 32.6 | 30.2 | 28.2 | 25.6 | 23.0 | 45.0 | 40.2 | 37.2 | 34.0 | 31.8 | 28.2 | 25.0 |
0.2 | 75.4 | 0.682 | 75.0 | 43.6 | 41.2 | 38.2 | 35.8 | 33.4 | 30.0 | 63.8 | 58.6 | 55.6 | 50.6 | 47.8 | 45.2 | 40.8 |
0.3 | 68.8 | 0.744 | 68.2 | 48.4 | 44.4 | 41.6 | 39.2 | 35.6 | 32.8 | 78.0 | 74.2 | 70.2 | 66.2 | 62.8 | 59.0 | 55.0 |
0.6 | 57.2 | 0.799 | 55.4 | 51.6 | 48.8 | 45.0 | 42.0 | 39.0 | 34.6 | 99.8 | 99.4 | 99.0 | 98.2 | 97.4 | 96.6 | 94.6 |
1.0 | 57.2 | 0.800 | 55.4 | 51.6 | 48.8 | 45.2 | 42.2 | 39.0 | 34.6 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 |
Aces on ImageNet
Fig. 1 compares the average certified radius (ACR) over natural accuracy (NAC) obtained on ImageNet by individual smoothed ResNet50s (green triangles) with those obtained by Aces (dots). For Aces, we use smoothed ResNet50s as certification-networks and either another ResNet50 (blue) or an EfficientNet-B7 (orange) as the core-network (squares). The horizontal gap between the individual RS models (triangles) and Aces (orange line) corresponds to the increase in natural accuracy at the same robustness. We further observe that Aces already dominates the ACR of the individual models, especially at high natural accuracies, when using the small ResNet50 as core-network, and even more so with the stronger EfficientNet-B7.
Table 1 shows how the certified accuracy and the selection rate (the ratio of samples sent to the certification-network) change with the selection threshold θ. Moderately increasing θ reduces natural accuracy only slightly while yielding substantial gains in ACR and certified accuracy; conversely, moderately reducing θ loses very little ACR and certified accuracy but yields a significant gain in natural accuracy.
Aces on CIFAR10
Fig. 2 compares Aces (solid & dashed lines) against a baseline of varying the inference noise level $\sigma$ (dotted lines) with respect to the robustness-accuracy trade-offs obtained on CIFAR10. Using only ResNet110s, Aces models (solid lines) dominate all individual models across training noise levels (orange, blue, red). Individual models only reach comparable performance when evaluated at their training noise level. However, covering the full Pareto frontier this way would require training a very large number of networks to match a single Aces model. Using the more accurate LaNet as core-network for Aces (red dashed line) significantly widens this gap.
Selection-Mechanism
In Fig. 3, we visualize the distribution of samples that can (blue) and cannot (orange) be certified correctly (at a fixed radius) over the certification-network's median entropy (across the perturbed samples). Samples to the left of a chosen threshold θ are assigned to the certification-network and the rest to the core-network. While the separation is not perfect, we observe a quick decline in the proportion of certifiable samples as entropy increases, indicating that the selection-mechanism works well.
5 Conclusion
We extend compositional architectures to probabilistic robustness certification, achieving, for the first time, both high certifiable and natural accuracies on the challenging ImageNet dataset. The key component of our Aces architecture is a certified, entropy-based selection-mechanism, choosing, on a per-sample basis, whether to use a smoothed model yielding guarantees or a more accurate standard model for inference. Our experiments show that Aces yields trade-offs between robustness and accuracy that are beyond the reach of current state-of-the-art approaches while being fully orthogonal to other improvements of Randomized Smoothing.
References
Appendix A Randomized Smoothing
function Certify(f, σ, x, n₀, n, α)
  counts0 ← SampleWNoise(f, x, n₀, σ)
  ĉ_A ← top index in counts0
  counts ← SampleWNoise(f, x, n, σ)
  p_A ← LowerConfBnd(counts[ĉ_A], n, 1−α)
  if p_A > 1/2 return prediction ĉ_A and radius σ·Φ⁻¹(p_A)
  else return ⊘ (abstain)
function Predict(f, σ, x, n, α)
  counts ← SampleWNoise(f, x, n, σ)
  ĉ_A, ĉ_B ← top two indices in counts
  n_A, n_B ← counts[ĉ_A], counts[ĉ_B]
  if BinomPValue(n_A, n_A + n_B, 1/2) ≤ α return ĉ_A
  else return ⊘ (abstain)
In this section, we briefly explain the practical certification and inference algorithms, Certify and Predict, for a smoothed classifier $\bar{F}$ as introduced by CohenRK19. We first define the components of Algorithms 2 and 3 before discussing them in more detail:
SampleWNoise($f$, $x$, $n$, $\sigma$) first samples $n$ perturbed inputs $x_i = x + \epsilon_i$ with $\epsilon_i \sim \mathcal{N}(0, \sigma^2 I)$ for $i \in \{1, \dots, n\}$. It then counts how often $f$ predicts each class on these inputs and returns the corresponding $m$-dimensional array of counts.
LowerConfBnd($k$, $n$, $1-\alpha$) returns a lower bound on the unknown success probability $p$ of a binomial distribution with parameters $n$ and $p$, given $k$ observed successes, holding with confidence at least $1-\alpha$.
BinomPValue($n_A$, $n$, $q$) returns the probability of at least $n_A$ successes in $n$ Bernoulli trials with success probability $q$.
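For concreteness, the following Python sketch (our own illustration using PyTorch and SciPy; the function names are hypothetical counterparts of the helpers above) shows one way these components can be implemented:

```python
import torch
from scipy.stats import beta, binomtest

def sample_with_noise(f, x, n, sigma, num_classes, batch_size=100):
    """SampleWNoise: count class predictions of f on n Gaussian-perturbed copies of x."""
    counts = torch.zeros(num_classes, dtype=torch.long)
    remaining = n
    while remaining > 0:
        b = min(batch_size, remaining)
        noisy = x.unsqueeze(0) + sigma * torch.randn(b, *x.shape)
        preds = f(noisy).argmax(dim=-1)
        counts += torch.bincount(preds, minlength=num_classes)
        remaining -= b
    return counts

def lower_conf_bound(k, n, confidence):
    """LowerConfBnd: one-sided Clopper-Pearson lower bound on the success probability."""
    if k == 0:
        return 0.0
    return float(beta.ppf(1.0 - confidence, k, n - k + 1))

def binom_p_value(n_a, n_total, q=0.5):
    """BinomPValue: probability of at least n_a successes in n_total Bernoulli(q) trials."""
    return binomtest(n_a, n_total, q, alternative="greater").pvalue
```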
Certification
We first recall the robustness guarantee for a smoothed classifier (Theorem 2.1):
See Theorem 2.1.
Unfortunately, computing the exact probabilities is generally intractable. Thus, to allow practical application, CohenRK19 propose Certify (Algorithm 2), utilizing Monte Carlo sampling and confidence bounds: First, we draw $n_0$ samples to determine the majority class $\hat{c}_A$. Then, we draw another $n$ samples to compute a lower bound $\underline{p_A}$ on the success probability, i.e., the probability of the underlying model predicting $\hat{c}_A$ for a perturbed sample, with confidence $1-\alpha$ via the Clopper-Pearson lemma [clopper34confidence]. If $\underline{p_A} > 1/2$, we set $\overline{p_B} = 1 - \underline{p_A}$ and obtain radius $R$ via Theorem 2.1 with confidence $1-\alpha$; else, we abstain (return $\oslash$). See CohenRK19 for a proof.
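A compact Python sketch of this procedure, reusing the (hypothetical) helpers from above:

```python
from scipy.stats import norm

ABSTAIN = None

def certify(f, x, sigma, n0, n, alpha, num_classes):
    """Cohen-style Certify (cf. Algorithm 2): guess the majority class on n0 samples,
    lower-bound its probability on n fresh samples, and convert it into an l2 radius."""
    counts0 = sample_with_noise(f, x, n0, sigma, num_classes)
    c_hat = int(counts0.argmax())
    counts = sample_with_noise(f, x, n, sigma, num_classes)
    p_a = lower_conf_bound(int(counts[c_hat]), n, 1 - alpha)
    if p_a > 0.5:
        return c_hat, sigma * norm.ppf(p_a)   # class and certified radius
    return ABSTAIN, 0.0
```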
Prediction
Computing a confidence bound on the success probability with Certify is computationally expensive, as the number of samples $n$ is typically large. If we are only interested in the class predicted by the smoothed model, we can use the computationally much cheaper Predict (Algorithm 3) proposed by CohenRK19. Instead of sampling in two separate rounds, we draw samples only once and compute the two most frequently predicted classes $\hat{c}_A$ and $\hat{c}_B$ with counts $n_A$ and $n_B$, respectively. Subsequently, we test whether the probability of obtaining at least $n_A$ successes in $n_A + n_B$ fair Bernoulli trials is smaller than $\alpha$, and if so, we have with confidence $1-\alpha$ that the true prediction of the smoothed model is in fact $\hat{c}_A$. See CohenRK19 for a proof.
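Again as a sketch, reusing the helpers and the ABSTAIN convention from above:

```python
def predict(f, x, sigma, n, alpha, num_classes):
    """Cohen-style Predict (cf. Algorithm 3): cheaper, returns only the predicted class."""
    counts = sample_with_noise(f, x, n, sigma, num_classes)
    top2 = counts.topk(2).indices
    c_a, c_b = int(top2[0]), int(top2[1])
    n_a, n_b = int(counts[c_a]), int(counts[c_b])
    # Abstain unless c_a is significantly more frequent than c_b.
    if binom_p_value(n_a, n_a + n_b, 0.5) <= alpha:
        return c_a
    return ABSTAIN
```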
Training for Randomized Smoothing
To obtain high certified radii via Certify, the base model has to be trained specifically to cope with the added noise terms $\epsilon \sim \mathcal{N}(0, \sigma^2 I)$. To achieve this, several training methods have been introduced, which we briefly outline below.
CohenRK19 propose to use data augmentation with Gaussian noise during training. We refer to this as Gaussian. salman2019provably suggest SmoothAdv, combining adversarial training [madry2017towards, KurakinGB17, rony2019decoupling] with data augmentation ideas from Gaussian. While effective in improving accuracy, this training procedure comes with a very high computational cost. zhai2020macer propose Macer as a computationally cheaper alternative with a similar performance by adding a surrogate of the certification radius to the loss and thus more directly optimizing for large radii. jeong2020consistency build on this approach by replacing this term with a more easily optimizable one and proposing what we refer to as Consistency.
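As an illustration of the simplest of these schemes, Gaussian data augmentation, the following PyTorch sketch (our own, with hypothetical names, not the authors' training code) shows a single training step:

```python
import torch
import torch.nn.functional as F

def gaussian_training_step(model, optimizer, x_batch, y_batch, sigma):
    """One step of Gaussian-augmentation training: fit the base classifier on inputs
    perturbed with the same noise distribution that is later used for smoothing."""
    optimizer.zero_grad()
    noisy = x_batch + sigma * torch.randn_like(x_batch)  # perturb with N(0, sigma^2 I)
    loss = F.cross_entropy(model(noisy), y_batch)
    loss.backward()
    optimizer.step()
    return loss.item()
```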
Appendix B Prediction & Certification for Aces
In this section, we recall the certification approach (Algorithm 1) and introduce the prediction approach (Algorithm 4, below) in detail for Aces as discussed in Section 3.
Certification
For an arbitrary but fixed input $x$, we let $F_{\text{ACES}}(x)$ denote the true output of Aces (Eq. 4) under exact evaluation of the probabilities over perturbations (Eq. 1), and we let $R_{\text{Select}}$ and $R_{\text{Cert}}$ denote the robustness radii according to Theorem 2.1 for $\bar{F}_{\text{Select}}$ and $\bar{F}_{\text{Cert}}$, respectively. We now obtain the following guarantee for the outputs of our certification algorithm Certify:
Theorem B.1.
Let $c$ and $R$ denote the class and robustness radius returned by Certify (Algorithm 1) for input $x$. Then, with confidence at least $1-\alpha$, this output, computed via sampling, is the true output of Aces, and Aces is robust within radius $R$ around $x$, provided Certify does not abstain ($c \neq \oslash$).
Proof.
First, we note that, like Certify (Algorithm 2) in CohenRK19, our Certify determines lower bounds on the success probabilities of $\bar{F}_{\text{Select}}$ and $\bar{F}_{\text{Cert}}$, each holding with probability $1-\alpha/2$. These bounds allow us to upper bound the respective runner-up probabilities and thus, via Theorem 2.1, yield the radii $R_{\text{Select}}$ and $R_{\text{Cert}}$.
Thus, if $\bar{F}_{\text{Select}}$ returns $1$ (selecting the certification network) with confidence $1-\alpha/2$ and $\bar{F}_{\text{Cert}}$ returns class $c$ with confidence $1-\alpha/2$, then, via a union bound, we have with confidence $1-\alpha$ that Aces returns $c$. Further, the corresponding success probabilities induce the robustness radii $R_{\text{Select}}$ and $R_{\text{Cert}}$, respectively, via Theorem 2.1, and we obtain the overall robustness radius $R$ as their minimum.
Should $\bar{F}_{\text{Select}}$ return $0$ (selecting the core network) with confidence $1-\alpha/2$, we return the deterministically computed $F_{\text{Core}}(x)$, trivially with confidence $1$. As we only claim robustness with $R = 0$ in this case, the robustness statement is trivially fulfilled.
In case we cannot compute the decision of $\bar{F}_{\text{Select}}$ with sufficient confidence but $F_{\text{Core}}$ and $\bar{F}_{\text{Cert}}$ agree with high confidence, we return the consensus class. We again have, trivially from the deterministic $F_{\text{Core}}$ and from the prediction of $\bar{F}_{\text{Cert}}$ holding with confidence $1-\alpha/2$, an overall confidence of at least $1-\alpha$ that this is indeed the true output of Aces. Finally, in this case we again only claim $R = 0$, which is trivially fulfilled. ∎
Prediction
Let us again consider the setting where, for an arbitrary but fixed input $x$, $F_{\text{ACES}}(x)$ denotes the true output of Aces (Eq. 4) under exact evaluation of the probabilities over perturbations (Eq. 1). However, now we are only interested in the predicted class and not in the robustness radius. We thus introduce Predict (Algorithm 4), which is computationally much cheaper than Certify and for which we obtain the following guarantee:
Theorem B.2.
Let $c$ be the class returned by Predict (Algorithm 4) for input $x$. Then, this output, computed via sampling, is the true output $F_{\text{ACES}}(x)$ with confidence at least $1-\alpha$, if Predict does not abstain.
Proof.
This proof follows analogously to that for Certify (Theorem B.1) from CohenRK19. ∎
Appendix C Experimental Setup Details
In this section, we discuss experimental details. We evaluated Aces on the ImageNet [ImageNet] and the CIFAR10 [cifar] datasets. For ImageNet, we combine ResNet50 [He_2016_CVPR] selection- and certification-networks with EfficientNet-B7 core-networks [TanL19]. For CIFAR10, we use ResNet110 [He_2016_CVPR] selection- and certification-networks, and LaNet [Wang21LaNet]
core-networks. We implement training and inference in PyTorch
[PaszkeGMLBCKLGA19] and conduct all of our experiments on a single GeForce RTX 2080 Ti. As core-networks, we use a pre-trained EfficientNet-B7 (https://github.com/lukemelas/EfficientNet-PyTorch/tree/master/examples/imagenet) and LaNet [Wang21LaNet] for ImageNet and CIFAR10, respectively. As certification-networks, we use pre-trained ResNet50 and ResNet110 from CohenRK19 (Gaussian), salman2019provably (SmoothAdv), and zhai2020macer (Macer). Additionally, we train smoothed models with Consistency [jeong2020consistency] using the parameters reported to yield the largest ACR, except on ImageNet at one noise level, for which no parameters were reported and we chose them ourselves.
We follow previous work [CohenRK19, salman2019provably] and evaluate every 20th image of the CIFAR10 test set and every 100th image of the ImageNet test set [CohenRK19, jeong2020consistency], yielding 500 test samples for each. For both, we use fixed sample counts and confidence levels for certification and for prediction (the latter used to report natural accuracy). To obtain an overall confidence of $1-\alpha$ via Bonferroni correction [bonferroni1936teoria], we certify the selection model and the certification model each with confidence $1-\alpha/2$. To compute the entropy, we use the logarithm with base $m$ (the number of classes), such that the resulting entropies always lie in $[0, 1]$. Certifying and predicting an Aces model on the 500 test samples we consider takes on the order of hours on ImageNet and on CIFAR10 overall, using one RTX 2080 Ti. This includes computations for a wide range of values of the selection threshold θ.
Appendix D Additional Experiments
In this section, we provide a significantly extended evaluation focusing on the following aspects:
In Sections D.2 and D.1, we evaluate Aces for different training methods and a range of noise levels on ImageNet and CIFAR10, respectively.
In Section D.3, we provide an in-depth analysis of the selection-mechanism, considering different measures of selection performance and both entropy-based selection and a separate selection-network.
In Section D.4, we discuss the robustness-accuracy trade-offs obtained by varying the noise level used at inference.
D.1 Additional Results on ImageNet
Training | θ | NAC | ACR | Certified Accuracy at Radius r |
0.0 | 0.25 | 0.5 | 0.75 | 1.0 | 1.25 | 1.5 | 1.75 | 2.0 | ||||
Gaussian | 0.00 | 83.4 | 0.000 | 83.4 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
0.10 | 82.2 | 0.273 | 82.2 | 35.4 | 27.4 | 21.4 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.20 | 80.0 | 0.382 | 80.0 | 47.6 | 40.2 | 30.4 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.30 | 78.6 | 0.431 | 78.6 | 54.6 | 45.6 | 35.2 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.40 | 75.2 | 0.454 | 74.6 | 56.6 | 47.4 | 36.6 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.50 | 72.8 | 0.464 | 71.4 | 58.0 | 48.4 | 37.4 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.60 | 70.4 | 0.467 | 68.6 | 58.4 | 48.8 | 37.8 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.70 | 69.0 | 0.468 | 67.0 | 58.4 | 49.0 | 37.8 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.80 | 68.8 | 0.468 | 66.8 | 58.4 | 49.0 | 37.8 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.90 | 68.8 | 0.468 | 66.8 | 58.4 | 49.0 | 37.8 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
1.00 | 68.8 | 0.468 | 66.8 | 58.4 | 49.0 | 37.8 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
SmoothAdv | 0.00 | 83.4 | 0.000 | 83.4 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
0.10 | 82.8 | 0.269 | 82.8 | 31.6 | 28.6 | 24.8 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.20 | 79.6 | 0.382 | 79.6 | 44.0 | 40.8 | 36.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.30 | 76.6 | 0.435 | 76.6 | 49.4 | 45.4 | 42.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.40 | 72.8 | 0.469 | 72.6 | 53.4 | 48.6 | 45.8 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.50 | 70.4 | 0.489 | 70.2 | 55.6 | 50.6 | 47.2 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.60 | 66.4 | 0.503 | 66.0 | 57.2 | 52.8 | 48.2 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.70 | 65.0 | 0.508 | 64.6 | 57.6 | 53.4 | 49.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.80 | 64.4 | 0.511 | 64.0 | 58.0 | 53.4 | 49.2 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.90 | 64.4 | 0.511 | 64.0 | 58.0 | 53.4 | 49.2 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
1.00 | 64.4 | 0.511 | 64.0 | 58.0 | 53.4 | 49.2 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
Consistency | 0.00 | 83.4 | 0.000 | 83.4 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
0.10 | 80.4 | 0.390 | 80.4 | 45.0 | 41.4 | 36.4 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.20 | 76.2 | 0.466 | 76.2 | 53.0 | 49.0 | 44.8 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.30 | 71.2 | 0.492 | 71.2 | 57.0 | 52.2 | 47.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.40 | 67.8 | 0.505 | 67.2 | 58.6 | 53.6 | 48.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.50 | 63.6 | 0.508 | 63.2 | 58.8 | 53.8 | 48.4 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.60 | 63.6 | 0.509 | 63.0 | 58.8 | 54.0 | 48.4 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.70 | 63.6 | 0.509 | 63.2 | 58.8 | 54.0 | 48.6 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.80 | 63.8 | 0.509 | 63.2 | 58.8 | 54.0 | 48.6 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.90 | 63.8 | 0.509 | 63.2 | 58.8 | 54.0 | 48.6 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
1.00 | 63.8 | 0.509 | 63.2 | 58.8 | 54.0 | 48.6 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
Training | θ | NAC | ACR | Certified Selection Rate at Radius r |
0.0 | 0.25 | 0.5 | 0.75 | 1.0 | 1.25 | 1.5 | 1.75 | 2.0 | ||||
Gaussian | 0.00 | 83.4 | 0.000 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
0.10 | 82.2 | 0.273 | 47.0 | 37.8 | 29.0 | 22.2 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.20 | 80.0 | 0.382 | 66.0 | 57.0 | 48.8 | 39.8 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.30 | 78.6 | 0.431 | 76.6 | 70.4 | 61.4 | 53.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.40 | 75.2 | 0.454 | 86.2 | 80.0 | 72.4 | 64.6 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.50 | 72.8 | 0.464 | 92.4 | 87.4 | 81.8 | 75.4 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.60 | 70.4 | 0.467 | 97.4 | 95.4 | 91.0 | 85.2 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.70 | 69.0 | 0.468 | 99.8 | 99.2 | 97.2 | 95.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.80 | 68.8 | 0.468 | 100.0 | 100.0 | 99.8 | 99.6 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.90 | 68.8 | 0.468 | 100.0 | 100.0 | 100.0 | 100.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
1.00 | 68.8 | 0.468 | 100.0 | 100.0 | 100.0 | 100.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
SmoothAdv | 0.00 | 83.4 | 0.000 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
0.10 | 82.8 | 0.269 | 37.8 | 34.2 | 30.8 | 26.2 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.20 | 79.6 | 0.382 | 56.6 | 53.0 | 48.2 | 43.2 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.30 | 76.6 | 0.435 | 70.2 | 66.0 | 61.6 | 56.6 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.40 | 72.8 | 0.469 | 80.6 | 76.4 | 71.8 | 68.6 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.50 | 70.4 | 0.489 | 88.8 | 84.6 | 82.0 | 79.2 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.60 | 66.4 | 0.503 | 95.6 | 93.6 | 91.2 | 88.2 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.70 | 65.0 | 0.508 | 98.4 | 98.0 | 97.6 | 96.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.80 | 64.4 | 0.511 | 100.0 | 99.8 | 99.8 | 99.4 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.90 | 64.4 | 0.511 | 100.0 | 100.0 | 100.0 | 100.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
1.00 | 64.4 | 0.511 | 100.0 | 100.0 | 100.0 | 100.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
Consistency | 0.00 | 83.4 | 0.000 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
0.10 | 80.4 | 0.390 | 55.4 | 51.0 | 46.6 | 40.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.20 | 76.2 | 0.466 | 72.4 | 67.8 | 61.0 | 57.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.30 | 71.2 | 0.492 | 86.0 | 80.4 | 75.2 | 71.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.40 | 67.8 | 0.505 | 93.0 | 89.8 | 87.2 | 82.2 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.50 | 63.6 | 0.508 | 98.4 | 96.4 | 94.2 | 91.8 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.60 | 63.6 | 0.509 | 99.8 | 99.2 | 98.8 | 98.2 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.70 | 63.6 | 0.509 | 100.0 | 99.8 | 99.8 | 99.8 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.80 | 63.8 | 0.509 | 100.0 | 100.0 | 100.0 | 100.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.90 | 63.8 | 0.509 | 100.0 | 100.0 | 100.0 | 100.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
1.00 | 63.8 | 0.509 | 100.0 | 100.0 | 100.0 | 100.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
Training | θ | NAC | ACR | Certified Accuracy at Radius r |
0.0 | 0.25 | 0.5 | 0.75 | 1.0 | 1.25 | 1.5 | 1.75 | 2.0 | ||||
Gaussian | 0.00 | 83.4 | 0.000 | 83.4 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
0.10 | 82.4 | 0.380 | 82.4 | 29.2 | 25.6 | 23.2 | 20.0 | 16.4 | 13.4 | 9.8 | 0.0 | |
0.20 | 78.6 | 0.536 | 78.4 | 38.8 | 33.4 | 30.6 | 28.6 | 24.4 | 21.2 | 16.4 | 0.0 | |
0.30 | 74.2 | 0.619 | 73.4 | 44.0 | 40.2 | 35.2 | 31.4 | 29.0 | 24.6 | 19.6 | 0.0 | |
0.40 | 70.8 | 0.665 | 69.4 | 47.8 | 43.0 | 38.2 | 33.2 | 30.6 | 26.2 | 20.2 | 0.0 | |
0.50 | 65.8 | 0.693 | 64.4 | 50.2 | 44.4 | 40.6 | 35.4 | 31.8 | 27.0 | 20.8 | 0.0 | |
0.60 | 62.4 | 0.712 | 60.6 | 51.0 | 45.6 | 42.0 | 36.8 | 33.0 | 27.8 | 21.2 | 0.0 | |
0.70 | 59.8 | 0.716 | 57.4 | 51.4 | 45.6 | 42.4 | 37.2 | 33.0 | 28.2 | 21.4 | 0.0 | |
0.80 | 60.0 | 0.717 | 57.4 | 51.6 | 45.8 | 42.4 | 37.2 | 33.0 | 28.2 | 21.4 | 0.0 | |
0.90 | 59.8 | 0.717 | 57.2 | 51.6 | 45.8 | 42.4 | 37.2 | 33.0 | 28.2 | 21.4 | 0.0 | |
1.00 | 59.8 | 0.717 | 57.2 | 51.6 | 45.8 | 42.4 | 37.2 | 33.0 | 28.2 | 21.4 | 0.0 | |
SmoothAdv | 0.00 | 83.4 | 0.000 | 83.4 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
0.10 | 83.2 | 0.308 | 83.2 | 20.2 | 18.2 | 17.2 | 16.4 | 14.8 | 13.0 | 11.4 | 0.0 | |
0.20 | 81.0 | 0.486 | 81.2 | 31.6 | 29.6 | 26.8 | 24.4 | 23.6 | 21.2 | 19.6 | 0.0 | |
0.30 | 76.8 | 0.592 | 77.0 | 37.6 | 35.2 | 33.2 | 31.2 | 28.8 | 26.8 | 24.0 | 0.0 | |
0.40 | 73.2 | 0.661 | 73.4 | 42.2 | 39.6 | 36.6 | 34.2 | 31.8 | 29.8 | 27.8 | 0.0 | |
0.50 | 68.2 | 0.716 | 68.4 | 46.2 | 43.0 | 39.4 | 36.8 | 34.0 | 32.0 | 30.2 | 0.0 | |
0.60 | 63.4 | 0.765 | 63.2 | 49.8 | 46.0 | 42.2 | 39.6 | 36.2 | 34.2 | 31.0 | 0.0 | |
0.70 | 57.8 | 0.791 | 57.4 | 51.4 | 47.4 | 43.4 | 41.0 | 37.8 | 35.4 | 32.2 | 0.0 | |
0.80 | 55.6 | 0.806 | 55.0 | 52.4 | 48.6 | 44.2 | 41.8 | 38.6 | 35.6 | 32.8 | 0.0 | |
0.90 | 55.6 | 0.809 | 55.0 | 52.6 | 48.8 | 44.4 | 42.2 | 38.8 | 35.6 | 32.8 | 0.0 | |
1.00 | 55.6 | 0.809 | 55.0 | 52.6 | 48.8 | 44.4 | 42.2 | 38.8 | 35.6 | 32.8 | 0.0 | |
Consistency | 0.00 | 83.4 | 0.000 | 83.4 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
0.10 | 80.0 | 0.530 | 80.0 | 33.6 | 32.6 | 30.2 | 28.2 | 25.6 | 23.0 | 19.6 | 0.0 | |
0.20 | 75.4 | 0.682 | 75.0 | 43.6 | 41.2 | 38.2 | 35.8 | 33.4 | 30.0 | 27.0 | 0.0 | |
0.30 | 68.8 | 0.744 | 68.2 | 48.4 | 44.4 | 41.6 | 39.2 | 35.6 | 32.8 | 29.2 | 0.0 | |
0.40 | 62.4 | 0.777 | 61.6 | 50.2 | 47.6 | 43.8 | 40.2 | 37.4 | 33.4 | 30.8 | 0.0 | |
0.50 | 59.2 | 0.795 | 57.6 | 51.4 | 48.2 | 45.0 | 41.8 | 38.6 | 34.4 | 30.8 | 0.0 | |
0.60 | 57.2 | 0.799 | 55.4 | 51.6 | 48.8 | 45.0 | 42.0 | 39.0 | 34.6 | 31.0 | 0.0 | |
0.70 | 57.2 | 0.800 | 55.4 | 51.6 | 48.8 | 45.2 | 42.2 | 39.0 | 34.6 | 31.0 | 0.0 | |
0.80 | 57.2 | 0.800 | 55.4 | 51.6 | 48.8 | 45.2 | 42.2 | 39.0 | 34.6 | 31.0 | 0.0 | |
0.90 | 57.2 | 0.800 | 55.4 | 51.6 | 48.8 | 45.2 | 42.2 | 39.0 | 34.6 | 31.0 | 0.0 | |
1.00 | 57.2 | 0.800 | 55.4 | 51.6 | 48.8 | 45.2 | 42.2 | 39.0 | 34.6 | 31.0 | 0.0 |
Training | θ | NAC | ACR | Certified Selection Rate at Radius r |
0.0 | 0.25 | 0.5 | 0.75 | 1.0 | 1.25 | 1.5 | 1.75 | 2.0 | ||||
Gaussian | 0.00 | 83.4 | 0.000 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
0.10 | 82.4 | 0.380 | 35.0 | 31.6 | 27.6 | 24.2 | 20.6 | 16.8 | 13.8 | 10.2 | 0.0 | |
0.20 | 78.6 | 0.536 | 53.2 | 47.8 | 41.6 | 37.2 | 35.6 | 30.8 | 27.2 | 23.4 | 0.0 | |
0.30 | 74.2 | 0.619 | 68.2 | 61.8 | 57.2 | 49.6 | 45.2 | 41.2 | 37.6 | 32.2 | 0.0 | |
0.40 | 70.8 | 0.665 | 78.2 | 73.0 | 69.4 | 62.8 | 58.2 | 53.2 | 47.0 | 40.6 | 0.0 | |
0.50 | 65.8 | 0.693 | 88.4 | 83.8 | 79.8 | 74.4 | 71.0 | 65.0 | 59.2 | 51.8 | 0.0 | |
0.60 | 62.4 | 0.712 | 94.6 | 91.6 | 89.6 | 86.2 | 82.0 | 78.8 | 74.2 | 65.4 | 0.0 | |
0.70 | 59.8 | 0.716 | 99.2 | 97.6 | 95.8 | 94.4 | 92.2 | 90.2 | 88.0 | 82.0 | 0.0 | |
0.80 | 60.0 | 0.717 | 99.8 | 99.8 | 99.6 | 99.6 | 99.6 | 99.4 | 97.8 | 96.6 | 0.0 | |
0.90 | 59.8 | 0.717 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 99.8 | 99.8 | 99.8 | 0.0 | |
1.00 | 59.8 | 0.717 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 0.0 | |
SmoothAdv | 0.00 | 83.4 | 0.000 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
0.10 | 83.2 | 0.308 | 24.6 | 21.8 | 19.8 | 18.6 | 17.6 | 15.8 | 14.0 | 12.4 | 0.0 | |
0.20 | 81.0 | 0.486 | 40.6 | 38.0 | 35.4 | 32.4 | 30.2 | 29.0 | 26.6 | 25.2 | 0.0 | |
0.30 | 76.8 | 0.592 | 52.4 | 50.4 | 48.0 | 45.4 | 43.8 | 41.2 | 37.4 | 34.6 | 0.0 | |
0.40 | 73.2 | 0.661 | 62.8 | 59.8 | 57.8 | 56.2 | 53.4 | 51.6 | 49.6 | 46.8 | 0.0 | |
0.50 | 68.2 | 0.716 | 73.4 | 71.4 | 69.2 | 66.8 | 64.2 | 61.2 | 59.2 | 56.8 | 0.0 | |
0.60 | 63.4 | 0.765 | 86.4 | 83.6 | 81.6 | 78.2 | 76.6 | 74.8 | 71.8 | 69.0 | 0.0 | |
0.70 | 57.8 | 0.791 | 95.4 | 94.0 | 92.2 | 90.4 | 89.6 | 87.4 | 85.2 | 83.8 | 0.0 | |
0.80 | 55.6 | 0.806 | 99.6 | 99.0 | 98.8 | 98.8 | 98.4 | 98.0 | 97.0 | 95.6 | 0.0 | |
0.90 | 55.6 | 0.809 | 100.0 | 100.0 | 100.0 | 100.0 | 99.8 | 99.8 | 99.8 | 99.8 | 0.0 | |
1.00 | 55.6 | 0.809 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 0.0 | |
Consistency | 0.00 | 83.4 | 0.000 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
0.10 | 80.0 | 0.530 | 45.0 | 40.2 | 37.2 | 34.0 | 31.8 | 28.2 | 25.0 | 21.0 | 0.0 | |
0.20 | 75.4 | 0.682 | 63.8 | 58.6 | 55.6 | 50.6 | 47.8 | 45.2 | 40.8 | 36.4 | 0.0 | |
0.30 | 68.8 | 0.744 | 78.0 | 74.2 | 70.2 | 66.2 | 62.8 | 59.0 | 55.0 | 50.6 | 0.0 | |
0.40 | 62.4 | 0.777 | 90.2 | 85.4 | 82.8 | 80.2 | 76.4 | 72.6 | 67.0 | 63.6 | 0.0 | |
0.50 | 59.2 | 0.795 | 96.8 | 95.2 | 93.2 | 90.8 | 88.0 | 84.4 | 81.8 | 78.4 | 0.0 | |
0.60 | 57.2 | 0.799 | 99.8 | 99.4 | 99.0 | 98.2 | 97.4 | 96.6 | 94.6 | 91.0 | 0.0 | |
0.70 | 57.2 | 0.800 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 99.8 | 99.0 | 0.0 | |
0.80 | 57.2 | 0.800 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 0.0 | |
0.90 | 57.2 | 0.800 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 0.0 | |
1.00 | 57.2 | 0.800 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 0.0 |
Training | θ | NAC | ACR | Certified Accuracy at Radius r |
0.0 | 0.5 | 1.0 | 1.5 | 2.0 | 2.5 | 3.0 | 3.5 | 4.0 | ||||
Gaussian | 0.00 | 83.4 | 0.000 | 83.4 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
0.10 | 83.0 | 0.322 | 83.0 | 15.0 | 11.8 | 9.4 | 7.6 | 5.2 | 4.2 | 3.2 | 0.0 | |
0.20 | 80.6 | 0.513 | 80.6 | 22.8 | 18.2 | 15.2 | 11.8 | 9.4 | 7.4 | 4.6 | 0.0 | |
0.30 | 75.8 | 0.650 | 75.2 | 28.6 | 23.6 | 19.4 | 14.4 | 11.2 | 9.6 | 6.2 | 0.0 | |
0.40 | 71.0 | 0.741 | 70.4 | 32.4 | 27.6 | 22.2 | 16.8 | 12.6 | 10.4 | 7.4 | 0.0 | |
0.50 | 64.2 | 0.801 | 62.4 | 34.8 | 29.6 | 24.4 | 18.6 | 13.8 | 11.6 | 8.2 | 0.0 | |
0.60 | 56.4 | 0.846 | 54.6 | 37.2 | 31.6 | 25.4 | 19.0 | 14.6 | 12.0 | 8.6 | 0.0 | |
0.70 | 50.0 | 0.860 | 46.8 | 37.8 | 32.6 | 25.4 | 19.2 | 14.6 | 12.0 | 8.8 | 0.0 | |
0.80 | 47.6 | 0.862 | 43.8 | 37.8 | 32.6 | 25.8 | 19.4 | 14.6 | 12.0 | 8.8 | 0.0 | |
0.90 | 47.4 | 0.862 | 43.6 | 37.8 | 32.6 | 25.8 | 19.4 | 14.6 | 12.0 | 8.8 | 0.0 | |
1.00 | 47.4 | 0.862 | 43.6 | 37.8 | 32.6 | 25.8 | 19.4 | 14.6 | 12.0 | 8.8 | 0.0 | |
SmoothAdv | 0.00 | 83.4 | 0.000 | 83.4 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
0.10 | 83.4 | 0.254 | 83.4 | 9.2 | 8.4 | 7.2 | 6.0 | 5.8 | 4.6 | 4.6 | 0.0 | |
0.20 | 82.2 | 0.407 | 82.2 | 14.2 | 12.2 | 11.4 | 10.0 | 9.2 | 8.8 | 7.6 | 0.0 | |
0.30 | 79.8 | 0.541 | 79.6 | 18.4 | 17.0 | 16.0 | 14.2 | 12.0 | 10.6 | 9.8 | 0.0 | |
0.40 | 76.8 | 0.653 | 76.6 | 22.2 | 20.6 | 19.0 | 16.8 | 15.2 | 13.4 | 11.6 | 0.0 | |
0.50 | 70.8 | 0.755 | 70.6 | 26.4 | 23.8 | 21.8 | 19.2 | 17.0 | 15.4 | 13.2 | 0.0 | |
0.60 | 64.2 | 0.854 | 63.6 | 30.0 | 27.4 | 24.8 | 22.4 | 18.6 | 16.8 | 14.6 | 0.0 | |
0.70 | 53.2 | 0.933 | 52.6 | 32.2 | 30.2 | 27.2 | 23.8 | 20.4 | 19.2 | 16.0 | 0.0 | |
0.80 | 44.0 | 0.985 | 43.4 | 34.6 | 31.2 | 28.6 | 25.2 | 21.8 | 19.8 | 16.6 | 0.0 | |
0.90 | 39.8 | 0.999 | 39.2 | 35.2 | 32.0 | 29.2 | 25.6 | 22.0 | 19.8 | 16.6 | 0.0 | |
1.00 | 39.8 | 0.999 | 39.2 | 35.2 | 32.0 | 29.2 | 25.6 | 22.0 | 19.8 | 16.6 | 0.0 | |
Consistency | 0.00 | 83.4 | 0.000 | 83.4 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
0.10 | 82.8 | 0.375 | 82.8 | 14.0 | 12.6 | 11.0 | 9.6 | 8.6 | 6.8 | 5.6 | 0.0 | |
0.20 | 81.8 | 0.559 | 81.8 | 22.2 | 19.2 | 15.4 | 13.2 | 11.8 | 10.8 | 8.0 | 0.0 | |
0.30 | 78.4 | 0.698 | 77.8 | 27.6 | 24.0 | 20.2 | 17.6 | 14.6 | 12.4 | 10.2 | 0.0 | |
0.40 | 72.8 | 0.800 | 72.4 | 31.0 | 28.2 | 23.8 | 20.2 | 18.0 | 14.0 | 10.6 | 0.0 | |
0.50 | 66.8 | 0.881 | 66.2 | 34.0 | 30.4 | 25.8 | 22.6 | 20.4 | 14.4 | 11.6 | 0.0 | |
0.60 | 57.8 | 0.941 | 56.8 | 36.8 | 32.6 | 27.6 | 23.6 | 21.6 | 15.8 | 11.6 | 0.0 | |
0.70 | 51.8 | 0.979 | 50.6 | 39.0 | 34.0 | 28.4 | 24.0 | 22.0 | 16.6 | 12.0 | 0.0 | |
0.80 | 46.0 | 0.996 | 44.6 | 39.4 | 35.0 | 29.4 | 24.4 | 22.0 | 16.6 | 12.0 | 0.0 | |
0.90 | 45.0 | 0.997 | 43.2 | 39.6 | 35.0 | 29.4 | 24.4 | 22.0 | 16.6 | 12.0 | 0.0 | |
1.00 | 45.0 | 0.997 | 43.2 | 39.6 | 35.0 | 29.4 | 24.4 | 22.0 | 16.6 | 12.0 | 0.0 |
Training | θ | NAC | ACR | Certified Selection Rate at Radius r |
0.0 | 0.5 | 1.0 | 1.5 | 2.0 | 2.5 | 3.0 | 3.5 | 4.0 | ||||
Gaussian | 0.00 | 83.4 | 0.000 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
0.10 | 83.0 | 0.322 | 19.8 | 16.4 | 12.8 | 10.2 | 8.2 | 5.6 | 4.4 | 3.4 | 0.0 | |
0.20 | 80.6 | 0.513 | 34.2 | 28.4 | 22.6 | 19.2 | 16.0 | 13.4 | 10.0 | 7.0 | 0.0 | |
0.30 | 75.8 | 0.650 | 48.0 | 41.0 | 34.4 | 29.0 | 23.8 | 19.4 | 15.4 | 11.2 | 0.0 | |
0.40 | 71.0 | 0.741 | 60.2 | 53.0 | 47.0 | 40.8 | 34.6 | 28.6 | 23.6 | 16.8 | 0.0 | |
0.50 | 64.2 | 0.801 | 73.8 | 65.4 | 57.8 | 52.8 | 47.8 | 40.4 | 33.8 | 24.6 | 0.0 | |
0.60 | 56.4 | 0.846 | 85.6 | 79.6 | 73.4 | 66.0 | 59.8 | 53.2 | 47.8 | 37.2 | 0.0 | |
0.70 | 50.0 | 0.860 | 95.8 | 93.0 | 89.8 | 84.6 | 78.0 | 70.8 | 64.6 | 55.8 | 0.0 | |
0.80 | 47.6 | 0.862 | 99.8 | 99.6 | 99.4 | 97.4 | 95.8 | 93.6 | 89.6 | 82.0 | 0.0 | |
0.90 | 47.4 | 0.862 | 100.0 | 100.0 | 100.0 | 99.8 | 99.8 | 99.8 | 99.8 | 99.6 | 0.0 | |
1.00 | 47.4 | 0.862 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 0.0 | |
SmoothAdv | 0.00 | 83.4 | 0.000 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
0.10 | 83.4 | 0.254 | 11.8 | 10.4 | 9.4 | 8.2 | 6.8 | 6.2 | 5.0 | 5.0 | 0.0 | |
0.20 | 82.2 | 0.407 | 20.0 | 18.0 | 16.0 | 14.4 | 13.2 | 12.2 | 11.2 | 9.8 | 0.0 | |
0.30 | 79.8 | 0.541 | 28.2 | 25.2 | 23.4 | 21.8 | 20.0 | 16.6 | 14.2 | 13.0 | 0.0 | |
0.40 | 76.8 | 0.653 | 36.4 | 33.2 | 31.4 | 28.8 | 27.0 | 25.2 | 22.6 | 19.4 | 0.0 | |
0.50 | 70.8 | 0.755 | 49.0 | 45.0 | 42.0 | 38.8 | 34.6 | 32.6 | 29.4 | 26.4 | 0.0 | |
0.60 | 64.2 | 0.854 | 62.4 | 58.2 | 54.2 | 51.2 | 48.4 | 44.8 | 40.2 | 36.4 | 0.0 | |
0.70 | 53.2 | 0.933 | 77.6 | 75.2 | 72.2 | 67.4 | 64.0 | 61.0 | 57.2 | 52.4 | 0.0 | |
0.80 | 44.0 | 0.985 | 94.0 | 92.4 | 91.2 | 89.0 | 86.4 | 83.0 | 79.0 | 75.0 | 0.0 | |
0.90 | 39.8 | 0.999 | 100.0 | 99.6 | 99.2 | 98.8 | 98.8 | 98.8 | 98.2 | 97.4 | 0.0 | |
1.00 | 39.8 | 0.999 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 0.0 | |
Consistency | 0.00 | 83.4 | 0.000 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
0.10 | 82.8 | 0.375 | 19.4 | 16.2 | 14.8 | 13.0 | 11.0 | 9.8 | 7.6 | 6.4 | 0.0 | |
0.20 | 81.8 | 0.559 | 32.2 | 27.0 | 24.4 | 21.4 | 18.8 | 16.4 | 14.6 | 11.6 | 0.0 | |
0.30 | 78.4 | 0.698 | 41.4 | 39.0 | 34.8 | 31.6 | 27.2 | 23.4 | 20.2 | 16.8 | 0.0 | |
0.40 | 72.8 | 0.800 | 51.8 | 47.8 | 43.6 | 40.6 | 36.4 | 33.0 | 28.8 | 24.0 | 0.0 | |
0.50 | 66.8 | 0.881 | 63.6 | 57.4 | 53.0 | 50.0 | 46.8 | 42.4 | 37.6 | 32.2 | 0.0 | |
0.60 | 57.8 | 0.941 | 79.2 | 74.2 | 70.2 | 64.2 | 58.6 | 53.4 | 48.8 | 45.6 | 0.0 | |
0.70 | 51.8 | 0.979 | 90.4 | 87.4 | 84.0 | 80.2 | 75.4 | 71.2 | 67.4 | 60.4 | 0.0 | |
0.80 | 46.0 | 0.996 | 97.6 | 96.8 | 96.2 | 95.2 | 93.6 | 90.6 | 88.2 | 83.8 | 0.0 | |
0.90 | 45.0 | 0.997 | 100.0 | 100.0 | 100.0 | 100.0 | 99.8 | 99.8 | 99.4 | 98.8 | 0.0 | |
1.00 | 45.0 | 0.997 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 0.0 |
In this section, we evaluate Aces on ImageNet for a wide range of training methods (Gaussian, SmoothAdv, and Consistency) and noise levels $\sigma$. In particular, Tables 2 and 3 provide detailed results on the certified accuracies obtained by Aces and the corresponding certified selection rates for the smallest noise level, while Tables 4 and 5 and Tables 6 and 7 contain the analogous results for the two larger noise levels.
In Fig. 4, we visualize the trade-off between natural and certified accuracy at fixed radii for Aces (blue and orange dots) and individual smoothed models (green triangles). We observe that Aces achieves significant certified accuracies at natural accuracies not achievable at all by conventional smoothed models.
For example, the highest natural accuracy among the individual Consistency-trained smoothed models is only attained at the smallest noise level, which comes with a comparatively low certified accuracy. Aces, in contrast, can combine a certification-network trained with a larger noise level with the accurate core-network to obtain a similar natural accuracy at a much higher certified accuracy.
D.2 Additional Results on CIFAR10
Training | θ | NAC | ACR | Certified Accuracy at Radius r |
0.0 | 0.25 | 0.5 | 0.75 | 1.0 | 1.25 | 1.5 | 1.75 | 2.00 | ||||
Gaussian | 0.00 | 99.0 | 0.000 | 99.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
0.10 | 98.0 | 0.189 | 98.0 | 28.8 | 18.4 | 9.6 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.20 | 96.4 | 0.247 | 96.6 | 35.8 | 24.8 | 14.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.30 | 94.6 | 0.303 | 94.8 | 41.6 | 29.6 | 19.6 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.40 | 90.4 | 0.358 | 90.6 | 49.8 | 35.8 | 22.6 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.50 | 85.4 | 0.397 | 85.2 | 56.0 | 40.4 | 24.2 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.60 | 81.6 | 0.416 | 79.8 | 59.4 | 42.2 | 25.4 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.70 | 78.2 | 0.421 | 76.0 | 60.0 | 42.8 | 25.6 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.80 | 77.8 | 0.422 | 75.4 | 60.0 | 42.8 | 25.6 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.90 | 77.8 | 0.422 | 75.4 | 60.0 | 42.8 | 25.6 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
1.00 | 77.8 | 0.422 | 75.4 | 60.0 | 42.8 | 25.6 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
SmoothAdv | 0.00 | 99.0 | 0.000 | 99.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
0.10 | 98.8 | 0.161 | 98.8 | 19.6 | 17.4 | 13.4 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.20 | 98.4 | 0.222 | 98.4 | 27.6 | 22.6 | 18.4 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.30 | 97.4 | 0.288 | 97.4 | 35.4 | 29.4 | 24.8 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.40 | 94.8 | 0.352 | 94.6 | 43.0 | 37.2 | 29.6 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.50 | 92.4 | 0.414 | 92.2 | 50.2 | 43.4 | 36.2 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.60 | 88.0 | 0.470 | 88.0 | 55.2 | 50.2 | 41.6 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.70 | 80.8 | 0.515 | 80.2 | 62.4 | 53.6 | 45.2 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.80 | 76.2 | 0.538 | 75.4 | 65.8 | 55.8 | 46.8 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.90 | 74.2 | 0.544 | 73.4 | 66.8 | 57.2 | 47.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
1.00 | 74.2 | 0.544 | 73.4 | 66.8 | 57.2 | 47.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
Macer | 0.00 | 99.0 | 0.000 | 99.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
0.10 | 95.8 | 0.328 | 96.0 | 43.6 | 33.2 | 23.6 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.20 | 92.8 | 0.389 | 92.6 | 51.0 | 39.4 | 29.2 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.30 | 90.2 | 0.438 | 90.0 | 56.4 | 43.8 | 33.4 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.40 | 87.0 | 0.481 | 86.4 | 62.8 | 48.6 | 37.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
0.50 | 82.2 | 0.504 | 81.4 | 67.6 | 51.4 | 38.0 | 0.0 | 0.0 | 0.0 | 0.0 |