
Evolving Robust Neural Architectures to Defend from Adversarial Attacks

Deep neural networks have been shown to misclassify slightly modified input images. Recently, many defenses have been proposed, but none has consistently improved the robustness of neural networks. Here, we propose to use attacks as a function evaluation to automatically search for architectures that can resist such attacks. Experiments on neural architecture search algorithms from the literature show that, despite their accurate results, they are not able to find robust architectures. Much of the reason for this lies in their limited search space. By creating a novel neural architecture search with options for dense layers to connect with convolution layers and vice-versa, as well as the addition of multiplication, addition and concatenation layers to the search space, we were able to evolve an architecture that is 58% accurate on adversarial samples. Interestingly, this inherent robustness of the evolved architecture rivals state-of-the-art defenses such as adversarial training while being trained only on the training dataset. Moreover, the evolved architecture makes use of some peculiar traits which might be useful for developing even more robust ones. Thus, the results here demonstrate that more robust architectures exist and open up a new range of possibilities for the development and exploration of deep neural networks using automatic architecture search. Code available at






1 Introduction

Automatic architecture search (AAS) and adversarial samples have rarely appeared together. Adversarial samples were discovered in 2013, when DNNs were shown to behave strangely for nearly identical images 3. Afterwards, a series of vulnerabilities were found 7, moosavi2017universal, brown2017adversarial, su2017one. Such attacks can also be easily applied to real-world scenarios kurakin2016adversarial, athalye2017synthesizing, which poses a serious problem for current applications of deep neural networks. Currently, there is no known learning algorithm or procedure that can consistently defend against adversarial attacks.

Regarding AAS, the automatic design of architectures has been of wide interest for many years. The aim is to develop methods that do not need specialists in order to be applied to a different application. This would confer not only generality but also ease of use. Most of the algorithms for AAS are based either on reinforcement learning zoph2016neural, zoph2018learning, pham2018efficient, baker2016designing or on evolutionary computation xie2017genetic, miikkulainen2019evolving, real2017large, liu2017hierarchical, real2018regularized. On the one hand, in reinforcement learning approaches, architectures are created from a sequence of actions which are afterwards rewarded proportionally to the crafted architecture's accuracy. On the other hand, in evolutionary computation based methods, small changes in the architecture (mutations) and recombinations (crossover) are used to create new architectures. All architectures are evaluated based on their accuracy, with some of the best architectures chosen to continue to the next generation.

Here we propose the use of AAS to tackle the robustness issues exposed by adversarial samples. In other words, architecture search will be employed not to find accurate neural networks but robust ones. This is based on the principle that the robustness of neural networks can be evaluated by using adversarial attacks as function evaluation. We hypothesize that if there is a solution in a given architecture search space, the search algorithm would be able to find it. This is not only a blind search for a cure. The best architectures found should also hint at which structures and procedures provide robustness for neural networks. Therefore, it would be possible to use the results of the search to further understand how to improve the representation of models as well as to design yet more robust ones.

2 Adversarial Machine Learning

Adversarial machine learning is a constrained optimization problem. Let $f(x) \in \mathbb{R}^{k}$ be the output of a machine learning algorithm in which $x \in \mathbb{R}^{m \times n \times 3}$ is the input of the algorithm, for input and output of sizes $m \times n \times 3$ (images with three channels are considered) and $k$ respectively. Adversarial samples $x'$ can be defined as follows:

$$x' = x + \epsilon_x \quad \text{such that} \quad \arg\max_j f(x')_j \neq \arg\max_i f(x)_i$$

in which $\epsilon_x \in \mathbb{R}^{m \times n \times 3}$ is a small perturbation added to the input.

Therefore, adversarial machine learning can be defined as an optimization problem (here the definition only concerns untargeted attacks, but a similar optimization problem can be defined for targeted attacks):

$$\underset{\epsilon_x}{\text{minimize}} \; f(x + \epsilon_x)_c \quad \text{subject to} \quad \|\epsilon_x\| \leq th$$

where $f(\cdot)_c$ and $th$ are respectively the soft-label for the correct class $c$ and a threshold value. Moreover, attacks can be divided according to the function optimized. In this way, there are $L_0$ (limited number of pixels attacked), $L_1$, $L_2$ and $L_\infty$ (limited amount of variation in each pixel) types of attacks.
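The attack families named above differ only in how the size of the perturbation is measured. The following is a minimal sketch of those measurements, assuming the standard norm definitions and treating an image as a flat list of values (so the L0 count here is over individual values, not whole pixels). The function names `perturbation_norms` and `within_threshold` are illustrative and do not come from the paper's code.

```python
import math

def perturbation_norms(x, x_adv):
    """Return the L0, L1, L2 and L-infinity sizes of the perturbation x_adv - x.

    Both inputs are flat lists of numeric values with the same length.
    """
    eps = [a - b for a, b in zip(x_adv, x)]
    return {
        "L0": sum(1 for e in eps if e != 0),       # how many values were changed
        "L1": sum(abs(e) for e in eps),            # total absolute change
        "L2": math.sqrt(sum(e * e for e in eps)),  # Euclidean size of the change
        "Linf": max(abs(e) for e in eps),          # largest change to any one value
    }

def within_threshold(x, x_adv, norm, th):
    """Check the constraint that the perturbation size does not exceed th."""
    return perturbation_norms(x, x_adv)[norm] <= th

x = [10, 20, 30, 40]
x_adv = [10, 23, 30, 36]  # two values changed, by +3 and -4
norms = perturbation_norms(x, x_adv)
```

An attack with a budget of one modified value would reject this perturbation (`within_threshold(x, x_adv, "L0", 1)` is false), while an attack allowing a variation of up to 5 per value would accept it.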

There are many types of attacks as well as improvements on them. Universal perturbations were shown to be possible, in which a single perturbation added to most of the samples is capable of fooling a DNN in most cases moosavi2017universal. Image patches are also able to make a DNN misclassify brown2017adversarial. Moreover, an extreme attack that modifies only one pixel ($L_0 = 1$), called the one pixel attack, was shown to be surprisingly effective su2017one. In fact, most of these attacks can be easily transferred to real scenarios by using printed-out versions of them kurakin2016adversarial. Moreover, carefully crafted glasses sharif2016accessorize or even general 3D adversarial objects are also capable of causing misclassification athalye2017synthesizing. Regarding the understanding of the phenomenon, it is argued in goodfellow2014explaining that the linearity of DNNs is one of the main reasons. Another recent investigation proposes the conflicting saliency added by adversarial samples as the reason for misclassification vargas2019understanding.

Many defensive systems were proposed to mitigate some of the problems. However, current solutions are still far from solving them. Defensive distillation uses a smaller neural network to learn the content from the original one papernot2016distillation; however, it was shown to not be robust enough carlini2017towards. The addition of adversarial samples to the training dataset, called adversarial training, was also proposed goodfellow2014explaining, huang2015learning, madry2017towards. However, adversarial training has a strong bias towards the type of adversarial samples used and is still vulnerable to attacks tramer2017ensemble. Many recent variations of defenses were proposed ma2018characterizing, guo2017countering, song2017pixeldefend; these are carefully analyzed and many of their shortcomings explained in athalye2018obfuscated, uesato2018adversarial.

In this paper, differently from previous approaches, we aim to tackle the robustness problems of DNNs by automatically searching for inherently robust architectures.

3 Architecture Search

There are three components to a neural architecture search: search space, search strategy and performance estimation strategy. A search space essentially limits the representation of the architecture in a given space. To find architectures in a defined search space, a search strategy must be employed to explore it. Some widely used search strategies for AAS are: Random Search, Bayesian Optimization kandasamy2018neural, Evolutionary Methods real2017large, xie2017genetic, liu2017hierarchical, real2018regularized, Reinforcement Learning zoph2016neural, baker2016designing, cai2018efficient, zhong2018practical, cai2018path, pham2018efficient and Gradient Based Methods brock2017smash, luo2018neural, liu2018darts. Finally, a performance estimation (usually error rate) is required to evaluate the explored architectures.

Currently, most AAS methods suffer from high computational cost while searching in a relatively small search space zoph2016neural, liu2017hierarchical, real2017large. Moreover, many architecture searches focus primarily on the hyper-parameter search while using architecture search spaces built around previously hand-crafted architectures such as DenseNet cai2018path, which have been shown to be vulnerable to adversarial attacks. Therefore, for robust architectures to be found, it is important to expand the search space beyond that of current AAS.

Smash brock2017smash uses a neural network to generate the weights of the main model. The main strength of this approach lies in avoiding the high computational cost incurred in other searches. However, this comes at the cost of not being able to tweak hyper-parameters which affect weights, such as initialisers and regularisers. DeepArchitect negrinho2017deeparchitect follows a hierarchical approach using various search algorithms such as Monte Carlo Tree Search and Sequential Model based Global Optimization (SMBO).

4 Searching for Robust Architectures

To search for robust architectures, a robustness evaluation (defined in Section 4.1) and a search algorithm must be defined. The search algorithm may be an AAS from the literature, provided that some modifications are made (Section 4.2). However, to allow for a wider search space, which is better suited to the problem, we also propose the Robust Architecture Search (Section 4.3).

4.1 Robustness Evaluation

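| Model   | Optimizer | $L_0$ th=1 | $L_0$ th=3 | $L_0$ th=5 | $L_0$ th=10 | $L_\infty$ th=1 | $L_\infty$ th=3 | $L_\infty$ th=5 | $L_\infty$ th=10 | Total |
|---------|-----------|------|------|------|------|------|------|------|------|-------|
| Capsnet | DE        | 18   | 46   | 45   | 47   | 5    | 9    | 12   | 24   | 206   |
| Capsnet | CMAES     | 14   | 34   | 45   | 62   | 9    | 38   | 74   | 98   | 374   |
| AT      | DE        | 23   | 59   | 63   | 66   | 0    | 2    | 3    | 6    | 222   |
| AT      | CMAES     | 20   | 50   | 70   | 82   | 3    | 12   | 25   | 57   | 319   |
| Resnet  | DE        | 23   | 66   | 75   | 77   | 6    | 22   | 46   | 78   | 393   |
| Resnet  | CMAES     | 11   | 49   | 63   | 77   | 28   | 72   | 75   | 83   | 458   |
| FS      | DE        | 21   | 73   | 78   | 78   | 4    | 21   | 45   | 78   | 398   |
| FS      | CMAES     | 17   | 49   | 69   | 78   | 26   | 63   | 66   | 74   | 442   |
| Total   |           | 147  | 426  | 508  | 567  | 81   | 239  | 346  | 498  | 2812  |

Table 1: Number of samples used from each type of attack to compose the adversarial samples. Based on the principle of the transferability of adversarial samples, these adversarial samples are used as a fast attack for the robustness evaluation of architectures. Details of the attacks are explained in vargas2019model.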

To evaluate the robustness of architectures we use a transferable type of attack. In other words, adversarial samples previously found by attacking other methods are stored and used as possible adversarial samples for the current model under evaluation. This addresses the problem that most attacks are too slow to be run inside a loop, which would make the search for architectures too expensive. Specifically, we use the adversarial samples from two types of attacks ($L_0$ and $L_\infty$ attacks) with two optimization algorithms (covariance matrix adaptation evolution strategy (CMAES) hansen2003reducing and differential evolution (DE) storn1997differential) over ResNet he2016deep and CapsNet sabour2017dynamic models as well as the adversarial training (AT) madry2017towards and feature squeezing (FS) xu2017feature defenses. Table 1 shows a summary of the number of images used from each type of attack, totaling 2812 adversarial samples. Attacks were done using the model agnostic dual quality assessment vargas2019model.

The evaluation procedure consists of calculating the number of successful adversarial samples divided by the total number of possible adversarial samples.

This also avoids problems with the differing amounts of perturbation necessary for attacks to succeed, which could otherwise make results incomparable.
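The evaluation described above can be sketched in a few lines. This is only an illustration under assumptions: `predict` stands for any callable that maps an input to a predicted label, and the stored samples are `(input, true_label)` pairs; neither name reflects the interface of the paper's code. The toy model and data exist only to make the example runnable.

```python
def adversarial_error_rate(predict, adversarial_samples):
    """Fraction of stored adversarial samples that fool the model.

    predict: callable mapping an input to a predicted label (assumed interface).
    adversarial_samples: list of (x_adv, true_label) pairs that were found
    earlier by attacking other models.
    """
    if not adversarial_samples:
        raise ValueError("need at least one adversarial sample")
    fooled = sum(1 for x_adv, label in adversarial_samples
                 if predict(x_adv) != label)
    return fooled / len(adversarial_samples)

def robustness(predict, adversarial_samples):
    """Accuracy on the adversarial samples, i.e. one minus the error rate."""
    return 1.0 - adversarial_error_rate(predict, adversarial_samples)

# Toy stand-in for a trained model: classifies a number by its sign.
def toy_predict(x):
    return int(x > 0)

stored = [(0.5, 1), (-0.2, 1), (-1.0, 0), (0.3, 0)]
rate = adversarial_error_rate(toy_predict, stored)
```

Because the samples are precomputed, each evaluation costs only one forward pass per stored sample, which is what makes it cheap enough to run inside a search loop.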

4.2 Robust Search Conversion of AAS

By changing the fitness function (in the case of evolutionary computation based AAS) or the reward function (in the case of reinforcement learning based AAS) it is possible to create robust search versions of AAS algorithms. In other words, it is possible to convert the search for accuracy into search for robustness and accuracy.

Here we use Smash and DeepArchitect for the tests. The reason for the choice lies in the difference of the methods and availability of the code. Both methods have their evaluation function modified to contain not only accuracy but also robustness (Section 4.1).
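As a rough illustration of this conversion, the sketch below wraps an accuracy-only evaluation into one that sums accuracy and robustness. The wrapper name `make_robust_fitness` and the dictionary layout are hypothetical. The numbers are the accuracies implied by the error rates reported in Table 2 (accuracy = 1 - error rate) and are used only to show how the combined score ranks candidates.

```python
def make_robust_fitness(accuracy_fn, robustness_fn):
    """Turn an accuracy-only evaluation into accuracy + robustness."""
    def fitness(candidate):
        return accuracy_fn(candidate) + robustness_fn(candidate)
    return fitness

# Accuracies derived from the error rates in Table 2.
candidates = {
    "DeepArchitect*": {"test_acc": 0.75, "adv_acc": 0.25},
    "Smash*": {"test_acc": 0.77, "adv_acc": 0.18},
    "RAS": {"test_acc": 0.82, "adv_acc": 0.58},
}

fitness = make_robust_fitness(
    lambda name: candidates[name]["test_acc"],  # original objective
    lambda name: candidates[name]["adv_acc"],   # accuracy on adversarial samples
)

# Best candidate first under the combined objective.
ranking = sorted(candidates, key=fitness, reverse=True)
```

In a reinforcement learning based search the same sum would be used as the reward instead of the fitness.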

4.3 Robust Architecture Search (RAS)

Here, we propose an evolutionary algorithm to search for robust architectures called Robust Architecture Search (RAS). If we consider that some of the most robust architectures might be unusual combinations not yet found or deeply explored, it makes sense to focus here on search spaces that allow such unusual layer types and their combinations to happen.

For such a huge search space to be searched more efficiently, we propose to use three subpopulations, allowing for the reuse of blocks and layers. Specifically, the subpopulations are:

  • Layer Population: raw layers (convolutional and fully connected) which make up the blocks.

  • Block Population: blocks which are a combination of layers.

  • Model Population: a population of architectures which consists of interconnected blocks.

Figure 1 illustrates the architecture.

Figure 1: Illustration of the proposed RAS structure with three subpopulations.
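One way to picture the three subpopulations is as three tables that reference each other by identifier, so that a layer or block can be reused by several parents. The sketch below is an assumed representation, not the data structure of the paper's code; all class and field names, and the example parameter values, are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Layer:
    kind: str                                        # "conv" or "dense"
    params: dict = field(default_factory=dict)       # e.g. kernel size or units

@dataclass
class Block:
    layer_ids: list                                  # keys into the layer population
    connections: list = field(default_factory=list)  # (src, dst) positions in layer_ids

@dataclass
class Model:
    block_ids: list                                  # keys into the block population
    connections: list = field(default_factory=list)  # (src, dst) positions in block_ids

@dataclass
class Populations:
    layers: dict = field(default_factory=dict)
    blocks: dict = field(default_factory=dict)
    models: dict = field(default_factory=dict)

pops = Populations()
pops.layers[0] = Layer("conv", {"kernel": 3})   # illustrative values only
pops.layers[1] = Layer("dense", {"units": 64})  # illustrative values only
pops.blocks[0] = Block(layer_ids=[0, 1], connections=[(0, 1)])
# Two models reuse the same block; the second one uses it twice.
pops.models[0] = Model(block_ids=[0])
pops.models[1] = Model(block_ids=[0, 0], connections=[(0, 1)])
```

Storing references rather than copies is what lets a change to one block affect every model that uses it.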

RAS Overview

RAS works by creating three initial populations (layer, block and model populations). Every generation, each member of the model population is modified five times by mutations. The modified members are added to the population as new members. Here we propose a utility evaluation in which layer and block populations are evaluated by the number of models (architectures) using them. Models are evaluated by their accuracy and attack resilience (accuracy on adversarial samples). At the end of each generation, all blocks and layers which are not used by any of the current members of the model population are removed. Moreover, architectures compete with similar ones in their own subpopulation, such that only the fittest of each subpopulation survives.
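The utility evaluation and the end-of-generation clean-up can be sketched as follows. This is a simplified illustration that assumes models are stored as lists of block identifiers and blocks as lists of layer identifiers; the function names and the toy populations are hypothetical.

```python
from collections import Counter

def utilities(models, blocks):
    """Count how many models use each block and each layer."""
    block_use = Counter()
    layer_use = Counter()
    for block_ids in models.values():
        used_blocks = set(block_ids)          # count a block once per model
        block_use.update(used_blocks)
        used_layers = set()
        for block_id in used_blocks:
            used_layers.update(blocks[block_id])
        layer_use.update(used_layers)         # count a layer once per model
    return block_use, layer_use

def prune_unused(models, blocks, layers):
    """Drop every block and layer that no current model uses."""
    block_use, layer_use = utilities(models, blocks)
    kept_blocks = {b: ls for b, ls in blocks.items() if block_use[b] > 0}
    kept_layers = {l: spec for l, spec in layers.items() if layer_use[l] > 0}
    return kept_blocks, kept_layers

layers = {0: "conv", 1: "dense", 2: "conv", 3: "dense"}
blocks = {"A": [0, 1], "B": [1, 2], "C": [3]}
models = {"m1": ["A", "B"], "m2": ["B"]}

block_use, layer_use = utilities(models, blocks)
kept_blocks, kept_layers = prune_unused(models, blocks, layers)
```

In this toy case block `C` and layer `3` are removed because no model references them, while block `B` has the highest utility since both models use it.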

The initial population consists of random architectures which contain blocks made up of layers, in which is a uniform random distribution with minimum and maximum values. The possible available parameters for the layers are as follows: for convolutional layers, filter size might be , , or , stride size may be or and kernel size is either , or ; for fully connected layers, the unit size may assume the values of , , or . All the layers use the Rectified Linear Unit (ReLU) as activation function and are followed by a batch-normalization layer.

Fitness of an individual of the model population is measured using the final validation accuracy of the model after training for a maximum of epochs with early stopping if accuracy or validation accuracy do not change more than in the span of epochs. Regarding the fitness calculation, the fitness is calculated as the accuracy of the model plus the robustness of the model (). The robustness of the architecture is calculated as described in Section 4.1. The accuracy of the architecture is calculated after the model is trained over the whole set of samples ( samples) of the CIFAR’s training dataset (for every generations) or over random samples of the CIFAR training dataset (for all other generations). This allows an efficient evolution to happen in which blocks and layers evolve at a faster rate without interfering with the architecture’s accuracy.

To keep a high amount of diversity while searching in a huge search space, the algorithm uses two techniques from neuroevolution called novelty map population and spectrum-diversity vargas2017spectrum . Therefore, the population is divided into subpopulations with similar spectrum (spectrum is a histogram of features of a model defined by the designer) building up a novelty map population. Here we use the spectrum as a histogram containing the features: number of blocks, number of total layers, number of block connections, number of total layer connections, number of dense layers, number of convolution layers, number of dense to dense connections, number of dense to convolution connections, number of convolution to dense connections and number of convolution to convolution connections.
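To make the spectrum concrete, the sketch below computes the ten listed features for a toy architecture and assigns it to the closest subpopulation. Two simplifications are assumed here: layer connections are only counted inside blocks, and the grouping is a plain nearest-prototype rule standing in for the novelty map of vargas2017spectrum, whose actual update rule is not reproduced. The model layout and all names are hypothetical.

```python
PAIRS = [("dense", "dense"), ("dense", "conv"), ("conv", "dense"), ("conv", "conv")]

def spectrum(model):
    """Histogram of the ten structural features of an architecture.

    model: dict with 'blocks' (list of blocks) and 'block_connections'
    (list of pairs). Each block is a dict with 'layers' (list of "conv" or
    "dense") and 'connections' (list of (src, dst) layer positions).
    """
    blocks = model["blocks"]
    kinds = [kind for block in blocks for kind in block["layers"]]
    pair_counts = {pair: 0 for pair in PAIRS}
    for block in blocks:
        for src, dst in block["connections"]:
            pair_counts[(block["layers"][src], block["layers"][dst])] += 1
    return [
        len(blocks),                      # number of blocks
        len(kinds),                       # number of total layers
        len(model["block_connections"]),  # number of block connections
        sum(pair_counts.values()),        # number of total layer connections
        kinds.count("dense"),             # number of dense layers
        kinds.count("conv"),              # number of convolution layers
    ] + [pair_counts[pair] for pair in PAIRS]  # dense-dense, dense-conv, conv-dense, conv-conv

def nearest_subpopulation(spec, prototypes):
    """Index of the prototype spectrum closest to spec (squared Euclidean)."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(prototypes)), key=lambda i: dist(spec, prototypes[i]))

toy_model = {
    "blocks": [
        {"layers": ["conv", "dense", "conv"], "connections": [(0, 1), (1, 2)]},
        {"layers": ["conv", "conv"], "connections": [(0, 1)]},
    ],
    "block_connections": [(0, 1)],
}
spec = spectrum(toy_model)
prototypes = [
    [1, 2, 0, 1, 1, 1, 0, 0, 1, 0],
    [2, 5, 1, 3, 0, 5, 0, 0, 0, 3],
]
group = nearest_subpopulation(spec, prototypes)
```

Architectures that land in the same group compete only with each other, which keeps structurally different candidates alive even when their fitness is temporarily lower.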

Regarding the mutation operators used to evolve the architecture, they can be divided into layer, block and model mutations which can only be applied to respectively layer, block and model populations. The following paragraphs define the possible mutations.

Layer Mutation

Layer mutations are of the following types: a) change kernel - changes the kernel size of the layer, b) change filter - changes the filter size of the layer, c) change units - changes the unit size of the layer, d) swap layers - the chosen layer is swapped with a random layer from the layer population.

Block Mutation

Block mutations change a single block in the block population. The possibilities are: e) add layer - a random layer is added to the block, f) remove layer - a random layer is removed from the block, g) add layer connection - a random connection between two layers from the chosen block is added, h) remove layer connection - a random connection between two layers from the chosen block is removed, i) swap blocks - the chosen block is swapped with a random block from the population.

Model Mutation

Model mutations modify a given architecture. The possible model mutations are: j) add block - a random block is added to the model, k) remove block - a random block is removed from the model, l) add block connection - a random connection between two blocks is added, m) remove block connection - a random connection between two blocks is removed.

All mutations add a new member to the population instead of substituting the previous one. In this manner, if nothing is done, the population of layers and blocks may explode, increasing the number of lesser quality layers and blocks. This would cause the probability of choosing good layers and blocks to decrease. To avoid this, when the layer or block population exceeds 50 individuals, the only layer/block mutation available is swap layers/blocks.
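The population cap can be sketched as a filter on the set of available mutations. The mutation names and the limit of 50 come from the text above; the uniform random choice among the available mutations is an assumption, since the selection probabilities are not stated, and the function names are hypothetical.

```python
import random

LAYER_MUTATIONS = ["change kernel", "change filter", "change units", "swap layers"]
BLOCK_MUTATIONS = ["add layer", "remove layer", "add layer connection",
                   "remove layer connection", "swap blocks"]
POPULATION_CAP = 50  # limit stated in the text

def available_mutations(kind, population_size):
    """Mutations allowed for a layer or block population of the given size."""
    if kind == "layer":
        all_mutations, swap = LAYER_MUTATIONS, "swap layers"
    elif kind == "block":
        all_mutations, swap = BLOCK_MUTATIONS, "swap blocks"
    else:
        raise ValueError("kind must be 'layer' or 'block'")
    if population_size > POPULATION_CAP:
        # Swapping reuses an existing member instead of creating a new one.
        return [swap]
    return list(all_mutations)

def pick_mutation(kind, population_size, rng=random):
    """Choose one allowed mutation uniformly at random (assumed policy)."""
    return rng.choice(available_mutations(kind, population_size))

small = available_mutations("layer", 50)
large = available_mutations("block", 51)
```

Once a population grows past the cap, only swaps remain, so growth stops until the end-of-generation removal of unused members shrinks it again.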

5 Experiments on RAS and Converted AAS

Here experiments are conducted on both the proposed RAS and converted versions of DeepArchitect and Smash. The objective is to achieve the highest robustness possible using different types of architecture search algorithms and to compare their results and effectiveness. Originally, DeepArchitect and Smash found architectures which had error rates of 11% and 4% respectively when the fitness was based only on the DNN's testing accuracy (Table 3). However, when the accuracy on adversarial samples is included in the function evaluation, the final error rate rises to 25% and 23% respectively (Table 2). This may also indicate that poisoning the dataset might cause a strong decrease in accuracy for the architectures found by Smash and DeepArchitect. In the case of RAS, even with a larger search space, an error rate of 18% is achieved.

| Architecture Search                      | Testing ER | ER on Adversarial Samples |
|------------------------------------------|------------|---------------------------|
| DeepArchitect* negrinho2017deeparchitect | 25%        | 75%                       |
| Smash* brock2017smash                    | 23%        | 82%                       |
| Ours                                     | 18%        | 42%                       |

Table 2: Error Rate (ER) on both the testing dataset and adversarial samples when the evaluation function has both accuracy on the testing data and accuracy on the adversarial samples. *Both DeepArchitect and Smash had their evaluation function modified to be the sum of accuracy on the testing and adversarial samples.

Regarding the robustness of the architectures found, Table 2 shows that the final architectures found by DeepArchitect and Smash were very susceptible to attacks, with error rates on adversarial samples of 75% and 82% respectively. Despite the inclusion of robustness (measured as accuracy on adversarial samples) in the function evaluation, the searches were still unable to find a robust architecture. This might be a consequence of the relatively small search space used and more focused initialization procedures.

In contrast, the proposed method (RAS) finds an architecture which has an error rate of only 42% on adversarial samples. This is an improvement in error rate similar to the ones achieved by networks trained with adversarial training. Note, however, that in the case of the evolved architecture, this is an inherent property of the architecture found. The architecture is inherently robust without any kind of special training or defense such as adversarial training (i.e., the architecture was only trained on the training dataset). In fact, the addition of defenses would be expected to increase its robustness further.

6 Analyzing the Final Architecture: Searching for the Key to Inherent Robustness

RAS found an architecture that possesses an inherent robustness capable of rivaling current defenses. To investigate the reason behind this robustness we can take a deeper look at the architecture found.

Figure 2: Two fragments of the evolved architecture which has peculiar traits.

Figures 2 and 3 show some peculiarities of the evolved architecture: multiple bottlenecks, projections into high-dimensional space and paths with different constraints.

Multiple Bottlenecks and Projections into High-Dimensional Space

The first peculiarity is the use of Dense layers in-between Convolutional ones. This might seem like a bottleneck similar to the ones used in variational autoencoders. However, it is actually the opposite of a bottleneck (Figure 3): it is a projection into high-dimensional space. The evolved architecture mostly uses a low number of filters, while in some parts of it high-dimensional projections exist. In the whole architecture, four Dense layers in-between Convolutional ones were used and all of them project into a higher-dimensional space. This is an application of Cover's Theorem, which states that a training set nonlinearly projected into a higher-dimensional space is more likely to be linearly separable cover1965geometrical.
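The intuition behind Cover's Theorem can be illustrated with the classic XOR problem, which is a textbook example and not part of the paper's experiments. XOR is not linearly separable in two dimensions, but it becomes separable after a nonlinear projection into three dimensions. The sketch below searches a small grid of weights for a separating hyperplane; the grid, the product feature used for the projection and all names are choices made for this illustration.

```python
from itertools import product

XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def separates(weights, bias, points):
    """True if the sign of w.x + b matches the label for every point."""
    for x, label in points:
        score = sum(w * xi for w, xi in zip(weights, x)) + bias
        if (score > 0) != (label == 1):
            return False
    return True

def find_separator(points, dim, grid=(-2, -1, 0, 1, 2),
                   biases=(-1.5, -0.5, 0.5, 1.5)):
    """Brute-force search for a linear separator over a small weight grid."""
    for weights in product(grid, repeat=dim):
        for bias in biases:
            if separates(weights, bias, points):
                return weights, bias
    return None

def lift(x):
    """Project a 2D point into 3D by appending the product feature x1 * x2."""
    return (x[0], x[1], x[0] * x[1])

lifted = [(lift(x), label) for x, label in XOR]
in_2d = find_separator(XOR, dim=2)     # no separator exists for XOR in 2D
in_3d = find_separator(lifted, dim=3)  # a separating plane exists after lifting
```

The failed search in 2D only covers the chosen grid, but it agrees with the known fact that no line separates XOR; in 3D a plane such as weights `(1, 1, -2)` with bias `-0.5` does.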

Figure 3: Deeper look at the two fragments from Figure 2, showing the size of input and output for each of the layers. The top fragment corresponds to the left fragment of Figure 2 and the bottom one corresponds to the right fragment.

Paths with Different Constraints

The second peculiarity is the use of multiple paths with different numbers of filters and output sizes after high-dimensional projections. Notice how the number of filters differs in each of the Convolutional layers in these paths. This means there are different constraints over the learning in each of these paths, which should foster different types of features. Therefore, this is a multi-bottleneck structure forcing the learning of different sets of features, which are now easily constructed from the previous high-dimensional projection.

It is interesting to note that although RAS could have evolved architectures with multiplication and addition, none of these survived into the last generations. This suggests that combinations of these features are either not robust or are hard to develop into a robust architecture. Connections from Dense to Convolutional layers and vice-versa, on the contrary, are used in most if not all of the best models.

7 Analyzing RAS

Figure 4 shows that the mean accuracy of the evolved architectures increases over time. This pattern of behavior is typical of evolutionary algorithms, showing that the evolution is happening as expected.

In Figure 5, the overall characteristics of the evolved architectures throughout the generations are shown. The average number of blocks and the connections between them increase over the generations. However, the average number of layers never reaches the same complexity as the initial models. The number of layers decreases steeply initially while slowly increasing afterwards. Therefore, the overall behavior is that blocks become smaller and more numerous. A consequence of this is that the number of connections becomes proportional to the number of connections between blocks and therefore exhibits similar behavior. The average number of layers per block and average number of connections show little change, varying only and respectively.

Figure 4: Accuracy improvement over the generations.
Figure 5: Overall distribution of the architectures found in each generation. The connections from the input layer and the softmax layer are always present and, therefore, they are omitted in the calculation.

8 Conclusions

Automatic search for robust architectures is proposed as a paradigm for developing and researching robust models. This paradigm is based on using adversarial attacks together with error rate as evaluation functions in AAS. Experiments using this paradigm with some of the current AAS had poor results. This is attributed to the small search space used by current methods. Here, we propose the RAS method, which has a wider search space including connections from dense to convolutional layers and vice-versa, multiplication, addition and concatenation. Results with RAS showed that novel architectures which are inherently robust exist. In fact, the evolved architecture achieved robust results comparable with state-of-the-art defenses while not having any special training or defense. In other words, the evolved architecture is inherently robust.

Moreover, investigating the reasons behind such robustness has shown that some peculiar traits are present. The evolved architecture has overall a low number of filters and many bottlenecks. Multiple projections into high-dimensional space are present, possibly to facilitate the separation of features (Cover's Theorem). It also uses multiple paths with different constraints after the high-dimensional projection, which causes a diverse set of features to be learned by the network.

Thus, in the search space of DNNs more robust architectures exist and more research is required to find and fully document them as well as their features.


This work was supported by JST, ACT-I Grant Number JP-50166, Japan. Additionally, we would like to thank Prof. Junichi Murata for the kind support without which it would not be possible to conduct this research.


Supplementary Work

Appendix A Full Plots of the Evolved Architecture

Figures 6 and 7 show the complete structure of the architecture evolved by RAS. Figures 8 and 9 give another perspective of the same plot, with different details available such as the input and output size of each layer.

Figure 6: Full plot of the evolved architecture, showing most of the parameters (continued in Figure 7).
Figure 7: Full plot of the evolved architecture, showing most of the parameters (continuation of Figure 6).
Figure 8: Full plot of the evolved architecture, with the input and output shapes shown (continued in Figure 9).
Figure 9: Full plot of the evolved architecture, with the input and output shapes shown (continuation of Figure 8).

Appendix B All Convolutional RAS Variation

We tried a variation of RAS in which only combinations of convolutional layers were allowed. The results are reported in Table 3. As expected, the results were not as promising (worse error rates and deeper architectures) as the ones using a wider search space and therefore we skipped the rest of the tests. Table 3 also shows the error rates of RAS when using only test accuracy as the function evaluation.

| Algorithm         | Robustness Evaluation: Test ER (%) | Robustness Evaluation: Adversarial ER (%) | Original Evaluation: Test ER (%) | Original Evaluation: Adversarial ER (%) |
|-------------------|------|------|------|------|
| DeepArchitect     | 25   | 75   | 11   | 79   |
| Smash             | 23   | 82   | 4    | 89   |
| Ours (Dense+Conv) | 18   | 42   | 17   | 47   |
| Ours (Only Conv)  | 25   | 45   |      |      |

Table 3: The function evaluation for the original one is set to be equal to the accuracy of the classifier, while the robustness one uses the accuracy summed with the accuracy over the adversarial samples. Test ER stands for error rate on the testing dataset. Adversarial ER stands for error rate on the adversarial samples.