RNN-Test: Adversarial Testing Framework for Recurrent Neural Network Systems

11/11/2019 ∙ by Jianmin Guo, et al. ∙ 0

While huge efforts have been invested in the adversarial testing of convolutional neural networks (CNN), the testing of recurrent neural networks (RNN) is still limited to the classification context and leaves threats for vast sequential application domains. In this work, we propose a generic adversarial testing framework, RNN-Test. First, based on the distinctive structure of RNNs, we define three novel coverage metrics to measure the testing completeness and guide the generation of adversarial inputs. Second, we propose the state inconsistency orientation to generate the perturbations by maximizing the inconsistency of the hidden states of RNN cells. Finally, we combine orientations with coverage guidance to produce minute perturbations. Given the RNN model and the sequential inputs, RNN-Test modifies one character or one word out of the whole input based on the perturbations obtained, so as to lead the RNN to produce wrong outputs. For evaluation, we apply RNN-Test to two models of common RNN structure: the PTB language model and the spell checker model. RNN-Test efficiently reduces the performance of the PTB language model by increasing its test perplexity by 58.11% on average, and triggers erroneous behaviors of the spell checker model with a success rate of 73.44% on average. With our customization, RNN-Test using the redefined neuron coverage as guidance achieves 35.71% higher perplexity than the random strategy of DeepXplore.




I Introduction

As the core part of current artificial intelligence applications, deep learning has made great breakthroughs in computer vision[47, 22], natural language processing[8] and speech recognition[19, 2]. With the increasing deployment of deep neural network (DNN) systems in security-critical domains, such as automated driving[4] and medical diagnosis[43], ensuring the robustness of DNNs has become essential.

However, it has been demonstrated that state-of-the-art DNNs can produce completely different predictions when fed with adversarial inputs[48]. This inspired numerous adversarial testing works devoted to generating adversarial inputs for DNNs, aiming to provide rich sources for training the DNNs to be more robust. The majority of these works[13, 35, 33] mutate the inputs based on perturbations obtained by gradient descent. They exhibit high efficiency in generating adversarial inputs, but achieve low testing completeness[38]. In recent years, multiple coverage criteria[38, 46, 29] have been proposed to measure the coverage achieved at various granularities during testing. Their authors believe that reaching higher coverage increases confidence in the reliability of the DNN.

In spite of their effectiveness, these works are largely limited to CNNs. Overall, there are two main types of DNNs: convolutional neural networks (CNN)[26] and recurrent neural networks (RNN)[39]. They have different structures and are suited to different kinds of tasks. CNNs add convolution and pooling layers to the traditional fully connected DNNs, and perform excellently in image processing applications[44, 18]. RNNs are known for their iterative structure and their support for temporal information, and hence are good at handling tasks with sequential data, like natural language processing[32] and speech recognition[14]. Owing to the gap between their structures, adversarial testing techniques designed for one type of DNN are hard to apply to the other.

Adversarial testing for RNNs faces three challenges. First, without an obvious class label there is no rule to recognize an adversarial input. For sequential outputs that are not subsequently used for classification, there is no standard for deciding how much an output must change to count as wrong. Second, when mutating sequential inputs like texts, it is hard to keep the perturbation minute. Applying perturbations to words in a discrete space usually does not yield a legal input, and an explicit modification is distinguishable by humans. Third, the existing neuron-based coverage metrics for CNNs fail to consider the characteristics of RNN structures and cannot be adopted directly.

Benefiting from the simpler adaptation of works on CNNs, the existing works on RNNs have also applied adversarial testing to classification domains. They perform well in specific tasks, such as sentiment analysis of texts[42, 40] and email classification[23]. For text inputs, most works add, modify or delete a word or character to keep the alteration minute, as a way to address the second challenge. But adversarial testing for RNNs in the main sequential domains is left out or rather inadequate[36, 34], with the first challenge not well addressed. Besides that, the coverage metrics defined for DNNs are also based upon CNNs, which have thousands of neurons activated by the ReLU activation function. In contrast, RNNs have significantly fewer states, activated by sigmoid and tanh, with completely different value ranges. But this key issue is neglected by the relevant works[10] on RNNs.

In this paper, we propose RNN-Test, a generic adversarial testing framework for recurrent neural networks that is not limited to specific tasks. First, we define three coverage metrics targeting the particular computation logic of RNNs. Then, RNN-Test adopts the joint optimization of maximizing the adversary orientations and boosting the coverage, which allows the perturbations to be obtained in a gradient-based way. In the adversary orientation module, we propose the state inconsistency orientation, which maximizes the inconsistency of the hidden states and leads the model to produce wrong outputs, alongside the cost orientation adapted from FGSM[13] and the decision boundary orientation adapted from DLFuzz[16]. In the coverage boosting module, we are the first to employ coverage guidance to obtain the perturbations, rather than using coverage only as an indicator of testing completeness and a goal to improve. Note that we keep the perturbation of only one word or character out of the whole input, thus ensuring a tiny modification. Finally, we address the first challenge by leveraging the performance metrics of the tested models to assess the quality of the adversarial inputs.

For evaluation, we select the PTB language model[50] and a spell checker model[1] for their general structures and common applications, and implement a customized version of the neuron coverage in DeepXplore[38] for comparison. On the PTB model, RNN-Test demonstrates its effectiveness in adversarial input generation by increasing the test perplexity by 58.11% on average, where the state inconsistency orientation degrades the model performance most among the three orientations. With coverage boosting, the redefined coverage of DeepXplore as guidance achieves 35.71% higher perplexity than the random strategy of DeepXplore. Furthermore, we retrain the model using the training set augmented with adversarial inputs and improve it by 1.159%. On the spell checker model, the adversarial inputs make the corrected mistakes emerge again with a success rate of 73.44% on average. Remarkably, the coverage guidance achieves the highest success rate of 74.29% with the aid of the boosting procedure.

To summarize, our work has the following contributions:

  • We define three coverage metrics customized for RNNs and are the first to exploit the coverage boosting procedure to directly generate adversarial inputs. During experiments, we found no linear correlation between the coverage value and the quality of adversarial inputs, so more effort should be devoted to improving the quality of inputs rather than the coverage value.

  • We propose the state inconsistency orientation to lead the tested RNN models to behave worse, which is also effective for adversarial input generation.

  • We design, implement and evaluate the generic adversarial testing framework RNN-Test, which scales to variants of RNNs without being limited to particular application contexts and supports free combinations of orientations and coverage metrics.

  • We demonstrate the effectiveness of RNN-Test on two RNN models. RNN-Test could efficiently generate adversarial inputs and improve the PTB model by retraining with the augmented training set.

We organize this paper as follows. Section II provides the background on RNNs. Section III formally describes the design of RNN-Test in detail. Section IV presents the evaluation results of RNN-Test. Section V discusses the threats to the validity of the work. Section VI introduces the related work. Section VII concludes.

II Background

II-A Deep Neural Network

From the perspective of biology, artificial neural networks were initially designed to imitate the structure of biological neurons with an activation process. Deep neural networks differ from shallow neural networks in having more hidden layers to perform complex computation. A fully connected network requires each neuron to connect to all neurons in adjacent layers. Fig. 1(a) shows the structure of a traditional DNN and Fig. 1(b) a typical neuron of a DNN. In traditional DNNs, data flows from the input layer through the hidden layers to the output layer. CNNs keep this feed-forward structure and introduce convolution and pooling layers to better extract the features of the inputs, which are mostly images. Note that the activation function used in CNNs is usually ReLU, which keeps positive output values unchanged and maps all other values to 0, and therefore has no upper bound.

(a) Fully connected network
(b) Typical neuron
Fig. 1: Typical DNN structure

II-B Recurrent Neural Network and the variants

RNNs are widely applied in temporal sequence analysis. In traditional deep neural networks like CNNs, neurons in adjacent layers are fully connected while neurons within the same layer have no explicit relations. As a result, CNNs cannot convey the context information surrounding the input data very well.

An example of such a task is predicting the next word given the preceding sentence. Because of the semantic relationships between words in a sentence, we have to take the sequence of previous words into consideration to predict the next word. Fig. 2 depicts the typical RNN structure and formula (1) summarizes the computation process of an RNN cell. The hidden state output $h_t^l$ of the cell at time step $t$ in layer $l$ is determined by the current input $h_t^{l-1}$ from the previous layer as well as the hidden state $h_{t-1}^l$ from the previous step in the same layer, and is then passed forward to compute the softmax predictions. Consequently, an RNN is able to represent the context information in temporal sequences, which makes it appropriate for natural language processing tasks. Besides this key design, common RNNs comprise two or three layers, each with several states when unfolded, far fewer than CNNs, which usually have more than ten layers, each with hundreds of neurons. Moreover, the activation functions sigmoid and tanh are commonly used and are important in our definitions of coverage metrics.
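For orientation, the computation described in this paragraph is commonly written as below. This is a sketch of the standard RNN cell, not necessarily the exact form of formula (1): the weight matrix $W^l$, the bias $b^l$ and the index notation are assumptions introduced here for illustration.

```latex
h_t^{l} = f\!\left(W^{l}\,\bigl[\,h_t^{l-1};\; h_{t-1}^{l}\,\bigr] + b^{l}\right),
\qquad f \in \{\mathrm{sigmoid},\ \tanh\}
```

Here $h_t^{l-1}$ is the input arriving from the layer below at the same time step and $h_{t-1}^{l}$ is the state carried over from the previous time step in the same layer.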

Fig. 2: RNN structure

Variants of RNNs. Although RNNs can solve most tasks involving time series, they still encounter the problem of vanishing or exploding gradients, which makes the network unable to learn dependencies across long time spans. LSTM (long short-term memory)[21, 51] and GRU[7] introduce a gate mechanism that allows RNNs to learn context information from more distant time steps. Both are now widely applied in the related tasks.

(a) LSTM structure
(b) LSTM cell
Fig. 3: LSTM structure.

Taking LSTM as an example, its structure and inner cell are given in Fig. 3, along with the computation process in formula (2). Cell states and four gates participate in the computation: the input gate, forget gate, output gate and new input gate, which decide the flow and weights of different parts of the inputs.
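As a reference point, the widely used LSTM cell equations are sketched below. They are the standard textbook form and may differ in notation from formula (2); the gate symbols $i, f, o, g$, the weights $W_\ast$ and the biases $b_\ast$ are assumptions made for this illustration.

```latex
\begin{aligned}
z &= \bigl[\,h_t^{l-1};\; h_{t-1}^{l}\,\bigr] \\
i &= \sigma(W_i z + b_i), \quad
f = \sigma(W_f z + b_f), \quad
o = \sigma(W_o z + b_o), \quad
g = \tanh(W_g z + b_g) \\
c_t^{l} &= f \odot c_{t-1}^{l} + i \odot g \\
h_t^{l} &= o \odot \tanh\!\bigl(c_t^{l}\bigr)
\end{aligned}
```

In this form the input, forget and output gates are activated by sigmoid, while the new input gate and the cell state feeding the hidden state are activated by tanh, which is why the coverage metrics later treat the two value ranges separately.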

III RNN-Test Approach

III-A RNN-Test Overview

The overall workflow of RNN-Test is depicted in Fig. 4. The workflow is not restricted to the classification context but universally applicable for both the sequential and classification contexts.

RNN-Test relies on three core modules to generate adversarial inputs for recurrent neural networks, which are RNN wrapper, adversary orientation maximizing and coverage boosting. RNN wrapper extracts the hidden states and cell states of each RNN cell in the given RNN, without affecting its inherent process. The states obtained are crucial for adversarial input generation and utilized in both the other two modules. Additionally, our coverage metrics defined based on the states are given in § III-B and § III-C.

In the module of adversary orientation maximizing, RNN-Test integrates three orientation methods, including our proposed state inconsistency orientation and two orientation methods adapted from other works[13, 16] performing well in CNN testing, described in § III-E. These methods search the adversarial inputs by maximizing the orientations designed to lead the RNN to expose wrong behaviours. Meanwhile, the module coverage boosting aims to generate adversarial inputs by searching in the uncovered space of RNNs, referring to § III-F. The orientation methods and coverage guidances in the two modules are free to be integrated together, thus allowing RNN-Test to explore better means.

Finally, the integrated modules will produce a joint objective. Maximizing the objective by gradient ascent could obtain the perturbation to modify the test input. Here we just randomly modify one word or character out of the whole sequential input, ensuring the modification is little enough to maintain the original semantic meaning. As the words and characters are in a discrete embedding space, the minute perturbation applied to the test input probably will not lead to a legal input. We adopt the nearest embedding as the adversarial input after iteratively scaling the perturbation.

Fig. 4: Architecture of RNN-Test

III-B Key insights for coverage metrics

Insights for state coverage. Based on the distinctive structure of RNNs, the outputs of each RNN cell are hidden states, denoted as $h$, which are vectors. For an LSTM cell, the outputs also incorporate cell states, denoted as $c$, which are vectors of the same shape as $h$. The computation procedure is illustrated in § II.

In the procedure, the hidden states play a key role for the prediction, used to map to the prediction results. For one input, if a specific hidden state has the maximum value of the RNN cell outputs, the probabilities of its mapping part of the prediction result tend to be higher as well. Thus, combinations of hidden states lead to the varying prediction results. As covering the permutations of each hidden state of the RNN cell is extremely time-consuming, covering all the maximum hidden states is a feasible solution. The definition for hidden state coverage is given in formula (3).

In the LSTM structure, cell states are activated by the tanh function and then used to compute the hidden states of the same cell. As shown in Fig. 5(a), the output value of the tanh function ranges from $-1$ to $1$. According to our statistics, the activation values of cell states mostly fall into the central range while few reach the boundary values. Hence we can measure the coverage over different ranges, but using as many sections as DeepGauge[29] would be meaningless given the narrow value range. We split the range into a small number of sections whose endpoints are chosen according to the tanh distribution, marked as dots in Fig. 5(a). We suppose that covering more of each section, especially the boundary sections, exercises more computation logic.

(a) tanh
(b) sigmoid
Fig. 5: Graph of the activation functions, with end points of sections as dots.

Insights for gate coverage. Multiple gates are a prominent characteristic of the GRU and LSTM models. In the implementation, the gates are split from the concatenated hidden states $h_t^{l-1}$ and $h_{t-1}^l$, as in formula (2). After activation, they are used for computing $c_t^l$ and $h_t^l$.

Similar to the activation values of cell states, the activation values of each gate also mainly fall in the central range. We employ the same mechanism to compute the gate coverage, by first splitting the value range into several sections and then recording the coverage of each section. The sections for gates activated by tanh are the same as above, and the sections for the other gates, activated by sigmoid, are likewise separated based on its distribution, as shown in Fig. 5(b). (Footnote 1: The two sections take these values in the boosting procedure, but the boundary sections are wider when recording coverage, which is convenient for evaluation.)

III-C Coverage definitions

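Hidden state coverage. Assume all the hidden states of an RNN model are represented by $H$. $H$ is a matrix of shape $(T, L, B, E)$, where $T$, $L$ and $B$ are the numbers of time steps, layers and batch entries, respectively, and $E$ is the state size. The shape of $H$ varies among RNN models, but $T$, $L$, $B$ and $E$ are the necessary components. Note that modern DNNs usually process inputs in batches to accelerate the computation.

For a specific hidden state vector $h = H^{t,l,b}$, where $t \in \{1,\dots,T\}$, $l \in \{1,\dots,L\}$ and $b \in \{1,\dots,B\}$, if a state $e \in h$ and $e = \max(h)$, then the state $e$ is covered. Thus, the hidden state coverage is computed as in formula (3).

$$\mathrm{HS\_C} = \frac{|\{e \mid e \in h,\ h \in H,\ e = \max(h)\}|}{|H|} \qquad (3)$$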
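The definition above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `hidden_state_coverage`, the dictionary layout keyed by `(time_step, layer)`, and the accumulation of covered states across inputs are all hypothetical choices.

```python
def hidden_state_coverage(hidden_vectors, covered=None):
    """Fraction of hidden states that have been the maximum of their vector.

    hidden_vectors: dict mapping a cell key, e.g. (time_step, layer), to the
    list of hidden-state values output by that cell for one input (assumed layout).
    covered: set of (cell key, state index) pairs accumulated over earlier inputs.
    """
    covered = set() if covered is None else covered
    total = 0
    for key, vec in hidden_vectors.items():
        total += len(vec)
        # The state holding the maximum value of this cell's vector is covered.
        best = max(range(len(vec)), key=vec.__getitem__)
        covered.add((key, best))
    return len(covered) / total, covered


first = {(0, 0): [0.1, 0.9, -0.3], (1, 0): [0.5, 0.2, 0.4]}
second = {(0, 0): [0.8, 0.1, 0.0], (1, 0): [0.9, 0.1, 0.2]}
cov1, seen = hidden_state_coverage(first)
cov2, seen = hidden_state_coverage(second, seen)
```

Feeding more inputs can only grow the covered set, so the coverage value is monotonically non-decreasing over a test run.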


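Cell state coverage. All the cell states of an RNN model are denoted as $C$, which is also a matrix of shape $(T, L, B, E)$. The value range of the activation function tanh is split into $S$ sections and each section is represented as $sec_s$, where $s \in \{1,\dots,S\}$. For a specific cell state vector $c$, if a state $e \in c$ and $\tanh(e) \in sec_s$, then $e$ is covered in $sec_s$. The cell state coverage in each section is given in formula (4).

$$\mathrm{CS\_C}(sec_s) = \frac{|\{e \mid e \in c,\ c \in C,\ \tanh(e) \in sec_s\}|}{|C|} \qquad (4)$$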
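A section-based coverage of this kind can be sketched as follows. The section endpoints used in the example are hypothetical placeholders, and the function name `section_coverage` and the flat list of states are assumptions made for illustration. Passing a sigmoid function as `act` would give the analogous computation for gates.

```python
import math


def section_coverage(states, sections, act=math.tanh, covered=None):
    """Per-section coverage: fraction of states whose activation fell in each section.

    states: flat list of pre-activation state values (assumed layout).
    sections: list of (low, high) pairs partitioning the activation range.
    covered: list of sets of state indices, one per section, from earlier inputs.
    """
    if covered is None:
        covered = [set() for _ in sections]
    for idx, value in enumerate(states):
        a = act(value)
        for s, (low, high) in enumerate(sections):
            is_last = s == len(sections) - 1
            # Half-open sections, except that the last one includes its upper end.
            if low <= a < high or (is_last and a == high):
                covered[s].add(idx)
    return [len(c) / len(states) for c in covered], covered


# Hypothetical endpoints: two boundary sections and two central sections.
example_sections = [(-1.0, -0.8), (-0.8, 0.0), (0.0, 0.8), (0.8, 1.0)]
ratios, seen = section_coverage([-2.0, -0.1, 0.3, 3.0], example_sections)
```

Tracking each section separately makes it visible when the boundary sections remain uncovered even though the central ones are saturated.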


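Gate coverage. All the states that are utilized to compute the gates of the RNN model are represented by $G$. The states for each type of gate $g$ are denoted as $G_g$, with one type for each of the four gates of the LSTM model in § II-B. $G_g$ is a matrix of shape $(T, L, B, E_g)$, where $E_g$ is the state size for gate $g$. If a state $e \in G_g$ and $act_g(e) \in sec_s$, where $act_g$ is the activation function for gate $g$, then $e$ is covered in $sec_s$. Thus, gate coverage is computed as in formula (5).

$$\mathrm{GC}_g(sec_s) = \frac{|\{e \mid e \in G_g,\ act_g(e) \in sec_s\}|}{|G_g|} \qquad (5)$$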


DX coverage. We also customize the neuron coverage in DeepXplore, owing to the great difference between the traditional DNN structure and the RNN structure. For CNNs, DeepXplore treats each feature map (the output of a convolution layer, a matrix of hundreds of values) as a neuron and takes its mean value as the output. If we did the same as DeepXplore, the hidden states of each cell would be treated as one neuron. A common RNN like the PTB model would then consist of fewer than 100 neurons, about as many as a single CNN layer owns, and the coverage value would reach 100% with just a few inputs. So we regard each hidden state as a neuron.

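For all the hidden states $H$ and a state $e \in H$, if its output value after min-max normalization satisfies $out(e) > th$, where $th$ is the user-defined threshold, then $e$ is covered, as in formula (6).

$$\mathrm{DX\_C} = \frac{|\{e \mid e \in H,\ out(e) > th\}|}{|H|} \qquad (6)$$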
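A threshold-based coverage with min-max normalization can be sketched as below. This is an illustration only: the function name `dx_coverage`, the default threshold of 0.5, and normalizing over one flat list of states (rather than per layer or per cell) are assumptions.

```python
def dx_coverage(states, threshold=0.5, covered=None):
    """Fraction of states whose min-max normalized value exceeds the threshold.

    states: flat list of hidden-state values for one input (assumed layout).
    covered: set of state indices accumulated over earlier inputs.
    """
    covered = set() if covered is None else covered
    lowest, highest = min(states), max(states)
    span = highest - lowest
    for idx, value in enumerate(states):
        # Min-max normalization maps the values into [0, 1].
        scaled = (value - lowest) / span if span > 0 else 0.0
        if scaled > threshold:
            covered.add(idx)
    return len(covered) / len(states), covered


ratio, seen = dx_coverage([0.0, 1.0, 2.0, 4.0], threshold=0.4)
```

With a strict comparison, a state whose normalized value equals the threshold is not counted as covered.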


III-D Adversary search

The core algorithm of RNN-Test is presented in Algorithm 1, in which the procedure COVERAGE_BOOST is given in Algorithm 2 and procedures retrieve_states, get_orient are described in the following and § III-E, respectively.

Input: inputs ← sequential inputs for testing
   model ← RNN model under test
   obj_mode ← orientation, coverage, or joint objective
   orient_mode ← one of the three orientations
   guided_cov ← one of the four coverage metrics
   embeddings ← embeddings of the vocabulary
   MAX_SCALE ← maximum degree of scaling the gradient
1:for x in inputs do
2:     /*randomly select one time step (a word/character) to modify*/
3:     t = random.sample(range(len(x)))
4:     h, c = retrieve_states(model)  //get hidden states and cell states
5:     y = predict(x)  //original prediction
6:     obj_orient = get_orient(obj_mode, orient_mode, h, c, y)
7:     obj_cov = COVERAGE_BOOST(obj_mode, guided_cov, h, c)
8:     obj = obj_orient + obj_cov
9:     grads = ∂obj/∂x
10:     x' = GEN_ADV(x, t, grads, embeddings)
11:     if x' != x then
12:          is_generate = True  //whether an adversarial input is obtained
13:          y' = predict(x')  //predict the adversarial input
14:     update_coverage(h, c, guided_cov)
15:     evaluate the model performance  //record the metrics
16:/*generate the adversarial inputs*/
17:procedure GEN_ADV(x, t, grads, embeddings)
18:     x' = x  //start from the original input
19:     for scale in range(1, MAX_SCALE) do
20:          pert = grads[t] * scale  //perturbation for the time step
21:          x'[t] = x[t] + pert  //gradient ascent
22:          for emb in embeddings do  //distances of x'[t] to embeddings
23:               dist_vector.append(norm(x'[t] - emb))
24:          min_emb = embeddings[argmin(dist_vector)]  //the nearest embedding
25:          if min_emb != x[t] then
26:               x'[t] = min_emb
27:               break
28:     return x'
Algorithm 1 RNN-Test algorithm

RNN wrapping. In the inherent implementation of a deep RNN model taking a sequential input, it will output two data elements before making predictions, which are all the hidden states of the last layer, and all the hidden states and cell states of the last time step. For the subsequent workflow, we need the access to all the hidden states and cell states of each layer and time step. We wrap the RNN cell implementation and keep the hidden states and cell states of each cell, thus making all the states available, corresponding to procedure retrieve_states in line 4.

Joint optimization objective. In contrast to training, which minimizes the prediction error by tuning the parameters to achieve the desired performance, adversarial testing[13, 33] tries to maximize its objectives by mutating the test inputs to discover errors. Different optimization objectives steer the RNN model towards different outputs when mutating the input, with diverse capabilities of discovering adversarial inputs. We explore multiple alternatives and combinations of objectives for the adversarial testing of RNN models. The optimization objective here includes two components (Algorithm 1 line 8), adversary orientation maximizing and coverage boosting, corresponding to obj_orient in § III-E and obj_cov in § III-F. Note that the two components can be utilized independently or combined. Moreover, taking the derivative of obj with respect to the input yields the gradient direction along which obj increases or decreases most (Algorithm 1 line 9). Afterwards, RNN-Test mutates the input by scaling the gradients (line 20) and applying them to the input as perturbations (line 21), thereby maximizing the objective and obtaining the adversarial inputs.

Why use the nearest embedding. In the procedure GEN_ADV, we iteratively apply the perturbations and then search the nearest word/character in the embedding space as the adversarial inputs (line 22 to 27 in Algorithm 1). Due to the discreteness of the embedding space of NLP tasks, this is a straightforward way to obtain the adversarial inputs. Besides that, the embedding representations of words or characters in each NLP task are acquired after enough training, which could unveil the semantic properties of the words or characters and solve the task. Searching in the given embedding space could get the adversarial inputs with the existing semantic information.
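The scale-then-snap search described above can be sketched in plain Python. This is a hedged illustration rather than the framework's code: the function names, the use of Euclidean distance, the token-index return value and the default `max_scale` are assumptions.

```python
import math


def nearest_embedding(vec, embeddings):
    """Index of the embedding closest to vec in Euclidean distance."""
    return min(range(len(embeddings)), key=lambda i: math.dist(vec, embeddings[i]))


def gen_adv(x_emb, t, grad_t, embeddings, orig_id, max_scale=10):
    """Scale the gradient until the nearest embedding differs from the original token.

    x_emb: list of embedding vectors of the input sequence.
    t: time step chosen for modification; grad_t: gradient at that step.
    orig_id: vocabulary index of the original token at step t.
    Returns the new token index, or orig_id when no scale changes the token.
    """
    for scale in range(1, max_scale + 1):
        perturbed = [v + scale * g for v, g in zip(x_emb[t], grad_t)]
        new_id = nearest_embedding(perturbed, embeddings)
        if new_id != orig_id:
            return new_id
    return orig_id


vocab = [[0.0, 0.0], [1.0, 0.0], [0.0, 3.0]]
new_token = gen_adv([[0.0, 0.0]], 0, [0.3, 0.0], vocab, orig_id=0)
```

Stopping at the smallest scale that changes the token keeps the replacement as close as possible to the original embedding along the gradient direction.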

Model performance metrics. In sequential tasks, there is no obvious label of the predicted output to identify a generated sequence as the adversarial input, unless introducing the classification labels but violating the principle of generality. Fortunately, the metrics measuring the model performance are a good choice to exhibit the qualities of the adversarial inputs (Algorithm 1 line 15), which are supposed to be accessible in all the tasks.

III-E Adversary orientation maximizing

In RNN-Test, we explore three adversary orientations in adversarial testing for RNNs, including our proposed state inconsistency orientation, adapted cost orientation in FGSM[13] and decision boundary orientation in DLFuzz[16].

State inconsistency orientation. The state inconsistency orientation is designed based upon the inner logic of the RNN cell. As shown in formulas (1) and (2), the hidden states $h_t^l$ increase linearly with the states $h_{t-1}^l$ and $c_t^l$ (if implemented). Therefore, the state inconsistency orientation tries to increase $h_{t-1}^l$ and $c_t^l$ while decreasing $h_t^l$ simultaneously, leading the RNN to unusual behaviours, as formulated in formula (7).


Cost orientation. FGSM and many other works[13, 5] generate adversarial inputs by maximizing the loss between the predicted output label and the original output label. For sequential tasks in RNNs, the loss is mostly the weighted cross-entropy loss for a sequence of logits[25], briefly listed in formula (8), which is encapsulated in the implementation of the model and accessible via APIs.


Decision boundary orientation. Decision boundary orientation is designed to decrease the probability of the original predicted label and increase the probabilities of other top k labels in prediction. For RNN testing, we adapted this idea with respect to the specific time step to mutate in the input, as its outputs are also a vector with softmax probabilities, formulated in (9).
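The idea of this orientation can be illustrated on a single softmax vector. The sketch below follows the description in the text (raise the top k other labels, lower the original one); the function name and the default `k` are assumptions, and in a real framework the objective would be a differentiable tensor rather than a plain number.

```python
def decision_boundary_objective(probs, k=2):
    """Sum of the top-k non-predicted probabilities minus the predicted one.

    probs: softmax probabilities at the time step chosen for mutation.
    Maximizing the returned value pushes the output away from the original label.
    """
    orig = max(range(len(probs)), key=probs.__getitem__)
    others = sorted(
        (i for i in range(len(probs)) if i != orig),
        key=probs.__getitem__,
        reverse=True,
    )[:k]
    return sum(probs[i] for i in others) - probs[orig]


value = decision_boundary_objective([0.6, 0.25, 0.1, 0.05], k=2)
```

The value turns positive once the k runner-up labels together outweigh the originally predicted label.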


III-F Coverage boosting

The coverage boosting module aims to cover the uncovered states and sections, and in this way to search for adversarial inputs. As in formula (10), RNN-Test selects hidden states or cell states and boosts their values. Besides the strategy of randomly selecting states that are uncovered or have uncovered boundary sections, RNN-Test also adopts a boosting procedure that selects the states with values near the boundary section endpoints and guides their values to reach the boundaries, as in Algorithm 2. The given procedure is for the coverage metrics defined on a series of sections. For hidden state coverage and DX coverage, the procedure selects states with values close to being covered.

procedure COVERAGE_BOOST(obj_mode, guided_cov, h, c)
     /*Describe the procedure for CS_C boosting as an example*/
     c_sorted = sort(c)
     c_sorted_r = sort(c)[::-1]  //sort the states reversely
     low_id ← the first in c_sorted with value above the lower boundary section
     high_id ← the first in c_sorted_r with value below the upper boundary section
     lower_states = c_sorted[low_id: low_id + k]
     higher_states = c_sorted_r[high_id: high_id + k]
     obj_cov = higher_states - lower_states
     return obj_cov
Algorithm 2 Coverage boosting procedure
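The selection step of the boosting procedure can be sketched as follows. The endpoints `low_end` and `high_end`, the number of selected states `k` and the dictionary layout are hypothetical, and the objective is computed on plain numbers here, whereas a real implementation would build it from differentiable tensors.

```python
def coverage_boost(values, low_end, high_end, k=2):
    """Pick states just inside the central range and build a boosting objective.

    values: dict mapping a state id to its activation value (assumed layout).
    low_end / high_end: endpoints of the lower and upper boundary sections.
    Returns the states to push down, the states to push up, and the objective.
    """
    ascending = sorted(values, key=values.get)
    descending = ascending[::-1]
    # States slightly above the lower boundary section should decrease.
    lower = [s for s in ascending if values[s] > low_end][:k]
    # States slightly below the upper boundary section should increase.
    higher = [s for s in descending if values[s] < high_end][:k]
    objective = sum(values[s] for s in higher) - sum(values[s] for s in lower)
    return lower, higher, objective


acts = {"a": -0.9, "b": -0.7, "c": -0.2, "d": 0.1, "e": 0.75, "f": 0.95}
lower, higher, obj = coverage_boost(acts, low_end=-0.8, high_end=0.8, k=1)
```

Maximizing the objective raises the selected high states towards the upper boundary section and lowers the selected low states towards the lower one.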

IV Experiment

IV-A Experiment Setup


We developed the framework RNN-Test on the widely deployed framework tensorflow 1.3.0, and evaluated RNN-Test on a computer having Ubuntu 16.04 as the host OS, with an Intel i7-7700HQ@3.6GHz processor of 8 cores, 16GB of memory and an NVIDIA GTX 1070 GPU.

We evaluate RNN-Test on two RNN models processing sequential tasks: the PTB language model with the basic LSTM structure shown in Fig. 3(a), and a sequence-to-sequence (seq2seq) spell checker model with a bidirectional LSTM in the encoding layer and Bahdanau attention[3] in the decoding layer. These two models are selected for their general structures and application contexts.

The PTB language model[50] is a popular RNN model on the Penn Tree Bank dataset and is the implementation of [51]. It takes a piece of text as input and predicts the subsequent text, that is, the word after each input word. We trained the word-based PTB model to a test perplexity of 117.54 on its 'small' config, consistent with the result reported in [51]. This model can be used for text generation, that is, generating new texts similar in style to the training data.

Seq2seq spell checker model[1] receives a sentence with spelling mistakes as input, and outputs the sentence with the mistakes corrected. We trained the character-based model with the sequence loss of 10.1%, similar to 15% they reported. The training data used are twenty popular books from project Gutenberg[12]. We construct 160 test sentences with spelling mistakes like the example sentences they give, thanks to rich sources from Tatoeba[49].

Research Questions (RQs): We constructed the experiments to answer the following research questions.

  • RQ1. Are the adversary orientation maximizing methods effective for adversarial input generation? (§ IV-B)

  • RQ2. Are the proposed coverage metrics helpful for adversarial input generation? (§ IV-C)

  • RQ3. Could retraining with the adversarial inputs improve the RNN models? (§ IV-D)

Evaluation metrics. To answer RQ1 and RQ2, we also present the performance of the tested models on the original test set, as well as on the set of adversarial inputs obtained by randomly replacing a word or character of each input as the baseline setting. Here we list the performance metrics of the tested models together with other necessary metrics.

  • Test perplexity. The inverse probability of the test set, a universal metric for language models, where lower perplexity corresponds to a better model.

  • Average test perplexity. The test perplexity of each input on average.

  • WER. Word error rate, the correlation of predicted outputs with the ground truth as generic metric for seq2seq models, where higher WER means worse predictions.

  • BLEU. Bilingual evaluation understudy, similar to WER but higher BLEU means better predictions.

  • gen_rate. Ratio of the test set the method has successfully produced the adversarial input.

  • orient_rate. Ratio of the generated set obtained by our method, not at random.

  • suc_rate. Ratio of the generated set the corrected mistakes in the original input appearing in the prediction result of the adversarial input.

  • norm. Distortion of the perturbation.
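As a reminder of how the first metric in the list is usually computed, perplexity is the exponential of the average negative log-likelihood of the ground-truth tokens. The sketch below uses that standard definition; the function name and the input format (one probability per token) are assumptions for illustration.

```python
import math


def perplexity(token_probs):
    """Perplexity from the probabilities a model assigned to the true tokens."""
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_nll)


uniform_guess = perplexity([0.25, 0.25, 0.25, 0.25])
```

A model that spreads its probability uniformly over four candidates has a perplexity of about 4, so an adversarial input that raises perplexity makes the model behave as if it were choosing among more candidates.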

IV-B Effectiveness of the adversary orientation methods (RQ1)

We run each orientation method recording each coverage metric on the tested models for 3 times, the same in the baseline setting, so as to alleviate the uncertainty running each time. Each coverage guidance and each combination of joint objectives are also run 3 times in following assessments. In below presented results, we denote the orientation method by their first word, the coverage guidance by the notation of the definitions in  § III-C and their combination as the respective joint objective.

The average results of the orientation methods are summarized in Table I and Table II; the achieved coverage and samples of adversarial inputs are given in RQ2 for ease of comparison. In Table I, the set of adversarial inputs always reaches higher perplexity than the original test set, implying that the adversarial inputs lead the model to expose worse behaviours. Overall, the orientation methods obtain 1.7% higher perplexity than the baseline setting, but the cost and decision boundary orientations obtain lower perplexity on the whole test set than the baseline, possibly due to their smaller distortion and gen_rate.

| Orientation | avg. perplexity per input | test perplexity | gen_rate | norm. |
|---|---|---|---|---|
| original | 355.11 | 157.76 | | |
| baseline | 554.04 | 253.20 | 100% | |
| cost | 603.78 | 245.05 | 95.94% | 28.43 |
| decision | 584.86 | 252.39 | 94.30% | 30.96 |
| state | 618.19 | 275.08 | 100% | 82.14 |
| avg. orientation | 602.28 | 257.51 | 96.75% | 47.18 |

TABLE I: Effectiveness of the adversary orientation methods on PTB model. The best result across the column is denoted in bold and the last row for average results of our methods, the same in following tables.

| Orientation | WER | BLEU | orient_rate | norm. | suc_rate |
|---|---|---|---|---|---|
| original | 3.71 | 0.915 | | | |
| baseline | 5.04 | 0.876 | | | 65.48% |
| cost | 5.10 | 0.877 | 97.38% | 0.002 | 70.24% |
| decision | 5.33 | 0.871 | 100% | 2.32 | 74.29% |
| state | 5.26 | 0.871 | 100% | 10.39 | 73.33% |
| avg. orientation | 5.23 | 0.873 | 99.13% | 4.24 | 72.62% |

TABLE II: Effectiveness of the adversary orientation methods on spell checker model, each with gen_rate as 100%.

For the spell checker model, the adversarial inputs obtained by the orientation methods also reduce the model performance, with higher WER and lower BLEU score. However, except for the state inconsistency orientation with 100% gen_rate, the cost orientation and decision boundary orientation achieve relatively low gen_rate, 74.29% for the former and 85.71% for the latter. For fairness, we boost the gen_rate of each method to 100% by attempting to modify each character of the targeted input until an adversarial input is obtained, and otherwise replacing a character at random. The results are summarized in Table II, showing that the orientations achieve 40.97% and 3.77% higher WER and 4.59% and 0.34% lower BLEU score than the original and baseline settings respectively, and also 7.14 percentage points higher suc_rate than the baseline.
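The relative changes quoted above follow directly from the average row of Table II. The short check below recomputes them; the helper name `rel_change` is introduced here only for illustration.

```python
def rel_change(new, old):
    """Relative change of new with respect to old, in percent."""
    return (new - old) / old * 100.0


# Average orientation row of Table II against the original and baseline rows.
wer_vs_original = rel_change(5.23, 3.71)
wer_vs_baseline = rel_change(5.23, 5.04)
bleu_vs_original = rel_change(0.873, 0.915)
bleu_vs_baseline = rel_change(0.873, 0.876)
suc_rate_gain_points = 72.62 - 65.48

print(round(wer_vs_original, 2), round(wer_vs_baseline, 2))
print(round(bleu_vs_original, 2), round(bleu_vs_baseline, 2))
print(round(suc_rate_gain_points, 2))
```

Note that the WER and BLEU figures are relative changes, whereas the suc_rate figure is a difference between two percentages.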

The answer to RQ1: The state inconsistency orientation increases the perplexity of the PTB model by 1.7% more than the baseline. All the orientation methods achieve higher suc_rate on the spell checker model than the baseline, and the state inconsistency orientation always reaches 100% gen_rate.

IV-C Effectiveness of the coverage metrics (RQ2)

Effectiveness of coverage guidance for adversarial input generation. Table III provides the results of using each coverage metric as guidance for adversarial input generation on the spell checker model. On the PTB model, only guidance achieves 1.7% higher perplexity than the baseline, whereas the others fall below the baseline and are not listed here.

In Table III, all the coverage guidances are effective in adversarial input generation and better than the baseline, where and are not implemented due to the model structure. The gate coverage metrics and achieved the highest suc_rate with the smallest perturbations. Moreover, reached the best WER and BLEU score with 100% orient_rate, while the adapted DX also obtained good results.

Coverage | WER | BLEU | orient_rate | norm. | suc_rate
original | 3.71 | 0.915 | - | - | -
baseline | 5.04 | 0.876 | - | - | 65.48%
 | 4.87 | 0.878 | 100% | 0.02 | 71.43%
 | 5.68 | 0.871 | 100% | 2.41 | 72.62%
 | 5.10 | 0.874 | 97.62% | 0.01 | 77.38%
 | 5.24 | 0.874 | 98.81% | 0.01 | 76.19%
DX | 5.42 | 0.872 | 97.22% | 0.48 | 73.81%
avg. coverage | 5.26 | 0.874 | 98.73% | 0.59 | 74.29%
TABLE III: Effectiveness of the coverage metrics on the spell checker model, each with gen_rate as 100%.

The enhancement of coverage guidance to adversarial input generation. RNN-Test supports joint objectives that combine the orientations with the coverage guidances, in order to search for better means of adversarial testing. Table IV and Table V present, for the two models respectively, each orientation with the two coverage guidances that give the highest and suc_rate among all the combinations.

On both models, the state inconsistency orientation together with guidance achieved better results than the other objectives, apart from its enormous perturbations. As Table I and Table III show, each of the two produces much larger perturbations on its own, and the perturbations become abnormally large when the two are combined here. To address this issue, we attempted to restrict the perturbations of all the methods by dividing them by their respective norms, which brings all the norms below 21. Nevertheless, after restriction, only the state inconsistency orientation combined with guidance still obtains 1.7% higher perplexity than the baseline, while all the others do not. This implies that restriction may not be a good choice for the other methods on the PTB model. In contrast, the results for the spell checker model vary little after restriction.
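
The restriction described above divides each perturbation by its own norm. A minimal sketch of that rescaling follows, assuming an L2 norm over a flat vector; the text does not state which norm or tensor layout RNN-Test uses, and `restrict` is a hypothetical helper introduced only for illustration.

```python
import math

def restrict(perturbation):
    """Scale a perturbation vector by the inverse of its L2 norm (assumed norm)."""
    norm = math.sqrt(sum(v * v for v in perturbation))
    if norm == 0.0:
        return list(perturbation)  # an all-zero perturbation has nothing to scale
    return [v / norm for v in perturbation]

raw = [300.0, -400.0, 0.0]  # hypothetical oversized perturbation with L2 norm 500
scaled = restrict(raw)
print(scaled)
```

The rescaling keeps the direction of the perturbation and changes only its magnitude, so any loss of effectiveness after restriction comes from the smaller step rather than from a different search direction.
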

Next, on average, the joint objectives acquire better results than either the orientations or the coverage guidances alone on the PTB model, indicating that the coverage guidances can enhance adversarial input generation. Meanwhile, the coverage guidances obtained the highest suc_rate and WER on the spell checker model. Note that the coverage guidance methods always have the smallest perturbations. Overall, RNN-Test increases the test perplexity of the PTB model by 58.11% over the original setting and acquires adversarial inputs with a success rate of 73.44%, both averaged over the results of all our methods.

Finally, samples of adversarial inputs on the tested models are listed in Table VI, with each method modifying the same word/character. For the PTB model, different methods tend to generate different words. For the spell checker model, however, most of the methods tend to generate the same character, except the method state+, possibly because of the sparse embedding space.

Joint objective | perplexity (inputs) | perplexity (whole test set) | gen_rate | norm.
cost+ | 595.91 | 255.73 | 96.17% | 30.30
cost+ | 575.23 | 254.78 | 98.36% | 34.34
decision+ | 572.99 | 260.11 | 98.36% | 36.04
decision+ | 602.23 | 267.06 | 95.08% | 32.26
state+ | 684.61 | 299.60 | 100% | 6210.00
state+ | 671.20 | 286.39 | 99.45% | 1940.97
baseline | 554.04 | 253.20 | 100% | -
avg. joint | 605.79 | 258.56 | 97.77% | 1093.29
avg. orientation | 602.28 | 257.51 | 96.75% | 47.18
avg. coverage | 576.69 | 232.22 | 89.50% | 19.47
TABLE IV: Effectiveness of the diverse objectives on the PTB model, where avg. joint gives the results over all combinations of objectives. The best result in each column among the three methods is in bold red, and the same holds for the table below.

Joint objective | WER | BLEU | orient_rate | norm. | suc_rate
cost+ | 4.95 | 0.875 | 97.62% | 0.03 | 73.81%
cost+ | 5.19 | 0.873 | 97.62% | 0.01 | 71.43%
decision+ | 5.24 | 0.872 | 100% | 2.39 | 77.38%
decision+ | 5.11 | 0.871 | 100% | 2.07 | 76.19%
state+ | 5.48 | 0.870 | 100% | 467.93 | 77.38%
state+ | 5.08 | 0.871 | 97.62% | 97.98 | 76.19%
baseline | 5.04 | 0.876 | - | - | 65.48%
avg. joint | 5.19 | 0.873 | 98.65% | 59.38 | 73.41%
avg. orientation | 5.23 | 0.873 | 99.13% | 4.24 | 72.62%
avg. coverage | 5.26 | 0.874 | 98.73% | 0.59 | 74.29%
TABLE V: Effectiveness of the diverse objectives on the spell checker model, each with gen_rate as 100%.
Objective PTB model spell checker model
Input: no it was n’t black monday, but while the
: 259.67
Input: I would swim through theoocean just to see your smile again.
Predict: I would swim through the ocean just to see your smile again.
Input: Tom called me a partypooper because I lef tthe party just afpter midnight.
Predict: Tom called me a partypooper because I left the party just after midnight.
Input: no it told n’t black monday, but while the
orientation: cost, :320.34
Input: no it the n’t black monday, but while the
orientation: decision, :506.38
Input: I would swim through thooocean just to see your smile again.
Predict: I would swim through thoocean just to see your smile again.
Input: Tom called me a partypooper because I Ief tthe party just afpter midnight.
Predict: Tom called me a partypooper because I I the party just after midnight.
Input: no it east n’t black monday, but while the
coverage: , : 1322.99
Input: no it N n’t black monday, but while the
: 446.325
Input: I would swim through thhoocean just to see your smile again.
Predict: I would swim through thooce an just to see your smile again.
Input: Tom called me a partypooper because I eef tthe party just afpter midnight.
Predict: Tom called me a partypooper because I eef the party just after midnight.
TABLE VI: Samples of adversarial inputs on the tested models, the targeted words to modify are in red and underlined.

Our customized metrics compared with the adapted DX coverage. In the evaluations, the boosting procedure is adopted for the redefined DX coverage, rather than the strategy of DeepXplore (randomly selecting one uncovered state). On the PTB model using DX as guidance, our boosting strategy achieves 35.71% higher perplexity than the random strategy of DeepXplore, even with more states selected. Meanwhile, the adapted DX coverage as guidance performs well for the spell checker model, and for both models when combined with the orientations. Finally, the weakness of DX coverage remains evident: with the higher threshold of 0.5 used in DeepXplore, the coverage reaches 90% with at most four inputs on the PTB model, so it discriminates poorly once enough inputs are provided.
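
To illustrate why a threshold-based coverage can saturate after a handful of inputs, the sketch below computes a coverage in the style of DeepXplore's neuron coverage: a unit counts as covered once its value exceeds the threshold for at least one input. It is a simplified illustration, not the redefined DX coverage of RNN-Test; any scaling of the values is omitted, and the function name and activation values are hypothetical.

```python
def threshold_coverage(activations_per_input, threshold=0.5):
    """Fraction of units whose value exceeds `threshold` for at least one input."""
    if not activations_per_input:
        return 0.0
    num_units = len(activations_per_input[0])
    covered = set()
    for activations in activations_per_input:
        for unit, value in enumerate(activations):
            if value > threshold:
                covered.add(unit)  # once covered, a unit stays covered
    return len(covered) / num_units

# Hypothetical hidden-state values for three inputs and four units.
runs = [
    [0.9, 0.1, 0.2, 0.6],
    [0.2, 0.7, 0.1, 0.8],
    [0.3, 0.2, 0.4, 0.9],
]
print(threshold_coverage(runs[:1]), threshold_coverage(runs))
```

Because covered units accumulate across inputs and are never removed, the value can only grow as inputs are added, which is consistent with the poor discrimination noted above.
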

Correlation of the coverage with adversarial inputs. Previous works hold that exercising more logic of DNNs could trigger more wrong behaviours. We analysed the correlations between the evaluation metrics and the values of the coverage metrics. Based on the acquired data, we could not conclude that obtaining higher coverage necessarily results in more incorrect outputs. Fig. 6 presents the results with the most evident correlations; most other results show no clear pattern in such figures.
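
One simple way to quantify such a relationship is a correlation coefficient between the coverage value and the evaluation metric across runs. The sketch below uses the Pearson coefficient as an assumption, since the text does not name the measure used; the numbers are placeholders for illustration only and are not measurements from the experiments.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Placeholder numbers only, NOT results from the paper.
coverage_per_run = [0.10, 0.20, 0.30, 0.40]
perplexity_per_run = [250.0, 240.0, 265.0, 255.0]
print(round(pearson(coverage_per_run, perplexity_per_run), 3))
```

A coefficient near zero across runs would match the observation that higher coverage does not necessarily come with more incorrect outputs.
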

Fig. 6: The evaluation metrics of adversarial inputs with the value of , for (a) the test perplexity of PTB and (b) the success rate of spell checker; the left shows all the methods in different runs and the right the coverage guidance in different runs.

Boosting the coverage. In the coverage boosting module, RNN-Test tries both the boosting procedure and the random strategy to select states to cover. In our evaluations, we found that there is no golden rule for increasing all the coverage metrics. Each strategy has advantages over the other for different tested models and coverage metrics, as shown in Fig. 7. Nevertheless, in most cases, the boosting procedure brings better testing effectiveness, leading the model to perform worse. It should be noted that the coverage values strongly depend on the number of test inputs, so the same number of inputs is expected to yield similar coverage.

Fig. 7: The distribution of coverage metrics (a) on PTB and (b) on spell checker. The x-axis labels are abbreviated: those starting with ‘0’ denote orientations (identified by their first several characters), ‘1’ coverage guidance and ‘2’ joint objectives, and those ending with ‘r’ use the random strategy.

Perturbation similarity between orientation maximizing and coverage boosting. Motivated by the claim that the perturbations generated by coverage guidance are similar to those of the orientation search and therefore add little[27], we record the perturbation vectors obtained over several identical inputs in the experiments. To visualize them, we leverage the state-of-the-art dimensionality reduction technique t-SNE[31] to project the multi-dimensional perturbation vectors to a two-dimensional space, where the orientation methods contribute more data. As Fig. 8 shows, there is no evident similarity among the perturbation vectors of the orientations, coverage guidances and joint objectives. Together with the observations on correlation above, we conjecture that coverage guidance should be used as a distinct way of generating adversarial inputs, not only as a goal to improve.

Fig. 8: The t-SNE transformation of perturbations generated by different objectives for the same test input, (a) on PTB and (b) on spell checker.

The answer to RQ2: The coverage metrics as guidance are also effective in adversarial input generation, enhancing the orientations on the PTB model and achieving the best performance on the spell checker model.

IV-D Improving the RNN models with retraining (RQ3)

For CNN testing, retraining the tested models by augmenting the training set with adversarial inputs could improve the accuracy of the tested models[38, 16]. Inspired by these effects, we tried the same on the PTB model and added adversarial inputs (82.5 KB) to the open-source training set (5.1 MB), where the adversarial inputs were obtained with the decision boundary orientation. The adversarial inputs obtained by the state inconsistency orientation achieved similar results, which are not listed here.

Table VII presents the perplexity of the PTB model before and after retraining, where the train perplexity indicates the performance on the training set and the valid perplexity that on the validation set. The data are averaged over 5 runs of the same retraining process with 12 epochs, to mitigate the effects of the intrinsic nondeterminism of neural networks. Columns 4 and 7 show that, in the end, the train perplexity of the model after retraining increases by 1.082% whereas the valid perplexity decreases by 1.159%. Moreover, the original test perplexity is 117.53 and that after retraining is 102.75, a decline of 12.582%. Notice that even when fewer adversarial inputs (1.6 KB) are incorporated, the valid perplexity still declines by 0.058%.

epoch | train original | train w. adv. | train increment | valid original | valid w. adv. | valid decrement
0 | 290.584 | 288.579 | -0.690% | 190.004 | 192.096 | -1.101%
2 | 113.216 | 113.712 | 0.439% | 140.328 | 140.339 | -0.008%
4 | 86.290 | 87.195 | 1.049% | 132.589 | 132.969 | -0.287%
6 | 56.282 | 56.961 | 1.207% | 121.410 | 120.566 | 0.695%
8 | 46.549 | 47.082 | 1.146% | 122.981 | 121.611 | 1.114%
10 | 43.991 | 44.474 | 1.096% | 123.065 | 121.385 | 1.365%
12 | 43.227 | 43.695 | 1.082% | 122.440 | 121.020 | 1.159%
TABLE VII: The perplexity before and after retraining on the PTB model. Columns 3 and 6 are for the augmented training set, and columns 4 and 7 give the change of the retraining results w.r.t. the original results.

Therefore, the adversarial inputs generated for RNN models are shown to have practical use for improving the models. They could alleviate over-fitting in the training process by slightly reducing the train performance while improving the valid and test performance, and thus the robustness of the RNN model.

The answer to RQ3: The adversarial inputs could also be used to improve the performance of RNN models. By adding 82.5 KB of adversarial inputs to the 5.1 MB training set, the valid and test perplexity of the PTB language model decline by 1.159% and 12.582% respectively.

V Threats to Validity

Though RNN-Test exhibits appreciable effectiveness with the default setting in the evaluations, its performance is inevitably influenced by the parameters, including the scaling degree of the perturbations, the number of states selected to boost, the weights applied to the joint objectives, and especially the ways of splitting the sections of and . Given their important roles, they are worth exploring in future work. Furthermore, run-to-run uncertainty still exists in the presented results, owing to different search directions over the stochastically targeted word/character; this could be diminished by fixing the target.

In addition, RNN-Test is intended to be general and scalable for the variants of RNNs, but we could not exhaustively apply the framework to all the variants and their targeted applications. In this paper, the structures of the tested models are general to some extent, but training the spell checker model still takes considerable effort, because the given training results are hard to reproduce. Moreover, the RNN wrapper is designed to avoid interfering with the computation logic of the model, but adaptation efforts may be necessary for some variants with complex structures.

VI Related Work

Adversarial deep learning. The concept of adversarial attacks was first introduced in [48], which discovered that DNNs would misclassify input images after imperceptible perturbations are applied; these mutated inputs are called adversarial examples/inputs. Their work FGSM[13] and the following works[24, 33, 5, 9] generate adversarial examples by maximizing the prediction error in a gradient-based manner. Multiple trends then developed, including targeted attacks[5, 35] and non-targeted attacks[33, 45], whitebox attacks[13, 5] and blackbox attacks[45], defense techniques[11, 15, 20, 37, 41], and methodologies such as C&W attacks[5] that construct adversarial attacks specifically against the defense methods.

As mentioned before, these works are mostly limited to image classification tasks. Besides, because they do not consider covering the computation logic of the models, they are shown to reach low test coverage[38].
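
For reference, the gradient-based idea behind FGSM[13] is to move each input feature by a small step epsilon in the direction of the sign of the loss gradient. The sketch below shows only that update rule; the input and gradient values are hypothetical, and in practice the gradient is obtained by backpropagation through the model.

```python
def fgsm_step(x, grad, epsilon):
    """One FGSM step: shift each feature by epsilon along the sign of the loss gradient."""
    def sign(v):
        return (v > 0) - (v < 0)
    return [xi + epsilon * sign(gi) for xi, gi in zip(x, grad)]

# Hypothetical input and loss gradient, for illustration only.
x = [0.2, 0.5, 0.9]
grad = [0.03, -1.7, 0.0]
print(fgsm_step(x, grad, epsilon=0.1))
```

Only the sign of the gradient is used, so every feature with a non-zero gradient moves by exactly epsilon, which bounds the perturbation in the max norm.
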

Coverage guided testing of DNN systems. DeepXplore[38] first introduced neuron coverage into deep learning testing, defined over the neurons of DNNs with a pre-defined threshold, which requires redefinition for RNNs as discussed earlier. Because of the coarse granularity of neuron coverage, DeepGauge[29] defines more coverage metrics of finer granularity. The key idea is to record the value range of the outputs on the training data as the major function region and split the region into k (e.g., 1000) sections, which is also unsuitable for the narrow value range of RNN states. DeepCT[30] is even more fine-grained, measuring combinations of neuron outputs. It is noteworthy that [27] argues that these works[38, 46, 30] fail to find more adversarial inputs than the adversary-oriented search and do not measure the robustness of models as effectively as they reported.
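
The section-splitting idea attributed to DeepGauge[29] above can be sketched for a single neuron as follows: the range observed on the training data is cut into k equal sections, and coverage is the fraction of sections hit by the test outputs. This is a simplified illustration under that reading; the function name, the range and the observed values are hypothetical.

```python
def k_multisection_coverage(low, high, k, observed):
    """Fraction of the k equal sections of [low, high] hit by one neuron's outputs."""
    if k <= 0 or high <= low:
        raise ValueError("need k > 0 and high > low")
    width = (high - low) / k
    hit = set()
    for value in observed:
        if low <= value <= high:
            # Clamp the index so that value == high falls into the last section.
            hit.add(min(int((value - low) / width), k - 1))
    return len(hit) / k

# Hypothetical neuron whose training outputs span [0.0, 1.0], split into 10 sections.
print(k_multisection_coverage(0.0, 1.0, 10, [0.05, 0.07, 0.55, 1.0]))
```

With a narrow range such as [-1, 1] and k = 1000, each section is only 0.002 wide, which gives some intuition for why this splitting fits the narrow value range of RNN states poorly.
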

The main difference between these works and ours is that we primarily use coverage as guidance to obtain the adversarial inputs directly, whereas the coverage metrics in other works always serve as indicators of the effectiveness of their approaches.

Adversarial attacks for recurrent neural networks. Owing to the effectiveness of RNNs[17] on tasks like speech recognition and natural language processing, adversarial attacks are also applied to RNNs to evaluate their robustness. Besides the works[23, 40, 28, 34] that add, delete or substitute a word/character to construct the adversarial inputs, some methods[40] replace the targeted word with its synonym. Other approaches[42] restrict the directions of perturbations toward the existing words in the input embedding space. In summary, these works are effective but limited to classification scenarios.
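
The add, delete and substitute strategies mentioned above can be pictured as enumerating single-character edits of the input. The sketch below generates such candidates at one position; it is a generic illustration rather than the procedure of any cited work, and the function name and the lowercase alphabet are assumptions.

```python
import string

def single_char_edits(text, position, alphabet=string.ascii_lowercase):
    """All strings obtained by deleting, substituting, or inserting one character at `position`."""
    if not 0 <= position < len(text):
        raise IndexError("position out of range")
    head, old, tail = text[:position], text[position], text[position + 1:]
    candidates = {head + tail}  # deletion
    # Substitution with every other character of the alphabet.
    candidates.update(head + c + tail for c in alphabet if c != old)
    # Insertion of one character just before the original one.
    candidates.update(head + c + old + tail for c in alphabet)
    return sorted(candidates)

edits = single_char_edits("the", 1)
print(len(edits), edits[:3])
```

An attack would then query the model on these candidates and keep one that changes the output, whereas a gradient-guided method narrows the choice by using the perturbation direction.
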

Few works evaluate tasks that produce sequential outputs. The work[36] first explains the definition of adversarial inputs for RNNs with categorical outputs and sequential outputs, but for the evaluation on sequential outputs it only gives a rough qualitative description that adversarial inputs could change the outputs. Another work, TensorFuzz[34], produced adversarial inputs that lead the language model to sample words from a blacklist, which is not even specified in the paper. For the state-of-the-art adversarial attacks[6] on speech recognition, the perturbations obtained could be applied to the audio waves in a similar way to images, but several issues remain unsettled[6]. The testing works for sequential outputs, especially texts, are inadequate and leave threats for the majority of application scenarios with sequential outputs.

VII Conclusions

We design and implement a generic adversarial testing framework RNN-Test for recurrent neural networks, integrating diverse adversary orientations and coverage metrics customized for RNNs, with support for free combinations. RNN-Test focuses on the main sequential contexts without being limited to classification tasks, and is the first to leverage coverage guidance to directly obtain adversarial inputs. In the evaluation, RNN-Test effectively generated adversarial inputs that increase the test perplexity of the PTB language model by 58.11% on average and cause the spell checker model to fail to correct mistakes with an average success rate of 73.44%. Finally, the adversarial inputs can be employed to retrain the PTB model and decrease its valid perplexity and test perplexity by 1.159% and 12.582% respectively.


  • [1] (2017-June.)(Website) External Links: Link Cited by: §I, §IV-A.
  • [2] D. Amodei, S. Ananthanarayanan, R. Anubhai, J. Bai, E. Battenberg, C. Case, J. Casper, B. Catanzaro, Q. Cheng, G. Chen, J. Chen, J. Chen, Z. Chen, M. Chrzanowski, A. Coates, G. Diamos, K. Ding, N. Du, E. Elsen, J. Engel, W. Fang, L. Fan, C. Fougner, L. Gao, C. Gong, A. Hannun, T. Han, L. Johannes, B. Jiang, C. Ju, B. Jun, P. LeGresley, L. Lin, J. Liu, Y. Liu, W. Li, X. Li, D. Ma, S. Narang, A. Ng, S. Ozair, Y. Peng, R. Prenger, S. Qian, Z. Quan, J. Raiman, V. Rao, S. Satheesh, D. Seetapun, S. Sengupta, K. Srinet, A. Sriram, H. Tang, L. Tang, C. Wang, J. Wang, K. Wang, Y. Wang, Z. Wang, Z. Wang, S. Wu, L. Wei, B. Xiao, W. Xie, Y. Xie, D. Yogatama, B. Yuan, J. Zhan, and Z. Zhu (2016-20–22 Jun) Deep speech 2 : end-to-end speech recognition in english and mandarin. In Proceedings of The 33rd International Conference on Machine Learning, M. F. Balcan and K. Q. Weinberger (Eds.), Proceedings of Machine Learning Research, Vol. 48, New York, New York, USA, pp. 173–182. External Links: Link Cited by: §I.
  • [3] D. Bahdanau, K. Cho, and Y. Bengio (2014) Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473. Cited by: §IV-A.
  • [4] M. Bojarski, D. Del Testa, D. Dworakowski, B. Firner, B. Flepp, P. Goyal, L. D. Jackel, M. Monfort, U. Muller, J. Zhang, et al. (2016) End to end learning for self-driving cars. arXiv preprint arXiv:1604.07316. Cited by: §I.
  • [5] N. Carlini and D. Wagner (2017) Towards evaluating the robustness of neural networks. In 2017 38th IEEE Symposium on Security and Privacy (SP), pp. 39–57. Cited by: §III-E, §VI.
  • [6] N. Carlini and D. Wagner (2018) Audio adversarial examples: targeted attacks on speech-to-text. arXiv preprint arXiv:1801.01944. Cited by: §VI.
  • [7] K. Cho, B. Van Merriënboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, and Y. Bengio (2014) Learning phrase representations using rnn encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078. Cited by: §II-B.
  • [8] R. Collobert and J. Weston (2008) A unified architecture for natural language processing: deep neural networks with multitask learning. In Proceedings of the 25th International Conference on Machine Learning, ICML ’08, New York, NY, USA, pp. 160–167. External Links: ISBN 978-1-60558-205-4, Link, Document Cited by: §I.
  • [9] Y. Dong, F. Liao, T. Pang, H. Su, J. Zhu, X. Hu, and J. Li (2018, Spotlight) Boosting adversarial attacks with momentum. In Proceedings of 2018 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Cited by: §VI.
  • [10] X. Du, X. Xie, Y. Li, L. Ma, J. Zhao, and Y. Liu (2018) DeepCruiser: automated guided testing for stateful deep learning systems. arXiv preprint arXiv:1812.05339. Cited by: §I.
  • [11] G. K. Dziugaite, Z. Ghahramani, and D. M. Roy (2016) A study of the effect of jpg compression on adversarial images. arXiv preprint arXiv:1608.00853. Cited by: §VI.
  • [12] (2016-Apr.)(Website) External Links: Link Cited by: §IV-A.
  • [13] I. J. Goodfellow, J. Shlens, and C. Szegedy (2015) Explaining and harnessing adversarial examples. Computer Science. Cited by: §I, §I, §III-A, §III-D, §III-E, §III-E, §VI.
  • [14] A. Graves, A. Mohamed, and G. Hinton (2013) Speech recognition with deep recurrent neural networks. In 2013 IEEE international conference on acoustics, speech and signal processing, pp. 6645–6649. Cited by: §I.
  • [15] C. Guo, M. Rana, M. Cisse, and L. van der Maaten (2017) Countering adversarial images using input transformations. arXiv preprint arXiv:1711.00117. Cited by: §VI.
  • [16] J. Guo, Y. Jiang, Y. Zhao, Q. Chen, and J. Sun (2018) DLFuzz: differential fuzzing testing of deep learning systems. In Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, pp. 739–743. Cited by: §I, §III-A, §III-E, §IV-D.
  • [17] S. Haykin (1994) Neural networks. Vol. 2, Prentice hall New York. Cited by: §VI.
  • [18] K. He, X. Zhang, S. Ren, and J. Sun (2016) Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 770–778. Cited by: §I.
  • [19] G. Hinton, G. Dahl, A. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, B. Kingsbury, and T. Sainath (2012-11) Deep neural networks for acoustic modeling in speech recognition. IEEE Signal Processing Magazine 29, pp. 82–97. External Links: Link Cited by: §I.
  • [20] G. Hinton, O. Vinyals, and J. Dean (2015) Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531. Cited by: §VI.
  • [21] S. Hochreiter and J. Schmidhuber (1997) Long short-term memory. Neural Computation 9 (8), pp. 1735–1780. Cited by: §II-B.
  • [22] A. Krizhevsky, I. Sutskever, and G. E. Hinton (2012) ImageNet classification with deep convolutional neural networks. In International Conference on Neural Information Processing Systems, Cited by: §I.
  • [23] V. Kuleshov, S. Thakoor, T. Lau, and S. Ermon (2018) Adversarial examples for natural language classification problems. Cited by: §I, §VI.
  • [24] A. Kurakin, I. Goodfellow, and S. Bengio (2017) Adversarial examples in the physical world. In Proceedings of the 2nd International Conference on Learning Representations, Cited by: §VI.
  • [25] Y. LeCun, Y. Bengio, and G. Hinton (2015) Deep learning. nature 521 (7553), pp. 436. Cited by: §III-E.
  • [26] Y. LeCun, Y. Bengio, et al. (1995) Convolutional networks for images, speech, and time series. The handbook of brain theory and neural networks 3361 (10), pp. 1995. Cited by: §I.
  • [27] Z. Li, X. Ma, C. Xu, and C. Cao (2019) Structural coverage criteria for neural networks could be misleading. Cited by: §IV-C, §VI.
  • [28] B. Liang, H. Li, M. Su, P. Bian, X. Li, and W. Shi (2017) Deep text classification can be fooled. arXiv preprint arXiv:1704.08006. Cited by: §VI.
  • [29] L. Ma, F. Juefei-Xu, F. Zhang, J. Sun, M. Xue, B. Li, C. Chen, T. Su, L. Li, Y. Liu, et al. (2018) Deepgauge: multi-granularity testing criteria for deep learning systems. In Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering, pp. 120–131. Cited by: §I, §III-B, §VI.
  • [30] L. Ma, F. Zhang, M. Xue, B. Li, Y. Liu, J. Zhao, and Y. Wang (2018) Combinatorial testing for deep learning systems. arXiv preprint arXiv:1806.07723. Cited by: §VI.
  • [31] L. v. d. Maaten and G. Hinton (2008) Visualizing data using t-sne. Journal of machine learning research 9 (Nov), pp. 2579–2605. Cited by: §IV-C.
  • [32] T. Mikolov, S. Kombrink, L. Burget, J. Černockỳ, and S. Khudanpur (2011) Extensions of recurrent neural network language model. In 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5528–5531. Cited by: §I.
  • [33] S. M. Moosavi Dezfooli, A. Fawzi, and P. Frossard (2016) Deepfool: a simple and accurate method to fool deep neural networks. In Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Cited by: §I, §III-D, §VI.
  • [34] A. Odena and I. Goodfellow (2018) Tensorfuzz: debugging neural networks with coverage-guided fuzzing. arXiv preprint arXiv:1807.10875. Cited by: §I, §VI, §VI.
  • [35] N. Papernot, P. McDaniel, S. Jha, M. Fredrikson, Z. B. Celik, and A. Swami (2016) The limitations of deep learning in adversarial settings. In Security and Privacy (EuroS&P), 2016 IEEE European Symposium on, pp. 372–387. Cited by: §I, §VI.
  • [36] N. Papernot, P. McDaniel, A. Swami, and R. Harang (2016) Crafting adversarial input sequences for recurrent neural networks. In Military Communications Conference, MILCOM 2016-2016 IEEE, pp. 49–54. Cited by: §I, §VI.
  • [37] N. Papernot, P. McDaniel, X. Wu, S. Jha, and A. Swami (2016) Distillation as a defense to adversarial perturbations against deep neural networks. In 2016 IEEE Symposium on Security and Privacy (SP), pp. 582–597. Cited by: §VI.
  • [38] K. Pei, Y. Cao, J. Yang, and S. Jana (2017) Deepxplore: automated whitebox testing of deep learning systems. In Proceedings of the 26th Symposium on Operating Systems Principles, pp. 1–18. Cited by: §I, §I, §IV-D, §VI, §VI.
  • [39] P. Rodriguez, J. Wiles, and J. L. Elman (1999) A recurrent neural network that learns to count. Connection Science 11 (1), pp. 5–40. Cited by: §I.
  • [40] S. Samanta and S. Mehta (2017) Towards crafting text adversarial samples. arXiv preprint arXiv:1707.02812. Cited by: §I, §VI.
  • [41] S. Sankaranarayanan, A. Jain, R. Chellappa, and S. N. Lim (2018) Regularizing deep networks using efficient layerwise adversarial training. In Thirty-Second AAAI Conference on Artificial Intelligence, Cited by: §VI.
  • [42] M. Sato, J. Suzuki, H. Shindo, and Y. Matsumoto (2018) Interpretable adversarial perturbation in input embedding space for text. arXiv preprint arXiv:1805.02917. Cited by: §I, §VI.
  • [43] D. Shen, G. Wu, and H. Suk (2017) Deep learning in medical image analysis. Annual review of biomedical engineering 19, pp. 221–248. Cited by: §I.
  • [44] K. Simonyan and A. Zisserman (2014) Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556. Cited by: §I.
  • [45] J. Su, D. V. Vargas, and S. Kouichi (2017) One pixel attack for fooling deep neural networks. arXiv preprint arXiv:1710.08864. Cited by: §VI.
  • [46] Y. Sun, X. Huang, and D. Kroening (2018) Testing deep neural networks. arXiv preprint arXiv:1803.04792. Cited by: §I, §VI.
  • [47] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna (2016-06) Rethinking the inception architecture for computer vision. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Cited by: §I.
  • [48] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus (2013) Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199. Cited by: §I, §VI.
  • [49] (2006)(Website) External Links: Link Cited by: §IV-A.
  • [50] (2016-Sep.)(Website) External Links: Link Cited by: §I, §IV-A.
  • [51] W. Zaremba, I. Sutskever, and O. Vinyals (2014) Recurrent neural network regularization. arXiv preprint arXiv:1409.2329. Cited by: §II-B, §IV-A.