1 Introduction
As the adoption of machine learning models to everyday tasks grows rapidly, so does the need to consider the ethical, moral, and social consequences of the decisions made by such models. Several important questions arise in a variety of applications, such as the following: 1) how did the model predict what it predicted?, 2) if a person got an unfavorable outcome from the model, what can they do to change that?, 3) has the model been unfair to a particular group?, and 4) how easily can the model be fooled? Researchers are actively building separate approaches to answer each of these questions.
Method  Black-box  Model-agnostic  Mixed data  Explainability  Fairness  Robustness

CERTIFAI
[Ustun et al.2019]
[Wachter et al.2017]
[Russell2019]
[Ribeiro et al.2016]
[Guidotti et al.2018a]
[Carlini and Wagner2017]
[Weng et al.2018]
Table 1: Key features of CERTIFAI and the most closely related work.
One promising vein of research in explainability, first introduced by [Wachter et al.2017], is generating counterfactuals. Given an input data point and a blackbox machine learning model (i.e. we only have access to the model’s prediction for any input), a counterfactual is defined as a generated data point that is as close to the input data point as possible but for which the model gives a different outcome. For example, if a user was denied a loan by a machine learning model, an example counterfactual explanation could be: “Had your income been $5000 greater per year and your credit score been 30 points higher, your loan would be approved.” [Wachter et al.2017] argue that counterfactuals are a way of explaining model results to users such that they can identify actionable ways of changing their behaviors to obtain favorable predictions. In addition to providing counterfactuals for explainability, we show how counterfactual explanations can be used to audit fairness and robustness of a model.
As promising as the original method [Wachter et al.2017] and subsequent methods of generating counterfactuals [Ustun et al.2019, Russell2019] are, they are also limited in that some only work for linear models, while others cannot deal with different data types. To resolve these limitations, we introduce CERTIFAI, a novel, flexible, model-agnostic technique for generating counterfactuals via a custom genetic algorithm. The meta-heuristic evolutionary algorithm starts by generating a random set of points that do not have the same prediction as the input point. A subsequent evolutionary process results in a set of points close to the input that maintain the prediction constraint. Figure 1 shows an example of three counterfactuals (green points) generated for a given input (black point).
A major advantage of using the genetic algorithm is that it can generate counterfactuals for linear and nonlinear models (e.g. deep networks) and for any input form (from mixed tabular data to image data) without any approximations to or assumptions about the model. Moreover, end-users can 1) define a range for any feature, and 2) restrict the features that can change. CERTIFAI simply constrains the values of the sampled points based on those choices, allowing the generated counterfactuals to reflect a user’s understanding of how much it is possible for them to change their features.
CERTIFAI can be used to audit any black-box classifier, providing three tools that are based on a single underlying algorithm. The major contributions of this paper are summarized as follows:

Counterfactuals are generated using a custom genetic algorithm, which is model-agnostic, flexible, and can be used to provide explanations.

Counterfactuals are shown to be effective adversarial examples. They are also used to generate the Counterfactual Explanation-based Robustness Score (CERScore), which to the best of our knowledge is the first black-box model robustness score.

Counterfactuals can be used to evaluate fairness with respect to a user’s input as well as the fairness of the model towards groups of individuals.
2 Related Work
Table 1 summarizes the key features of CERTIFAI and the work most related to it. Other general methods for explainability, fairness, and robustness are described by [Guidotti et al.2018b], [Binns2017], and [Akhtar and Mian2018], respectively. As Table 1 shows, no prior work handles the diverse set of properties needed to develop a responsible AI system. [Wachter et al.2017] introduced counterfactual explanations; however, their optimization formulation cannot handle categorical data (i.e. the optimization does not solve for such data; the values are found in a brute-force fashion). The methods in [Ustun et al.2019] and [Russell2019] only work for linear models. [Guidotti et al.2018a] use a genetic algorithm to generate neighbors around the input and then use decision trees to locally approximate the model. However, local approximations might come at the cost of model accuracy, and they define counterfactuals based on the minimum number of splits in the trained decision tree, which might not always be the smallest change to the input. The closest work on generating adversarial examples is by [Carlini and Wagner2017]. They work on white-box and simpler convolutional models, and their distance metrics might not be apt to measure image similarity [Wang et al.2002]. [Weng et al.2018] define a robustness score, CLEVER. However, they require access to the model gradients.
3 The CERTIFAI framework
In this section, we formulate a custom genetic algorithm to find counterfactual(s). Consider a black-box classifier $f$ and an input instance $x$. Let the counterfactual be a feasible generated point $c$. Then the problem can be formulated as:
$$\min_{c}\; d(x, c) \quad \text{s.t.} \quad f(c) \neq f(x) \tag{1}$$
where $d(x, c)$ is the distance between $x$ and $c$. To avoid using any approximations to or assumptions about the model, we use a genetic algorithm to solve Equation 1. The custom genetic algorithm is model-agnostic: it works for any black-box model and any input data type. Additionally, it provides a great deal of flexibility in counterfactual generation.
CERTIFAI’s genetic algorithm solves the optimization problem in Equation 1 through a process of natural selection. The only mandatory inputs for the genetic algorithm are the black-box classifier $f$ and an input instance $x$. Generally, for an $n$-dimensional input vector $x$, let $W$ represent the space from which individuals can be generated and $P$ be the set of points with the same prediction as $x$:
$$P = \{p \mid f(p) = f(x),\; p \in W\} \tag{2}$$
The possible set of individuals $c \in I$ is defined such that
$$I = W \setminus P \tag{3}$$
Each individual $c \in I$ is a candidate counterfactual. The goal is to find the fittest possible $c^*$ to $x$ constrained on $c^* \in I$. The fitness for an individual $c$ is defined as:
$$\text{fitness} = \frac{1}{d(x, c)} \tag{4}$$
Here $c^*$ will then be the point closest to $x$ such that $c^* \in I$. For a multi-class case, if a user wants the counterfactual $c$ to belong to a particular class $j$, we define $Q$ as:
$$Q = \{q \mid f(q) = j,\; q \in W\} \tag{5}$$
Then Equation 3 becomes:
$$I = (W \cap Q) \setminus P \tag{6}$$
The algorithm is carried out as follows: first, an initial population is built by randomly generating points that belong to $I$. Individuals are then evolved through three processes: selection, mutation, and crossover. Selection chooses individuals that have the best fitness scores (Equation 4). A proportion of these individuals (dependent on $p_m$, the probability of mutation) are then subjected to mutation, which involves arbitrarily changing some feature values. A proportion of individuals (dependent on $p_c$, the probability of crossover) are then subjected to crossover, which involves randomly interchanging some feature values between individuals. The population is then restricted to the individuals that meet the required constraint (Equation 3 or Equation 6), and the fitness scores of the new individuals are calculated. This is repeated until the maximum number of generations is reached. Finally, the individual(s) with the best fitness score(s) is/are chosen as the desired counterfactual(s). (Footnote 2: $p_m = 0.2$ and $p_c = 0.5$, which is standard in the literature. The population size is the square of the input feature size, with a maximum cap of 30,000. Grid search is used to find the number of generations.)
3.1 Choice of distance function
The choice of distance function used in Equation 1 depends on the details provided by the model creator and the type of data being considered. If the data is tabular, [Wachter et al.2017] demonstrated that the $L_1$ norm normalized by the median absolute deviation (MAD) is better than the unnormalized $L_1$ or $L_2$ norm for counterfactual generation. For tabular data, the $L_1$ norm for continuous features (NormAbs) and a simple matching distance for categorical features (SimpMat) are chosen as default. In the absence of training data, normalization using MAD is not possible. However, in model development and in our experiments, where there is access to training data, normalization is possible. The distance metric used is:
$$d(x, c) = \frac{n_{con}}{n}\,\mathrm{NormAbs}(x, c) + \frac{n_{cat}}{n}\,\mathrm{SimpMat}(x, c) \tag{7}$$
where $n_{con}$ and $n_{cat}$ are the number of continuous and categorical features, respectively, and $n$ is the total number of features ($n_{con} + n_{cat} = n$).
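The tabular distance described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the per-feature averaging inside NormAbs and SimpMat, and the example columns (income, credit score, housing) are all assumptions made for the example.

```python
from statistics import median


def mad(values):
    """Median absolute deviation of one training column."""
    center = median(values)
    return median(abs(v - center) for v in values)


def tabular_distance(x, c, continuous, categorical, mads):
    """Mixed-data distance: weighted NormAbs (continuous) plus SimpMat (categorical)."""
    n_con, n_cat = len(continuous), len(categorical)
    n = n_con + n_cat
    # NormAbs (assumed form): MAD-normalized absolute differences, averaged over continuous features.
    norm_abs = sum(abs(x[i] - c[i]) / mads[i] for i in continuous) / n_con if n_con else 0.0
    # SimpMat (assumed form): fraction of categorical features whose values differ.
    simp_mat = sum(x[i] != c[i] for i in categorical) / n_cat if n_cat else 0.0
    return (n_con / n) * norm_abs + (n_cat / n) * simp_mat


# Hypothetical training columns used only to estimate the MAD of each continuous feature.
incomes = [30000, 40000, 50000, 60000, 45000]
scores = [600, 650, 700, 720, 680]
mads = {0: mad(incomes), 1: mad(scores)}

x = [40000, 650, "renter"]  # input instance
c = [45000, 680, "renter"]  # candidate counterfactual
d = tabular_distance(x, c, continuous=[0, 1], categorical=[2], mads=mads)
```

In this toy case both continuous features move by exactly one MAD and the categorical feature is unchanged, so the distance is two thirds. A feature whose MAD is zero would need special handling, which the sketch omits.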
For image data, the Euclidean distance and absolute distance between two images are not good measures of image similarity [Wang et al.2002]. Hence, we use SSIM (Structural Similarity Index Measure) [Wang et al.2003], which has been shown to be a better measure of what humans consider to be similar images [Wang et al.2002]. SSIM values lie between 0 and 1, where a higher SSIM value means that two images look more similar to each other. For the input image x and counterfactual image c, the distance is:
$$d(x, c) = \frac{1}{\mathrm{SSIM}(x, c)} \tag{8}$$
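As a rough illustration of an SSIM-based distance, the sketch below computes SSIM once over the whole image rather than over sliding windows, and takes the distance to be the reciprocal of SSIM. Both choices are assumptions: the reciprocal form is inferred from the 1/1.17 = 0.85 interpretation in Section 4.1.2, and the single-window simplification is only for brevity. The function names are hypothetical.

```python
def global_ssim(a, b, dynamic_range=1.0):
    """SSIM computed once over two flattened images (single-window simplification)."""
    n = len(a)
    mu_a, mu_b = sum(a) / n, sum(b) / n
    var_a = sum((p - mu_a) * (p - mu_a) for p in a) / n
    var_b = sum((q - mu_b) * (q - mu_b) for q in b) / n
    cov = sum((p - mu_a) * (q - mu_b) for p, q in zip(a, b)) / n
    c1 = (0.01 * dynamic_range) ** 2  # stabilizing constants from the SSIM definition
    c2 = (0.03 * dynamic_range) ** 2
    luminance = (2 * mu_a * mu_b + c1) / (mu_a * mu_a + mu_b * mu_b + c1)
    contrast_structure = (2 * cov + c2) / (var_a + var_b + c2)
    return luminance * contrast_structure


def ssim_distance(x, c):
    """Assumed distance: reciprocal of SSIM, so identical images are at distance 1."""
    s = global_ssim(x, c)
    if s <= 0:
        raise ValueError("reciprocal distance needs a positive SSIM")
    return 1.0 / s


image = [0.0, 0.5, 1.0, 0.5] * 4  # hypothetical 4x4 image, pixel values in [0, 1]
perturbed = list(image)
perturbed[0] = 0.2                # change a single pixel
d_same = ssim_distance(image, image)
d_pert = ssim_distance(image, perturbed)
```

Under this assumed form, an image compared with itself has distance 1, and any perturbation pushes the distance above 1, which matches CERScores slightly greater than 1 for imperceptible changes.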
3.2 Improving counterfactuals with constraints
Apart from the input instance and black-box model, additional inputs help the algorithm produce better results. Auxiliary constraints are incorporated by restricting $W$, the space from which individuals can be generated, to ensure feasible solutions. For an $n$-dimensional input, let $W$ be the Cartesian product of the sets $W_1, W_2, \ldots, W_n$. For continuous features, $W_i$ can be constrained as $W_i \in [W_i^{\min}, W_i^{\max}]$, and categorical features can be constrained to a set of allowed values, $W_i \in \{w_1, w_2, \ldots, w_j\}$. However, certain variables might be immutable (e.g., race). In these cases, a feature $i$ for an input $x$ can be muted by setting $W_i = x_i$.
As an example of the benefit of such constraints, a user may not want an explanation that requires an income change from $10,000 to $900,000 if that is not possible, so a narrower income range that the user considers achievable might be an appropriate constraint. The number of counterfactuals can also be set. CERTIFAI chooses the top individuals (a default number is used unless the user sets one) in which different features have changed, so the end-user can get multiple diverse explanations of different kinds.
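The genetic search of this section, including feature ranges and muted features, can be sketched as follows. This is a simplified illustration under stated assumptions and not the CERTIFAI implementation: the function name, the plain L1 distance, the keep-the-fitter-half selection rule, and the toy classifier are all hypothetical choices.

```python
import random


def find_counterfactual(f, x, ranges, muted=(), pop_size=200, generations=40,
                        p_mut=0.2, p_cross=0.5, seed=0):
    """Toy genetic search for a point near x whose prediction differs from f(x)."""
    rng = random.Random(seed)
    original = f(x)
    n = len(x)

    def sample():
        # Muted features keep the input value; the others are drawn from their range.
        return [x[i] if i in muted else rng.uniform(*ranges[i]) for i in range(n)]

    def distance(c):
        return sum(abs(a - b) for a, b in zip(x, c))  # plain L1, for illustration only

    def feasible(c):
        return f(c) != original

    # Initial population: random points that already flip the prediction.
    population = [c for c in (sample() for _ in range(20 * pop_size)) if feasible(c)][:pop_size]
    if not population:
        return None

    for _ in range(generations):
        # Selection: keep the fitter half (fitness = 1 / distance, so smaller distance wins).
        population.sort(key=distance)
        parents = population[:max(1, len(population) // 2)]
        children = []
        for parent in parents:
            child = list(parent)
            if rng.random() < p_mut:  # mutation: resample one free feature
                i = rng.randrange(n)
                if i not in muted:
                    child[i] = rng.uniform(*ranges[i])
            if rng.random() < p_cross:  # crossover: take some features from a mate
                mate = rng.choice(parents)
                child = [m if rng.random() < 0.5 else g for g, m in zip(child, mate)]
            children.append(child)
        # Keep only individuals that still satisfy the prediction constraint.
        population = [c for c in parents + children if feasible(c)]

    return min(population, key=distance)


def toy_model(v):
    return int(v[0] + v[1] > 10)  # hypothetical black box: only its predictions are used


x = [3.0, 4.0, 1.0]
ranges = [(0.0, 10.0), (0.0, 10.0), (0.0, 5.0)]
cf = find_counterfactual(toy_model, x, ranges, muted={2})
```

Because every individual is drawn from the user-supplied ranges and muted features are never altered, the returned point respects the constraints by construction; only the classifier's predictions are queried.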
3.3 Robustness
Machine learning models are prone to attacks and threats. For example, deep learning models have performed exceedingly well for image recognition tasks, but it has been widely shown [Carlini and Wagner2017], [Nguyen et al.2015] that these networks are prone to adversarial attacks. Two images may look the same to a human, but when presented to a model, they can produce different outcomes. A counterfactual is a generated point close to an input that changes the prediction and is therefore an adversarial example.
Given two black-box models, if the counterfactuals across classes are farther away from the input instances on average for one network than for the other, that network would be harder to fool. Since CERTIFAI directly gives a measure of distance $d(x, c)$, this can be used to define the robustness score for a classifier. Using this distance, we introduce the Counterfactual Explanation-based Robustness Score (CERScore), which to the best of our knowledge is the first black-box model robustness score. Given a model, the CERScore is defined as the expected distance between the input instances and their corresponding counterfactuals:
$$\text{CERScore}(\text{model}) = \mathbb{E}_{X}\left[d(x, c^*)\right] \tag{9}$$
To be able to better compare models trained on different data sets, the CERScore can be normalized by the expected value of the distance between data points in each class over all classes k, and hence we get the normalized CERScore NCERScore (abbreviated as NC) as:
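$$\text{NCERScore} = \frac{\mathbb{E}_{X}\left[d(x, c^*)\right]}{\sum_{k=1}^{K} P(x \in \text{class } k)\; \mathbb{E}_{X}\left[d(x_i, x_j) \mid x_i, x_j \in \text{class } k\right]} \tag{10}$$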
(i.e., we normalize by dividing by the expected distance between two datapoints drawn from the same class). A higher CERScore implies that the model is more robust. Note that the normalized CERScore can be greater than 1. Unlike [Weng et al.2018], CERTIFAI only needs model predictions and not the model internals.
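A small sketch of how the two scores could be estimated from a sample is given below. It assumes that counterfactuals have already been generated, that the class probabilities are estimated by class frequencies, and that the within-class expectation is taken over all unordered pairs; the function names and the 1-D toy data are hypothetical.

```python
from itertools import combinations
from statistics import mean


def cer_score(inputs, counterfactuals, distance):
    """Empirical CERScore: mean distance between inputs and their counterfactuals."""
    return mean(distance(x, c) for x, c in zip(inputs, counterfactuals))


def normalized_cer_score(inputs, counterfactuals, labels, distance):
    """CERScore divided by the class-frequency-weighted mean within-class distance."""
    by_class = {}
    for x, y in zip(inputs, labels):
        by_class.setdefault(y, []).append(x)
    normalizer = 0.0
    for members in by_class.values():
        pairs = list(combinations(members, 2))
        if pairs:  # a class with a single point contributes nothing
            within = mean(distance(a, b) for a, b in pairs)
            normalizer += (len(members) / len(inputs)) * within
    return cer_score(inputs, counterfactuals, distance) / normalizer


def l1(a, b):
    return sum(abs(p - q) for p, q in zip(a, b))


# Hypothetical 1-D data: two classes and precomputed counterfactuals.
inputs = [[0.0], [2.0], [10.0], [14.0]]
labels = [0, 0, 1, 1]
counterfactuals = [[6.0], [6.0], [4.0], [4.0]]

cers = cer_score(inputs, counterfactuals, l1)
ncers = normalized_cer_score(inputs, counterfactuals, labels, l1)
```

Here the mean input-to-counterfactual distance is 6.5 and the weighted within-class distance is 3, so the normalized score exceeds 1, which the text notes is possible.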
3.4 Fairness
The fitness measure (Equation 4) and CERScore can also be used to investigate fairness from individual and group perspectives, respectively. For a given individual instance, if the genetic algorithm can generate different counterfactuals with different values of a protected feature (e.g., race, age), and as a result the user can achieve the desired outcome more easily than when those features could not be changed, then the individual could claim the model is unfair to their case. Additionally, CERTIFAI can be used by model developers to audit the fairness for different groups of observations. If the fitness measure is markedly different for counterfactuals generated for the different partitions of a feature’s domain value, this could be an indication the model is biased towards one of the partitions. For example, if the gender feature is partitioned into two values (male and female), and the average fitness values of generated counterfactuals are lower for females than for males, this could be used as evidence that the model is not treating females fairly. Using counterfactuals and the distance function, we can calculate the overall burden for a group, measured as:
$$\text{Burden}_g = \mathbb{E}_{g}\left[d(x, c^*)\right] \tag{11}$$
where g is a partition defined by the distinct values for a specified feature set. Note, burden is related to CERScore as it is the expected value over a group. Most fairness auditing models focus on single features (e.g., [Hardt et al.2016, Donini et al.2018]). Burden, however, does not have that limitation and can be applied to any combination of features.
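Burden per group is a grouped mean of input-to-counterfactual distances, as the following minimal sketch shows. The function name and the example groups and distances are hypothetical; group keys are tuples so that any combination of features can define a partition.

```python
from collections import defaultdict
from statistics import mean


def burden_by_group(group_keys, distances):
    """Mean counterfactual distance for each group key."""
    buckets = defaultdict(list)
    for key, d in zip(group_keys, distances):
        buckets[key].append(d)
    return {key: mean(ds) for key, ds in buckets.items()}


# Hypothetical instances with an unfavorable outcome, keyed by (gender, age band).
group_keys = [("F", "under-40"), ("F", "under-40"), ("M", "under-40"), ("M", "under-40")]
distances = [1.0, 3.0, 0.5, 1.5]  # d(x, c*) for each instance, assumed precomputed

burden = burden_by_group(group_keys, distances)
```

In this toy data the first group carries twice the burden of the second, meaning its members would on average need larger changes to obtain the desired prediction.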
4 Experiments
We demonstrate the applications and flexibility of CERTIFAI to explainability, transparency, fairness, and robustness.
4.1 Robustness
In this section, we demonstrate how CERTIFAI produces adversarial examples, and we use CERScore from Section 3.3 to measure a network’s resistance to adversarial attacks.
4.1.1 Generating Adversarial Examples
We consider the MNIST dataset [LeCun1998], which contains 60,000 training images of digits (size 28x28). We use it to train a convolutional neural network consisting of one convolution layer, two dense layers, and intermediate pooling and dropout layers. We achieve 99.46% accuracy on the test set using this architecture. We then select random images from the MNIST dataset and, using the model above, find the counterfactual image for each image with the SSIM-based distance (Equation 8). Every pixel is considered to be a feature, and hence every individual in the population is a 784-dimensional vector. We use an initial population size of 30,000 individuals and run the experiment for 1,000 generations.
Model  CERScore  CI  CLEVER

Inception-v3  1.17  1.09-1.25  0.229
ResNet-50  1.06  1.05-1.08  0.137
MobileNet  1.08  1.06-1.09  0.151
Table 2: Robustness scores (CERScore), 95 percent confidence intervals (CI) for those scores, and the corresponding CLEVER scores for three deep learning models.
Data set  Num. obs.  Num. features  DT NCERS.  DT Acc.  SVM NCERS.  SVM Acc.  MLP NCERS.  MLP Acc.

Pima Diabetes  768  8  0.074  73.25  0.387  81.42  0.486  98.61
Breast Cancer  569  32  0.081  95.80  0.121  96.50  0.124  96.50
Iris  150  4  0.132  95.67  0.235  95.67  0.241  95.67
Table 3: Descriptions of data sets, with NCERScore (NCERS.) and test set accuracy in percent (Acc.) for three models: decision tree (DT), SVM with RBF kernel (SVM), and multilayer perceptron (MLP).
Figure 2 shows an example of ten generated counterfactuals (right) and their original counterpart images (left). The counterfactual images on the right look nearly identical to the input images on the left; however, the model predicts a different outcome for the images on the left and right. The imperceptibly different images give credence to the idea of using a genetic algorithm formulation to produce counterfactuals. Additionally, our approach to generating these images is model-agnostic and does not require any approximations, unlike [Carlini and Wagner2017]. The generated images show how easily a network can be fooled and demonstrate that there is a major problem in deploying such highly accurate networks in image-based decision-making applications (e.g. face recognition). Moreover, different kinds of adversarial attacks can be generated by simply changing the distance function in Equation 4.
4.1.2 Evaluating Deep Networks
In this section, we evaluate how well CERScore, introduced in Section 3.3, can give an informative measure of robustness. We consider the same networks as [Weng et al.2018], who define the CLEVER score for robustness: Inception-v3 [Szegedy et al.2016], ResNet-50 [He et al.2016], and MobileNet [Howard et al.2017], all pretrained on ImageNet [Deng et al.2009]. Unlike CLEVER, we consider the model to be a black box (relying only on its predictions). Ideally, to derive a measure of robustness for a model, all images from all classes should be considered, their counterfactuals should be generated, and the CERScore should then be calculated. However, since the number of training samples for a deep network is on the order of millions, it is not computationally feasible to calculate the score for each example. Hence, we consider a subset of classes and images to calculate the CERScore. We sample n=50 random images from each of k=100 random classes. We generate the counterfactuals for all 5,000 images such that the counterfactual gives a prediction of the second most likely class (by generating individuals constrained on belonging to that class as in Equation 6) and empirically estimate the CERScore as:
$$\text{CERScore} = \frac{1}{nk} \sum_{m=1}^{k} \sum_{l=1}^{n} d(x_{l,m}, c^*_{l,m}) \tag{12}$$
where $x_{l,m}$ is the $l$-th input instance belonging to predicted class $m$, and $c^*_{l,m}$ is the corresponding counterfactual. The CERScores are shown in Table 2. One way to interpret the score is that on average, the SSIM score for Inception-v3 is 1/1.17 = 0.85, where an SSIM score of 1 means the images look exactly the same and an SSIM score of 0 means the images are highly different. Hence, adversarial attacks for Inception-v3 could be more easily identified than for the other models. We also show the 95% confidence interval, where we have assumed that the distances between the images and their counterfactuals follow a normal distribution. The confidence intervals are tight around the CERScores.
We also compare CERScore with CLEVER scores [Weng et al.2018] for the same images, considering the top-2 class attack. The CLEVER scores are also reported in Table 2. The CERScore implies that Inception-v3 is most robust and ResNet-50 is least robust, which is similar to what the CLEVER scores suggest. Hence, even though CERTIFAI does not access any model weights, it is able to evaluate a model’s robustness to adversarial attacks.
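The empirical score and a normal-approximation confidence interval can be computed from the sampled distances as sketched below. The interval shown is the standard one for a sample mean (mean plus or minus 1.96 standard errors); whether the reported intervals were computed exactly this way is not stated, so treat that as an assumption. The function name and the distances are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def cer_score_with_ci(distances, z=1.96):
    """Sample-mean CERScore with a normal-approximation confidence interval."""
    score = mean(distances)
    half_width = z * stdev(distances) / sqrt(len(distances))
    return score, (score - half_width, score + half_width)


# Hypothetical input-to-counterfactual distances, pooled over sampled classes and images.
distances = [1.0, 1.2, 1.1, 1.3, 0.9]
score, (low, high) = cer_score_with_ci(distances)
```

With more sampled images the standard error shrinks, which is one reason a fixed sample such as 50 images from each of 100 classes can already give a tight interval.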
4.1.3 Robustness of Classic Classifiers
Next, we use NCERScore (Equation 10) to compare the robustness of different models trained on different data sets. We train three models (decision trees (DT), Support Vector Machines with RBF kernel (SVM), and multilayer perceptrons (MLP)) on the three data sets listed in Table 3. We report the NCERScore and the accuracy on the test set in Table 3. Across all data sets, the neural network has the highest NCERScore and is therefore the most robust of the classifiers for these data sets. In the Pima diabetes data set, the accuracy of the decision tree is much lower than that of the other models, which suggests this simple model cannot adequately capture the class separation. Hence, more points would be concentrated near the decision boundaries, resulting in a lower NCERScore.
For the Iris data set, while it is a relatively simple data set (even the decision tree performs well), the decision tree has the lowest NCERScore while the scores for SVM and MLP are similar. In Figure 3, we plot input points for two features of the Iris data set and the decision boundary for each model. Looking closely, the points for the decision tree are closer on average to the decision boundaries than for the other two models (i.e. the densely clustered white and red points are closer to decision boundaries), which suggests the model is more prone to being fooled. The decision boundaries for both SVM and MLP are nearly identical around the input points, which results in similar robustness scores. Similar results can be seen for the cancer data set. Using these results, a model developer can choose the most robust model based on the NCERScore.
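Person  Feature(s)  Original  Counterfactual

1  Glucose (CWC)  115  71
1  BMI (CUC)  35.3  10.1
2  Glucose (CWC)  168  89
2  Age (CUC)  34  44
Table 4: Features changed in counterfactuals generated with constraints (CWC) and without constraints (CUC) for two people in the Pima Indian diabetes data set.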
4.2 Explainability
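Person  Feature(s)  Original  Counterfactual

1  Education  12th  Bachelors
1  Occupation  Tech-support  Exec-managerial
1  Hrs-per-week  50  70
1  Workclass  Local-gov  Private
Table 5: Two sets of counterfactual explanations for the same person in the UCI adult data set. The first two rows form one explanation and the last two rows form the other.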
Counterfactuals are used to provide explanations and transparency to a user on how much change is needed for them to obtain a favorable prediction. We show the importance of using the constraints to improve explanations, the use of multiple counterfactual explanations for a single instance, and how these can also be used to estimate feature importance.
4.2.1 Datasets and Models
We use the Pima Indian dataset [Smith et al.1988] and the UCI adult dataset [Kohavi1996] in the following experiments. The Pima Indian dataset consists of 768 data samples and 8 features, where 6 features are continuous and integer-valued, and 2 features are continuous and float-valued. The task is to predict the risk of diabetes (1: at risk, 0: not at risk). We train a 4-layer neural network with an input layer, 2 hidden layers of 20 neurons each, and an output layer, using an 80-20 training-test split. The accuracy of the model is 99.6% on the test set. An initial population of 500 individuals is considered, and the evaluation is done across 300 generations. The UCI adult dataset consists of 48,842 samples with 14 categorical and continuous integer features and a binary outcome of predicted income (>50k or ≤50k). Since the dataset contains many categorical variables, finding a counterfactual using [Wachter et al.2017] would not be feasible. We train a 6-layer neural network with an input layer, 4 hidden layers of 80 neurons each, and an output layer, using an 80-20 training-test split. The accuracy of the model is 99.20% on the test set. Since the dataset is larger, an initial random population of 1,000 individuals is considered, and the evaluation is done across 500 generations. The negative outcome is considered to be income ≤50k, and we find the counterfactuals for those instances. We only consider those input instances where the model prediction matches the ground truth.
4.2.2 Importance of Constraints
We consider two cases of counterfactual generation, counterfactuals with constraints (CWC) and counterfactuals unconstrained (CUC), for users with a prediction of high diabetes risk. CWC corresponds to a user or model creator providing a range of values for features. CUC corresponds to a user only providing the black-box model and the input instance without any constraints on the feature values. We show only the features whose values have changed between the input and the counterfactual; all other values remained constant.
As shown in Table 4, for person 1, when we provide constraints (CWC), the explanation is: Had your glucose been less by 34, you wouldn’t have been at risk of diabetes. All other feature values for the user remained constant. Without constraints, the explanation shows that the BMI would have to be decreased to 10.1. While this is a smaller change in magnitude than changing the glucose level, achieving a BMI of 10.1 is not feasible, and hence it is important to use the flexibility of our approach to add constraints that ensure feasibility. Similarly, for person 2, the unconstrained counterfactual suggests changing the age, which is not feasible.
4.2.3 Measuring feature importance
From a model developer’s perspective, counterfactuals can show the importance of every feature value to the prediction and hence provide transparency. If CERTIFAI changes a particular feature more often than another feature when comparing the input and counterfactual, that feature is more significant for the model. For the Pima Indian diabetes dataset, we generate counterfactuals for all samples (irrespective of prediction) and analyze the number of times every feature value has changed, as shown in Figure 4. Interestingly, the importances are qualitatively similar to those returned by Python’s XGBoost [Chen and Guestrin2016] library (also shown in Figure 4). Specifically, feature 5 (BMI) and feature 2 (Glucose) are the most important in predicting diabetes risk. This analysis can be extended to the multi-class case by constraining sampled individuals such that they belong to a desired class (Equation 6).
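Counting how often each feature differs between an input and its counterfactual is straightforward, as in this minimal sketch. The function name and the three example instances are hypothetical and are not taken from the experiments.

```python
from collections import Counter


def feature_change_counts(inputs, counterfactuals, feature_names):
    """Count, per feature, how many counterfactuals differ from their input."""
    counts = Counter()
    for x, c in zip(inputs, counterfactuals):
        for name, before, after in zip(feature_names, x, c):
            if before != after:
                counts[name] += 1
    return counts


feature_names = ["Glucose", "BMI", "Age"]
inputs = [[120, 31.0, 40], [160, 30.0, 35], [150, 28.0, 51]]
counterfactuals = [[95, 31.0, 40], [110, 30.0, 35], [150, 24.0, 51]]

counts = feature_change_counts(inputs, counterfactuals, feature_names)
ranking = [name for name, _ in counts.most_common()]
```

Features that the search changes most often rank first; a feature that never changes simply has a count of zero.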
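Person  Feature  FitnessM  FitnessU

1  Race  0.63  0.87
2  Gender  0.41  0.62
3  Race  0.81  0.81
Table 6: Fitness values for three instances when the sensitive feature is muted (FitnessM) and unmuted (FitnessU).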
4.2.4 Multiple counterfactual explanations
Multiple explanations are helpful to a user so that they can receive a diverse set of changes that could be made to achieve a desired outcome. The UCI adult dataset (CWC case) is considered; features such as native-country are muted, and a set range is given for features like hours-per-week. We run the genetic algorithm for the input instance and select the best two individuals that have different changes in feature indices. The advantage of our approach is that we only need to run the algorithm once to generate many explanations, as opposed to [Russell2019], where the IP solver needs to be run multiple times to generate multiple explanations.
To underscore the benefits of suggesting alternative counterfactuals, Table 5 shows two sets of explanations that are generated by CERTIFAI for the same person. Multiple explanations, the number of which is set by the user, allow a user to decide which counterfactual may be the most actionable.
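Selecting several explanations from a single run can be sketched as picking the closest individuals whose sets of changed feature indices differ. The function name, the L1 distance, and the candidate points are assumptions for illustration; in practice the candidates would be the final population of one genetic run.

```python
def diverse_top(x, candidates, distance, m=2):
    """Pick the m closest candidates that each change a different set of features."""
    chosen, seen = [], set()
    for c in sorted(candidates, key=lambda cand: distance(x, cand)):
        changed = frozenset(i for i, (a, b) in enumerate(zip(x, c)) if a != b)
        if changed and changed not in seen:
            seen.add(changed)
            chosen.append(c)
        if len(chosen) == m:
            break
    return chosen


def l1(a, b):
    return sum(abs(p - q) for p, q in zip(a, b))


x = [10.0, 50.0, 3.0]
candidates = [            # hypothetical final population from one run
    [13.0, 50.0, 3.0],    # changes feature 0, distance 3
    [12.0, 50.0, 3.0],    # changes feature 0, distance 2
    [10.0, 55.0, 3.0],    # changes feature 1, distance 5
    [12.0, 52.0, 3.0],    # changes features 0 and 1, distance 4
]
top_two = diverse_top(x, candidates, l1, m=2)
top_three = diverse_top(x, candidates, l1, m=3)
```

The second-closest candidate is skipped because it changes the same feature as the closest one, so the user sees genuinely different options.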
4.3 Fairness
We evaluate fairness from an individual’s perspective and from a model developer’s perspective. To see if the model is unfair towards any instance, we consider 100 random instances of the UCI adult dataset where the prediction was unfavorable and run the algorithm twice, once when the sensitive attribute is not allowed to change and once when it is, and record the fitness values. We do this for two sensitive attributes, race and gender.
The results for three such instances are shown in Table 6. FitnessM refers to the fitness value when the sensitive feature (race or gender) is muted for an individual, and FitnessU corresponds to the feature being unmuted. The fitness for the first two people increases substantially when these protected features are allowed to change, and hence for these instances there is evidence that the model has not been fair. For the third person, the evidence suggests that the model has been fair.
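The comparison in Table 6 amounts to checking whether fitness rises when the protected feature is unmuted. A minimal sketch using the three rows of Table 6 follows; the `min_gain` threshold is a hypothetical choice, since no cut-off for a substantial increase is specified.

```python
def flag_possible_unfairness(records, min_gain=0.05):
    """Return the people whose fitness rises by more than min_gain when unmuted."""
    return [person for person, _feature, fit_muted, fit_unmuted in records
            if fit_unmuted - fit_muted > min_gain]


# Rows of Table 6: (person, sensitive feature, FitnessM, FitnessU).
records = [
    (1, "Race", 0.63, 0.87),
    (2, "Gender", 0.41, 0.62),
    (3, "Race", 0.81, 0.81),
]
flagged = flag_possible_unfairness(records)
```

Since fitness is the reciprocal of distance, a higher FitnessU means a closer counterfactual becomes available only by changing the protected attribute.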
A model developer can use the idea of burden (Equation 11) to evaluate how fair a model is to groups of individuals. To demonstrate the idea of burden, we consider the attribute race in the UCI adult dataset and take all training examples that have an unfavorable outcome. Results of our experiments are shown in Figure 5. As we can see, the burden on the Black and Other race groups is higher than on the other races. This means that on average, these groups would have to make more changes to achieve a desired prediction than the others. Hence the model imposes a burden on these groups, which could imply that the model has been unfair.
5 Conclusion and Future Work
In this paper, we introduced CERTIFAI, a model-agnostic, flexible, and user-friendly technique that helps build responsible artificial intelligence systems. We demonstrated the flexibility that the genetic algorithm brings in providing feasible counterfactual explanations to a user. We showed how individual and group fairness can be measured using the fitness values obtained during counterfactual generation. Finally, we showed that these counterfactuals are effective adversarial examples, and we defined CERScore, which to the best of our knowledge is the first measure of robustness for a black-box model. We are currently developing a user interface for CERTIFAI. Future work involves speeding up the genetic algorithm using techniques like those of [Harik et al.1999] and [Mitchell et al.1994]. A comparison between the introduced fairness metric and previous metrics would also be useful. It would also be interesting to see how our adversarial examples perform against strategies [Papernot et al.2016], [Madry et al.2017] that aim to defend against adversarial attacks.
References

[Akhtar and Mian2018] Naveed Akhtar and Ajmal Mian. Threat of adversarial attacks on deep learning in computer vision: A survey. IEEE Access, 6:14410–14430, 2018.
[Binns2017] Reuben Binns. Fairness in machine learning: Lessons from political philosophy. arXiv preprint arXiv:1712.03586, 2017.
[Carlini and Wagner2017] Nicholas Carlini and David Wagner. Towards evaluating the robustness of neural networks, 2017.
[Chen and Guestrin2016] Tianqi Chen and Carlos Guestrin. XGBoost: A scalable tree boosting system, 2016.
[Deng et al.2009] Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. ImageNet: A large-scale hierarchical image database, 2009.
[Donini et al.2018] Michele Donini, Luca Oneto, Shai Ben-David, John S Shawe-Taylor, and Massimiliano Pontil. Empirical risk minimization under fairness constraints, 2018.
[Guidotti et al.2018a] Riccardo Guidotti, Anna Monreale, Salvatore Ruggieri, Dino Pedreschi, Franco Turini, and Fosca Giannotti. Local rule-based explanations of black box decision systems. arXiv preprint arXiv:1805.10820, 2018.
[Guidotti et al.2018b] Riccardo Guidotti, Anna Monreale, Salvatore Ruggieri, Franco Turini, Fosca Giannotti, and Dino Pedreschi. A survey of methods for explaining black box models. ACM Computing Surveys (CSUR), 51(5):93, 2018.
[Hardt et al.2016] Moritz Hardt, Eric Price, Nati Srebro, et al. Equality of opportunity in supervised learning, 2016.
[Harik et al.1999] Georges R Harik, Fernando G Lobo, and David E Goldberg. The compact genetic algorithm. IEEE Transactions on Evolutionary Computation, 3(4):287–297, 1999.
[He et al.2016] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition, 2016.
[Howard et al.2017] Andrew G Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, Marco Andreetto, and Hartwig Adam. MobileNets: Efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861, 2017.
[Kohavi1996] Ron Kohavi. Scaling up the accuracy of naive-Bayes classifiers: A decision-tree hybrid, 1996.
[LeCun1998] Yann LeCun. The MNIST database of handwritten digits. http://yann.lecun.com/exdb/mnist/, 1998.
[Madry et al.2017] Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. Towards deep learning models resistant to adversarial attacks. arXiv preprint arXiv:1706.06083, 2017.
[Mitchell et al.1994] Melanie Mitchell, John H Holland, and Stephanie Forrest. When will a genetic algorithm outperform hill climbing, 1994.
[Nguyen et al.2015] Anh Nguyen, Jason Yosinski, and Jeff Clune. Deep neural networks are easily fooled: High confidence predictions for unrecognizable images, 2015.
[Papernot et al.2016] Nicolas Papernot, Patrick McDaniel, Xi Wu, Somesh Jha, and Ananthram Swami. Distillation as a defense to adversarial perturbations against deep neural networks, 2016.
[Ribeiro et al.2016] Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. Why should I trust you?: Explaining the predictions of any classifier, 2016.
[Russell2019] Chris Russell. Efficient search for diverse coherent explanations, 2019.
[Smith et al.1988] Jack W Smith, JE Everhart, WC Dickson, WC Knowler, and RS Johannes. Using the ADAP learning algorithm to forecast the onset of diabetes mellitus, 1988.
[Szegedy et al.2016] Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jon Shlens, and Zbigniew Wojna. Rethinking the inception architecture for computer vision, 2016.
[Ustun et al.2019] Berk Ustun, Alexander Spangher, and Yang Liu. Actionable recourse in linear classification, 2019.
[Wachter et al.2017] Sandra Wachter, Brent Mittelstadt, and Chris Russell. Counterfactual explanations without opening the black box: Automated decisions and the GDPR. Harvard Journal of Law & Technology, 31(2):2018, 2017.
[Wang et al.2002] Zhou Wang, Alan C Bovik, and Ligang Lu. Why is image quality assessment so difficult?, 2002.
[Wang et al.2003] Zhou Wang, Eero P Simoncelli, and Alan C Bovik. Multiscale structural similarity for image quality assessment, 2003.
[Weng et al.2018] Tsui-Wei Weng, Huan Zhang, Pin-Yu Chen, Jinfeng Yi, Dong Su, Yupeng Gao, Cho-Jui Hsieh, and Luca Daniel. Evaluating the robustness of neural networks: An extreme value theory approach. arXiv preprint arXiv:1801.10578, 2018.