In recent years the interest in transparent, interpretable and explainable models in machine learning has grown dramatically, with dedicated workshops at NIPS 2016 (Wilson et al., 2016), NIPS 2017 (Tosi et al., 2017; Wilson et al., 2017) and ICML 2017 (Varshney et al., 2017), as well as attention from grant agencies (Gunning, 2016).
The approaches to interpretable models go in several distinct directions – producing sparse models (Hara & Maehara, 2016; Wisdom et al., 2016; Hayete et al., 2016; Tansey et al., 2017), visualization techniques (Smilkov et al., 2016; Selvaraju et al., 2016; Thiagarajan et al., 2016; Gallego-Ortiz & Martel, 2016; Krause et al., 2016; Zrihem et al., 2016; Handler et al., 2016), hybrid models (Krakovna & Doshi-Velez, 2016; Reing et al., 2016), input data segmentation (Samek et al., 2016; Hechtlinger, 2016; Thiagarajan et al., 2016), and model diagnostics with or without blackbox interpretation layers (Lundberg & Lee, 2016; Vidovic et al., 2016; Whitmore et al., 2016; Ribeiro et al., 2016b; Singh et al., 2016; Phillips et al., 2017; Ribeiro et al., 2016a, c) to name a few prominent directions.
In this paper, we present a method, Fibres of Failure, that draws on topological data analysis to produce model diagnostics through a classification of prediction failure modes in feature space. Our method relates to both the input data segmentation and the model diagnostics directions of research by finding and classifying input regions that behave unexpectedly or erroneously compared to what the model is designed to predict.
Noisy inputs and adversarial learning have been used to motivate and to generate examples and insights for interpretability (Kindermans et al., 2016). We will use the same basic idea to illustrate our method, by studying prediction failures on MNIST images with added noise.
2 Related work
One interpretability method with a large impact on the field, LIME (Ribeiro et al., 2016c), inspects single instances by perturbing the input and tracing how predictions change with the perturbation. Other interpretability methods focus more closely on aggregates of inputs, such as TreeView (Thiagarajan et al., 2016), which visualizes deep neural networks by first clustering neurons by activation patterns, then clustering these groups by prediction labels, and finally training a predictor to predict the meta-clusters from the input data directly.
The FiFa method builds on Mapper, an algorithm from Topological Data Analysis that constructs a graph (or simplicial complex) model of arbitrary data. Mapper has had success in a wide range of application areas, from medical research studying cancer, diabetes, asthma and many more topics (Nicolau et al., 2011; Li et al., 2015; Hinks et al., 2016; Schneider et al., 2016), genetics and phenotype studies (Romano et al., 2014; Carlsson, 2017; Cámara, 2017; Savir et al., 2017; Bowman et al., 2008), to hyperspectral imaging, material science, sports and politics (Duponchel, 2018a, b; Lee et al., 2017; Lum et al., 2013). Of particular note for our approach are the contributions on cancer, diabetes and fragile X syndrome (Nicolau et al., 2011; Romano et al., 2014; Li et al., 2015), where Mapper was used to extract new subgroups from a segmentation of the input space.
Our results build on two fundamental concepts: viewing predictive models as functions and therefore usable as input to Mapper, and the Mapper technique for producing intrinsic graph models of arbitrary data sets.
As a running illustration in this paper we will be looking at how a CNN trained on the MNIST dataset fails when encountering noisy images derived from MNIST. The influence of noise on learning algorithm performance has been studied before: both Zhou et al. (2017) and Dodge & Karam (2016) found a dramatic increase in error rates with increased image distortion, confirming our choice of illustrative test case. Adversarial learning is another method that has proven successful at deteriorating performance for trained networks (Cisse et al., 2017a; Yuan et al., 2018; Moosavi-Dezfooli et al., 2016; Chen et al., 2018).
3 Proposed method
The proposed method, Fibres of Failure (FiFa), takes a different approach from the related work. We do not modify the deep neural network model itself; rather, we create classifiers on top of the model that recognize specific types of faulty predictions (failure modes) from a deep learning model trained to recognize MNIST images.
Mapper (Singh et al., 2007) is an algorithm that constructs a graph (more generally a simplicial complex) model for a point cloud data set. The graph is constructed systematically from some well defined input data. It was defined in (Singh et al., 2007), and has been shown to have great utility in the study of various kinds of data sets (as described in Section 2). It can be viewed as a method of unsupervised analysis of data, in the same way as principal component analysis, multidimensional scaling, and projection pursuit, but it is more flexible than any of these methods. Comparisons of the method with standard methods in the context of hyperspectral imaging have been documented in (Duponchel, 2018a, b).
In topological language, Mapper starts with the choice of a collection of continuous filter functions and an open cover over their range. The fibres, or preimages of this open cover produces an open cover on the data space, which can be refined using connected components. Doing this with a fine enough cover and non-degenerate filter functions produces a good cover in the sense of the nerve lemma (Hatcher, 2002), so the nerve complex is homotopy equivalent with the data source.
An open cover here is almost, but not quite, the same thing as a partition. In order to track connectivity information, the parts cannot be strictly disjoint: a strict partition would miss connections between adjacent parts of the space and introduce artificial disconnects. The open cover most cleanly translates into a "fattened" partition, that is, a partition with overlaps between adjacent parts.
In more detail, and using a more data-focused and less topological description, Mapper proceeds by the following steps. We let X (the dataset) be a finite metric space.

Select arbitrary functions f_1, ..., f_k : X → R. We call these filter functions and they encode a separation of data points. In practice, the number k is usually 1 or 2. Common filter functions are statistically meaningful quantities such as the values of a density estimator or centrality measure, or outputs from a machine learning algorithm such as PCA or MDS, or a variable used in defining the data set.

For each of the functions, pick parameters to produce an overlapping partition of R: a number of intervals N_j and a proportion of overlap p_j.

For each function f_j, let min_j and max_j denote the minimum and maximum values taken by f_j, and construct an open cover of the interval [min_j, max_j] by introducing subintervals I_{j,s} = (a_{j,s}, b_{j,s}) for 0 ≤ s < N_j, where a_{j,s} = min_j + (s − p_j)·ℓ_j and b_{j,s} = min_j + (s + 1 + p_j)·ℓ_j, with ℓ_j = (max_j − min_j)/N_j.

Construct a (likely overlapping) partition of X by letting each preimage f_j^{-1}(I_{j,s}) with 0 ≤ s < N_j be a part in the partition.

Construct a graph by setting the vertices to the parts and connecting parts U and V with an edge precisely when U ∩ V ≠ ∅. For the simplicial complex version, vertices are connected when their joint intersection is non-empty.
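The steps above can be sketched in code. The following is a minimal one-filter sketch (not the implementation used in this paper) under simplifying assumptions: data points are real numbers, and the connected-components refinement is approximated by single-linkage clustering with a fixed gap threshold; `mapper_1d` and its parameters are illustrative names.

```python
import numpy as np

def mapper_1d(X, filt, n_intervals=10, overlap=0.25, cluster_gap=0.5):
    """Sketch of 1-D Mapper: cover the filter range with overlapping
    intervals, split each preimage into clusters, and connect clusters
    that share data points."""
    f = np.asarray([filt(x) for x in X])
    lo, hi = f.min(), f.max()
    length = (hi - lo) / n_intervals
    clusters = []  # each cluster is a frozenset of point indices
    for s in range(n_intervals):
        a = lo + (s - overlap) * length        # a_{j,s}
        b = lo + (s + 1 + overlap) * length    # b_{j,s}
        idx = np.where((f >= a) & (f <= b))[0]
        if idx.size == 0:
            continue
        # naive single-linkage refinement: break a part where two
        # consecutive sorted points are farther apart than cluster_gap
        order = idx[np.argsort(X[idx])]
        part = [order[0]]
        for i in order[1:]:
            if abs(X[i] - X[part[-1]]) > cluster_gap:
                clusters.append(frozenset(part))
                part = []
            part.append(i)
        clusters.append(frozenset(part))
    # one edge per pair of clusters with non-empty intersection
    edges = {(i, j) for i in range(len(clusters))
             for j in range(i + 1, len(clusters))
             if clusters[i] & clusters[j]}
    return clusters, edges
```

On two well-separated blobs with the identity filter, the resulting graph splits into two connected components, one per blob.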
Mapper has several implementations available: Python Mapper (Müllner & Babu, 2013), Kepler Mapper (Saul & van Veen, 2017) and TDAmapper (http://cran.r-project.org/web/packages/TDAmapper) are all open source, while Ayasdi Inc. (http://ayasdi.com) provides a commercial implementation of the algorithm. For our work we use the Ayasdi implementation of Mapper.
3.1.1 Mapper on prediction failure
The filters in Mapper ensure that features of the data separated by the filter functions end up separated in the resulting model. Step one of FiFa specifically uses a Mapper analysis with prediction error as one of the filter functions. By including prediction error this way, the FiFa algorithm guarantees that any groups that are extracted are homogeneous with respect to prediction failure, and thus usable as a failure mode designation.
We name a Mapper model with prediction failure as a filter a FiFa model.
3.2 Extract subgroups
Subgroups of the FiFa model with tight connectivity in the graph structure and with homogeneous and large average prediction failure per component cluster provide a classification of failure modes. These can be selected either manually or using a community detection algorithm.
When selecting failure modes manually, a visualization such as in Figure 2 is most helpful. Here, flares (tightly connected subgraphs emanating from a core, such as Group 40) or tightly connected components, loosely connected to surrounding parts of the graph, are the most compelling characterizations of a good failure mode subgroup.
3.3 Quantitative: model correction layer
Once failure modes have been identified, one way to use the identification is to add a correction layer to the predictive process. Use a classifier to recognize input data similar to a known failure mode, and adjust the predictive process output according to the behavior of the failure mode in available training data.
3.3.1 Train classifiers
For our illustrative examples, we demonstrate several “one vs rest” binary classifier ensembles where each classifier is trained to recognize one of the failure modes (extracted subgroups) from the Mapper graph. We demonstrate performance of FiFa
for model correction using Linear SVM, Logistic Regression, and Naïve Bayes classifiers.
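As a sketch of the one-vs-rest scheme, the following uses a least-squares linear scorer standing in for the Linear SVM, Logistic Regression and Naïve Bayes classifiers actually used; `train_one_vs_rest` and `assign_group` are illustrative names, not the paper's tooling.

```python
import numpy as np

def train_one_vs_rest(X, group_labels):
    """One classifier per failure mode: each is a least-squares linear
    scorer trained on 'this group vs everything else'."""
    Xb = np.hstack([X, np.ones((len(X), 1))])  # append bias column
    models = {}
    for g in np.unique(group_labels):
        y = np.where(group_labels == g, 1.0, -1.0)
        w, *_ = np.linalg.lstsq(Xb, y, rcond=None)
        models[g] = w
    return models

def assign_group(models, x):
    """Return the failure mode whose classifier scores highest, or None
    if no classifier fires (best score <= 0)."""
    xb = np.append(x, 1.0)
    scores = {g: w @ xb for g, w in models.items()}
    g = max(scores, key=scores.get)
    return g if scores[g] > 0 else None
```

New data assigned to no group is left untouched by the downstream correction layer.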
3.3.2 Evaluate bias
A classifier trained on a failure mode may well capture larger parts of test data than expected. As long as the space identified as a failure mode has consistent bias, it remains useful for model correction: by evaluating the bias in data captured by a failure mode classifier we can calibrate the correction layer.
3.3.3 Adjust model
The actual correction on new data is a type of ensemble model, and has flexibility on how to reconcile the bias prediction with the original model prediction – or even how to reconcile several bias predictions with each other. For our example in this paper we choose to override the CNN prediction with the observed ground truth in the failure mode from the training data used to create the classifier. For regression tasks we have also used the average of the failure mode training group as an offset to subtract from the model prediction.
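Both correction variants can be sketched as follows, assuming a `failure_classifier` that returns a group id or None, together with lookup tables built from training data; all names here are illustrative.

```python
def corrected_prediction(model_pred, x, failure_classifier, group_truth):
    """Classification variant: if x is recognized as belonging to a known
    failure mode, override the model's prediction with that mode's
    consistent ground truth label from training data."""
    g = failure_classifier(x)
    return model_pred if g is None else group_truth[g]

def corrected_regression(model_pred, x, failure_classifier, group_bias):
    """Regression variant: subtract the failure mode's mean prediction
    error (its bias) from the model output."""
    g = failure_classifier(x)
    return model_pred if g is None else model_pred - group_bias[g]
```

In both cases inputs that match no failure mode pass through unchanged, so the correction layer only acts where a bias has been observed.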
3.4 Qualitative: model inspection
Identifying distinct failure modes and giving examples of these is valuable for model inspection and debugging. Statistical methods, such as Kolmogorov-Smirnov testing, can provide measures of how influential any one feature is in distinguishing one group from another and can give notions of what characterizes any one failure mode from other parts of input space. With examples and distinguishing features in hand, we can go back to the original model design and evaluate how to adapt the model to handle the failure modes better.
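As one way this could look in code, the following numpy sketch ranks features by the two-sample Kolmogorov-Smirnov statistic between a failure mode and the rest of the data; the helper names are illustrative and this is not the paper's tooling.

```python
import numpy as np

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum gap between
    the two empirical CDFs, evaluated at every sample point."""
    a, b = np.sort(a), np.sort(b)
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side="right") / len(a)
    cdf_b = np.searchsorted(b, grid, side="right") / len(b)
    return np.abs(cdf_a - cdf_b).max()

def distinguishing_features(group, rest, top=5):
    """Rank feature columns by how strongly they separate the failure
    mode `group` from the remaining data `rest`."""
    ks = np.array([ks_statistic(group[:, j], rest[:, j])
                   for j in range(group.shape[1])])
    order = np.argsort(ks)[::-1][:top]
    return list(zip(order.tolist(), ks[order].tolist()))
```

A feature whose distribution is shifted inside the failure mode gets a KS value near 1 and surfaces at the top of the ranking.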
Much of the work in interpretability for machine learning provides tools to inspect examples, and for providing a model explanation for a specific example. These work well in conjunction with FiFa to find explanations for the identified failure modes.
In order to evaluate the FiFa method we have trained a CNN classifier on the MNIST data set, created prediction failures by adding noise to the data, and gone through the FiFa pipeline for the resulting erroneous predictions. With distinct failure modes extracted, we then illustrate both a quantitative and a qualitative approach to handling the output from FiFa. On the one hand, we adjust predictions using classifiers trained to recognize each failure mode and measure the improvement in classification of the resulting ensemble approach; on the other hand, we compare several failure modes that misclassify versions of the same digit (the digit 5) in different ways.
We created a CNN model with the topology shown in Figure 1. The network topology and parameters were chosen arbitrarily, with the only condition that the model performs well on the original MNIST data set. The activation function was softmax for the classification layer and ReLU for all other layers. The optimizer was Adadelta. We trained the model on 60,000 clean MNIST training images through 12 epochs and tested it on 10,000 clean MNIST images. The accuracy on the test set of 10,000 clean MNIST images was 99.05%. We created 10,000 corrupt MNIST images using 25% random binary flips on the clean test images. The accuracy on the corrupt MNIST images was 40.45%.
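The corruption step can be sketched as follows, under the assumption that the images have first been binarized to {0, 1} pixel values (the paper's exact corruption code is not reproduced here; `binary_flip` is an illustrative name).

```python
import numpy as np

def binary_flip(images, flip_prob=0.25, seed=0):
    """Corrupt binarized images by flipping each pixel independently
    with probability `flip_prob` (the 25% binary-flip noise used for
    the corrupt MNIST test set).  `images` holds values in {0, 1}."""
    rng = np.random.default_rng(seed)
    mask = rng.random(images.shape) < flip_prob
    return np.where(mask, 1 - images, images)
```

With `flip_prob=0.25`, roughly a quarter of all pixels are inverted, which is enough to drop the CNN accuracy drastically.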
To create the Mapper graph we used the following:
Filters: Principal Component 1, probability of the Predicted digit, probability of the Ground truth digit, and the Ground truth digit itself. Our measure of predictive error is the probability of the Ground truth digit. By including the Ground truth digit itself we separate the model on ground truth, guaranteeing that any one failure mode has a consistent ground truth that can be used for corrections.
Metric: Variance Normalized Euclidean
Variables: 9472 network activations, namely all activations after the Dropout layer that finishes the convolutional part of the network and before the softmax layer that provides the final predictions. These are the layers with 9216, 128 and 128 nodes displayed in Figure 1.
Instances: We randomly shuffled the data from the 10,000 clean and 10,000 corrupt images that were used to test the CNN model, and split the 20,000 instances into 5 training sets of size 16,000 each and 5 test sets of size 4,000 each. The training sets were used to create 5 Mapper graphs. This allows us to perform 5-fold cross validation on the classifiers.
The use of probabilities for predicted and ground truth digit as filters guarantees that Mapper separates regions of correct predictions from those of wrong predictions. After all, these probabilities are measures of error for the CNN model. We purposely omitted the activations from the Dense-10 layer as input variables because of the direct reference to the probabilities for both the ground truth digit and the predicted digit.
The following variables were included in the analysis but were not used to create the FiFa model:
10 activations from the Dense-10 layer, which consists of the probabilities for each digit, 0-9.
784 pixel values representing the flattened MNIST image of size 28x28x1.
6 variables: prediction by the CNN model, ground truth digit, corrupt or original data (binary), correct or incorrect prediction(binary), probability of the Predicted digit (highest value of the Dense-10 layer), and probability of ground truth digit.
Hence, the total number of variables in our analysis was 10,272.
To extract failure modes from the FiFa model we used a supervised community detection method to find groups of approximately constant prediction error. In the Mapper implementation we are using, a grouping method based on Agglomerative Hierarchical Clustering (AHCL) (Edwards & Cavalli-Sforza, 1965; Murtagh & Contreras, 2012) and Louvain modularity (Blondel et al., 2008) is included. As supervision, a function on the data is chosen; for FiFa, we choose the measure of prediction error. The difference in means of the supervision function produces a graph edge weighting: edges are weighted as "strong" if their endpoints have similar supervision function values, and "weak" if the supervision function values differ. With the graph weighting in place, hierarchical clustering produces a clustering tree, using the weighted edges to generate a graph metric to cluster over. Finally, Louvain modularity identifies an optimal graph partition from the clustering tree.
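The implementation we use is commercial, but the weighting-and-merging idea can be sketched with a simplified stand-in: merge Mapper graph nodes across "strong" edges, i.e. edges whose endpoints have similar mean prediction error. The tolerance `tol` is an illustrative parameter replacing the hierarchical clustering and Louvain steps.

```python
class UnionFind:
    """Minimal disjoint-set structure for merging graph nodes."""
    def __init__(self, n):
        self.parent = list(range(n))
    def find(self, i):
        while self.parent[i] != i:
            self.parent[i] = self.parent[self.parent[i]]  # path halving
            i = self.parent[i]
        return i
    def union(self, i, j):
        self.parent[self.find(i)] = self.find(j)

def error_homogeneous_groups(n_nodes, edges, mean_error, tol=0.1):
    """Partition a Mapper graph into groups of roughly constant
    prediction error: an edge is 'strong' (and merged across) when the
    difference in mean error between its endpoints is below `tol`."""
    uf = UnionFind(n_nodes)
    for i, j in edges:
        if abs(mean_error[i] - mean_error[j]) < tol:
            uf.union(i, j)
    groups = {}
    for v in range(n_nodes):
        groups.setdefault(uf.find(v), []).append(v)
    return list(groups.values())
```

On a chain of nodes whose mean error jumps partway along, the partition splits exactly at the jump.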
From partitioned groups, we retain as failure modes those groups that have at least 15 data points and have less than 99.05% correct predictions, which is the accuracy of the CNN model on the original MNIST test data.
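This retention rule is direct to state in code (a sketch, with `correct` an illustrative 0/1 indicator per data point):

```python
def select_failure_modes(groups, correct, min_size=15, baseline_acc=0.9905):
    """Keep groups with at least `min_size` points whose accuracy falls
    below the clean-test baseline (99.05% in our experiment)."""
    kept = []
    for g in groups:
        if len(g) < min_size:
            continue
        acc = sum(correct[i] for i in g) / len(g)
        if acc < baseline_acc:
            kept.append(g)
    return kept
```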
We trained classifiers in a one vs. rest scheme on each group in the 5 folds of data that were used to create the 5 Mapper graphs. We used the following types of classifiers, varying the parameters for each:
Linear SVM. Loss function: squared hinge; penalty function and regularization parameter C varied.
Logistic Regression. Penalty function and regularization parameter C varied.
Naïve Bayes. Gaussian Naïve Bayes using class priors for each group in the training data set.
We used the parameters from each best performing classifier to train new models. This time, we evaluated each model on a second test data set, called 'Corrupt', which consisted of 10,000 new corrupt images created with 25% binary flips on the original MNIST test data set. Hence, we used the same noise setup as for the corrupt images used to test the CNN model.
For the test data sets, we evaluated to what extent each classifier predicted member points with the same ground truth digit as that of the group the model was trained on. As we trained the classifiers on groups containing many wrong predictions, we expect the classifiers to capture member points with wrong predictions on the test data sets. Hence, we offset the predicted digits with the ground truth digit of the group each classifier was trained on. We attempt to exploit the consistent bias of the classifiers to improve the accuracy of the combined CNN and classifier ensemble.
4.2 Quantitative Results
The following parameters were chosen for the three classifiers we evaluated as model correction layers:
Linear SVM (parameters chosen for highest accuracy in a 5-fold cross-validation).
Logistic Regression (parameters chosen for highest accuracy in a 5-fold cross-validation).
Naïve Bayes: Gaussian Naïve Bayes using priors induced from the data.
The average number of data points across all failure mode groups in the 5 folds was 4,937 of the total 16,000. The average number of clean data points in all groups in the 5 folds was 10.4, accounting for 0.21% of the 4,937 data points. This also means that the failure mode groups encompass roughly 62% of all corrupt data points in the training set. The number of failure modes (extracted subgroups) in each fold was 41, 41, 41, 41, and 37, respectively.
Table 1 shows the accuracy on the two test data sets using the CNN with and without FiFa. The linear-SVM classifiers performed best on both data sets, with an improvement of 6.43%pt on the 5-fold cross validation test sets and 19.33%pt on the 'Corrupt' data.
4.3 Qualitative Results
For the qualitative analysis, we chose to focus on four groups with digit 5 as the ground truth digit: Group 50, which is not one of the failure mode groups, and Groups 30, 40, and 47, which are all among the 39 failure mode groups. The locations of each group are shown in Figure 2. The distribution of predicted probabilities for each label is shown in Figure 4: group 30 is the group with the highest probability for the digit 5, while 40 and 47 are more focused on 8, 2, and 3. All three groups favor digit 8, as their mean probabilities for it are between 0.5 and 0.9.
We compared these three failure modes with the non-failure Group 50 and extracted the 5 activations with the highest KS-values from the Dense-128 layer. See Figure 1. To illustrate the differences between the three failure modes regarding the activations, we have provided a selection of saliency maps (Simonyan et al., 2013) for all images considered as true members of each of the three failure mode groups. These were all produced using the keras-vis Python package.
Figure 3 shows a selection of noisy images and their saliency maps for some of the activations with the highest KS-values within the Dense-128 layer. The two leftmost image pairs were selected for their visually clear saliency maps with respect to digits; the two rightmost were selected for their most unclear/noisy saliency maps. The full collection of saliency maps for these groups can be found in our supplemental material.
The activations 24 and 81, present in all three groups, display activity that is consistent with an activation detecting features of the digit 5, while the activations 89 and 99 correspond more closely to an activation for the digit 3, and 119, 122 and 124 correspond to activations for the digit 8. In particular, in the latter groups, noise that closes loops in a written 5 tends to have high saliency.
In Table 2 we show the percentage of blank saliency maps, indicating that an activation is missing completely for a particular input.
5 Discussion and Conclusion
For the quantitative approach to handling failure modes we saw significant improvement even using quite simplistic classifiers for constructing a correction layer: an increase of almost 20%pt, while avoiding corrections on almost all uncorrupted images, was achieved with both linear separation methods, logistic regression and the linear SVM.
On the qualitative side, an inspection of the saliency maps (see Figure 3 for a selection of particularly illustrative maps, and the supplementary material for a full collection) showed us that the failure mode groups were distinguished from group 50, which contains correctly predicted digits 5, either by an activation tuned to detecting 5s, or by an activation that looked for closing loops and found them in the added noise. Blank saliency maps were common for the 5-detecting neurons, as can be seen in Table 2: overwhelmingly so for groups 40 and 47, where correct predictions were rare, and much less commonly in group 30, where, as can be seen in Figure 4, a correct prediction still came with significant strength in the softmax layer.
Using FiFa on a CNN-based MNIST digit classifier that had to cope with severely corrupted MNIST images, we were able to find 39 distinct failure modes based on activations in the antepenultimate and penultimate layers of our CNN model. When inspecting the digit 5 in particular, we found that the three identified failure modes could be distinguished from the well-behaved parts of input space by specific activations that seemed to code for features corresponding closely to the kinds of misclassifications that were observed.
In addition to inspecting examples, we explored the addition of a correction layer to the CNN model. The failure modes act as seeds for training a classifier. The classifier can assign new data to a known failure mode, so that the correction layer can adjust for known behaviour of that failure mode. For regression models, our suggestion would be to treat the prediction error as bias, and subtract the mean prediction error for the identified failure mode from the model prediction. In the CNN on corrupted MNIST example we use to illustrate the methodology, we impose the ground truth digit from which the identified failure mode emerged as a replacement prediction. By doing this, we could observe up to a 19.33%pt improvement in prediction accuracy on corrupted data while accidentally including only 0.32% of uncorrupted observations in the correction groups. This percentage of clean data accords well with that in the failure mode groups, 0.21%.
FiFa is generically applicable. While developing the method we used it to analyze an energy-based regression model used to predict temperatures in electric arc steel furnaces. In that application, we found failure modes that consistently over-predicted and failure modes that consistently under-predicted. Adjusting the regression by the mean prediction error of the failure group provided significant improvement in the energy model, and a qualitative analysis of the failure modes uncovered metallurgically important observations about material composition related to high prediction error.
The FiFa method picks out high prediction error regions from input space of an arbitrary predictive process, and classifies failure modes that are internally similar but that have significant separation either in the predictive behaviour of the process or in the distance measure of input space. Having identified failure modes we can view them as witnesses for misbehaviour in different ways, and produce correspondingly different developments of the predictive process. On the one hand, a failure mode witnesses a region of input space with local bias to the predictive process, and we can correct specifically for that bias by classifying new data as belonging to that failure mode (or not) and correct predictions for the failure mode members. On the other hand the failure mode is a witness for some coherent collection of predictive failures. By inspecting features of input space that distinguish these from other parts of input space we can gain insights about types of failure that could be handled by adjusting the design of the predictive process itself.
- Blondel et al. (2008) Blondel, Vincent D, Guillaume, Jean-Loup, Lambiotte, Renaud, and Lefebvre, Etienne. Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment, 2008(10):P10008, 2008.
- Bowman et al. (2008) Bowman, G.R., Huang, X., Yao, Y., Sun, J., Carlsson, G., Guibas, L.J., and Pande, V.S. Structural insight into rna hairpin folding intermediates. JACS Communications, pp. 9676–9678, 2008.
- Carlsson (2009) Carlsson, Gunnar. Topology and data. American Mathematical Society, 46(2):255–308, 2009.
- Carlsson (2017) Carlsson, Gunnar. The shape of biomedical data. Current Opinion in Systems Biology, 1, 2017.
- Chen et al. (2018) Chen, Sen, Xue, Minhui, Fan, Lingling, Hao, Shuang, Xu, Lihua, Zhu, Haojin, et al. Automated poisoning attacks and defenses in malware detection systems: An adversarial machine learning approach. Computers and Security, 73:326–344, 2018.
- Cisse et al. (2017a) Cisse, Moustapha, Adi, Yossi, Neverova, Natalia, and Keshet, Joseph. Houdini: Fooling deep structured visual and speech recognition models with adversarial examples. In Advances in Neural Information Processing Systems 30, 2017a.
- Cisse et al. (2017b) Cisse, Moustapha, Bojanowski, Piotr, Grave, Edouard, Dauphin, Yann, and Usunier, Nicolas. Parseval networks: Improving robustness to adversarial examples. In Proceedings of the 34th International Conference on Machine Learning, 2017b.
- Cámara (2017) Cámara, Pablo G. Topological methods for genomics: Present and future direction. Current Opinion in Systems Biology, 1:95–101, 2017.
- Dodge & Karam (2016) Dodge, Samuel and Karam, Lina. Understanding how image quality affects deep neural networks. In 2016 Eighth International Conference on Quality of Multimedia Experience (QoMEX). IEEE, June 2016.
- Duponchel (2018a) Duponchel, Ludovic. Exploring hyperspectral imaging data sets with topological data analysis. Analytica Chimica Acta, 1000:123–131, 2018a.
- Duponchel (2018b) Duponchel, Ludovic. When remote sensing meets topological data analysis. Journal of Spectral Imaging, 2018b.
- Edwards & Cavalli-Sforza (1965) Edwards, Anthony W. F. and Cavalli-Sforza, L. Luca. A method for cluster analysis. Biometrics, pp. 362–375, 1965.
- Fawzi et al. (2016) Fawzi, Alhussein, Moosavi-Dezfooli, Seyed-Mohsen, and Frossard, Pascal. Robustness of classifiers: from adversarial to random noise. In Advances in Neural Information Processing Systems 29, 2016.
- Gallego-Ortiz & Martel (2016) Gallego-Ortiz, Cristina and Martel, Anne L. Interpreting extracted rules from ensemble of trees: Application to computer-aided diagnosis of breast MRI. arXiv:1606.08288 [cs, stat], June 2016. URL http://arxiv.org/abs/1606.08288. WHI 2016 (ICML Workshop).
- Gunning (2016) Gunning, David. Explainable artificial intelligence (XAI). DARPA Broad Agency Announcement DARPA-BAA-16-53, 2016.
- Handler et al. (2016) Handler, Abram, Blodgett, Su Lin, and O’Connor, Brendan. Visualizing textual models with in-text and word-as-pixel highlighting. arXiv:1606.06352 [cs, stat], June 2016. URL http://arxiv.org/abs/1606.06352. WHI 2016 (ICML Workshop).
- Hara & Maehara (2016) Hara, Satoshi and Maehara, Takanori. Finding Alternate Features in Lasso. arXiv:1611.05940 [stat], November 2016. URL http://arxiv.org/abs/1611.05940. NIPS 2016 InterpretML.
- Hatcher (2002) Hatcher, Allen. Algebraic Topology. Cambridge University Press, 2002.
- Hayete et al. (2016) Hayete, Boris, Valko, Matthew, Greenfield, Alex, and Yan, Raymond. MDL-motivated compression of GLM ensembles increases interpretability and retains predictive power. arXiv:1611.06800 [stat], November 2016. URL http://arxiv.org/abs/1611.06800. NIPS 2016 InterpretML.
- Hechtlinger (2016) Hechtlinger, Yotam. Interpretation of Prediction Models Using the Input Gradient. arXiv:1611.07634 [cs, stat], November 2016. URL http://arxiv.org/abs/1611.07634. NIPS 2016 InterpretML.
- Hein & Andriushchenko (2017) Hein, Matthias and Andriushchenko, Maksym. Formal guarantees on the robustness of a classifier against adversarial manipulation. In Advances in Neural Information Processing Systems 30, 2017.
- Hinks et al. (2016) Hinks, T. S., Brown, T., Lau, L. C., Rupani, H., Barber, C., Elliott, S., Ward, J. A., Ono, J., Ohta, S., Izuhara, K., Djukanović, R., Kurukulaaratchy, R. J., Chauhan, A., and Howarth, P. Multidimensional endotyping in patients with severe asthma reveals inflammatory heterogeneity in matrix metalloproteinases and chitinase 3-like protein 1. J. Allergy Clin Immunol, 138(1), 2016.
- Huang et al. (2015) Huang, Ruitong, Xu, Bing, Schuurmans, Dale, and Szepesvári, Csaba. Learning with a strong adversary. November 2015.
- Kindermans et al. (2016) Kindermans, Pieter-Jan, Schütt, Kristof, Müller, Klaus-Robert, and Dähne, Sven. Investigating the influence of noise and distractors on the interpretation of neural networks. arXiv:1611.07270 [cs, stat], November 2016. URL http://arxiv.org/abs/1611.07270. NIPS 2016 InterpretML.
- Krakovna & Doshi-Velez (2016) Krakovna, Viktoriya and Doshi-Velez, Finale. Increasing the Interpretability of Recurrent Neural Networks Using Hidden Markov Models. arXiv:1611.05934 [cs, stat], November 2016. URL http://arxiv.org/abs/1611.05934. NIPS 2016 InterpretML.
- Krause et al. (2016) Krause, Josua, Perer, Adam, and Bertini, Enrico. Using Visual Analytics to Interpret Predictive Machine Learning Models. arXiv:1606.05685 [cs, stat], June 2016. URL http://arxiv.org/abs/1606.05685. WHI 2016 (ICML Workshop).
- Lee et al. (2017) Lee, Yongjin, Barthel, Senja D., Dlotko, Pawel, Moosavi, S. Mohamad, Hess, Kathryn, and Smit, Berend. Quantifying similarity of pore-geometry in nanoporous materials. Nature Communications, 2017.
- Li et al. (2015) Li, Li, Cheng, Wei-Yi, Glicksberg, Benjamin S., Gottesman, Omri, Tamler, Ronald, Chen, Rong, Bottinger, Erwin P., and Dudley, Joel T. Identification of type 2 diabetes subgroups through topological analysis of patient similarity. Science Translational Medicine, 7(311), 2015.
- Lum et al. (2013) Lum, Pek Y, Singh, Gurjeet, Lehman, Alan, Ishkanov, Tigran, Vejdemo-Johansson, Mikael, Alagappan, Muthu, Carlsson, John, and Carlsson, Gunnar. Extracting insights from the shape of complex data using topology. Scientific Reports, 3, February 2013. ISSN 2045-2322. doi: 10.1038/srep01236. URL http://www.nature.com/srep/2013/130207/srep01236/full/srep01236.html.
- Lundberg & Lee (2016) Lundberg, Scott and Lee, Su-In. An unexpected unity among methods for interpreting model predictions. arXiv:1611.07478 [cs], November 2016. URL http://arxiv.org/abs/1611.07478. NIPS 2016 InterpretML.
- Moosavi-Dezfooli et al. (2016) Moosavi-Dezfooli, Seyed-Mohsen, Fawzi, Alhussein, and Frossard, Pascal. DeepFool: a simple and accurate method to fool deep neural networks. November 2016.
- Murtagh & Contreras (2012) Murtagh, Fionn and Contreras, Pedro. Algorithms for hierarchical clustering: an overview. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 2(1):86–97, 2012.
- Müllner & Babu (2013) Müllner, Daniel and Babu, Aravindakshan. Python mapper: An open-source toolchain for data exploration, analysis and visualization, 2013. URL http://danifold.net/mapper.
- Nicolau et al. (2011) Nicolau, Monica, Levine, Arnold J., and Carlsson, Gunnar. Topology based data analysis identifies a subgroup of breast cancers with a unique mutational profile and excellent survival. PNAS, 108:7265–7270, 2011.
- Noh et al. (2017) Noh, Hyeonwoo, You, Tackgeun, Mun, Jonghwan, and Han, Bohyung. Regularizing deep neural networks by noise: Its interpretation and optimization. In Advances in Neural Information Processing Systems 30, 2017.
- Phillips et al. (2017) Phillips, Richard L., Chang, Kyu Hyun, and Friedler, Sorelle A. Interpretable Active Learning. arXiv:1708.00049 [cs, stat], July 2017. URL http://arxiv.org/abs/1708.00049. WHI 2017 (ICML Workshop).
- Reing et al. (2016) Reing, Kyle, Kale, David C., Steeg, Greg Ver, and Galstyan, Aram. Toward Interpretable Topic Discovery via Anchored Correlation Explanation. arXiv:1606.07043 [cs, stat], June 2016. URL http://arxiv.org/abs/1606.07043. ICML 2016 Workshop #Data4Good.
- Ribeiro et al. (2016a) Ribeiro, Marco Tulio, Singh, Sameer, and Guestrin, Carlos. Model-Agnostic Interpretability of Machine Learning. arXiv:1606.05386 [cs, stat], June 2016a. URL http://arxiv.org/abs/1606.05386. WHI 2016 (ICML Workshop).
- Ribeiro et al. (2016b) Ribeiro, Marco Tulio, Singh, Sameer, and Guestrin, Carlos. Nothing Else Matters: Model-Agnostic Explanations By Identifying Prediction Invariance. arXiv:1611.05817 [cs, stat], November 2016b. URL http://arxiv.org/abs/1611.05817. NIPS 2016 InterpretML.
- Ribeiro et al. (2016c) Ribeiro, Marco Tulio, Singh, Sameer, and Guestrin, Carlos. "Why Should I Trust You?": Explaining the Predictions of Any Classifier. arXiv:1602.04938 [cs, stat], February 2016c. URL http://arxiv.org/abs/1602.04938.
- Romano et al. (2014) Romano, David, Nicolau, Monica, Quintin, Eve-Marie, Mazaika, Paul K., Lightbody, Amy A., Hazlett, Heather Cody, Piven, Joseph, Carlsson, Gunnar, and Reiss, Allan L. Topological methods reveal high and low functioning neuro-phenotypes within fragile X syndrome. Human Brain Mapping, 35:4904–4915, 2014.
- Samek et al. (2016) Samek, Wojciech, Montavon, Grégoire, Binder, Alexander, Lapuschkin, Sebastian, and Müller, Klaus-Robert. Interpreting the Predictions of Complex ML Models by Layer-wise Relevance Propagation. arXiv:1611.08191 [cs, stat], November 2016. URL http://arxiv.org/abs/1611.08191. NIPS 2016 InterpretML.
- Saul & van Veen (2017) Saul, Nathaniel and van Veen, Hendrik Jacob. Mlwave/kepler-mapper: 186f (version 1.0.1), 2017. URL http://doi.org/10.5281/zenodo.1054444.
- Savir et al. (2017) Savir, Aleksandar, Toth, Gergely, and Duponchel, Ludovic. Topological data analysis (TDA) applied to reveal pedogenetic principles of European topsoil system. Science of the Total Environment, 586(2):1091–1100, 2017.
- Schneider et al. (2016) Schneider, David S., Torres, Brenda Y., Oliveira, Jose Henrique M., Tate, Ann Thomas, Rath, Poonam, and Cumnock, Katherine. Tracking resilience to infections by mapping disease space. PLOS Biology, 14(6), 2016.
- Selvaraju et al. (2016) Selvaraju, Ramprasaath R., Das, Abhishek, Vedantam, Ramakrishna, Cogswell, Michael, Parikh, Devi, and Batra, Dhruv. Grad-CAM: Why did you say that? arXiv:1611.07450 [cs, stat], November 2016. URL http://arxiv.org/abs/1611.07450. NIPS 2016 InterpretML.
- Simonyan et al. (2013) Simonyan, Karen, Vedaldi, Andrea, and Zisserman, Andrew. Deep inside convolutional networks: Visualising image classification models and saliency maps. arXiv preprint arXiv:1312.6034, 2013.
- Singh et al. (2007) Singh, Gurjeet, Mémoli, Facundo, and Carlsson, Gunnar. Topological Methods for the Analysis of High Dimensional Data Sets and 3d Object Recognition. In SPBG, pp. 91–100, 2007. URL http://comptop.stanford.edu/preprints/mapperPBG.pdf.
- Singh et al. (2016) Singh, Sameer, Ribeiro, Marco Tulio, and Guestrin, Carlos. Programs as Black-Box Explanations. arXiv:1611.07579 [cs, stat], November 2016. URL http://arxiv.org/abs/1611.07579. NIPS 2016 InterpretML.
- Smilkov et al. (2016) Smilkov, Daniel, Thorat, Nikhil, Nicholson, Charles, Reif, Emily, Viégas, Fernanda B., and Wattenberg, Martin. Embedding Projector: Interactive Visualization and Interpretation of Embeddings. arXiv:1611.05469 [cs, stat], November 2016. URL http://arxiv.org/abs/1611.05469. NIPS 2016 InterpretML.
- Tansey et al. (2017) Tansey, Wesley, Thomason, Jesse, and Scott, James G. Interpretable Low-Dimensional Regression via Data-Adaptive Smoothing. arXiv:1708.01947 [stat], August 2017. URL http://arxiv.org/abs/1708.01947. WHI 2017 (ICML Workshop).
- Thiagarajan et al. (2016) Thiagarajan, Jayaraman J., Kailkhura, Bhavya, Sattigeri, Prasanna, and Ramamurthy, Karthikeyan Natesan. TreeView: Peeking into Deep Neural Networks Via Feature-Space Partitioning. arXiv:1611.07429 [cs, stat], November 2016. URL http://arxiv.org/abs/1611.07429. NIPS 2016 InterpretML.
- Tosi et al. (2017) Tosi, Alessandra, Vellido, Alfredo, and Alvarez, Mauricio (eds.). Transparent and Interpretable Machine Learning in Safety Critical Environments, NIPS 2017 Workshop, 2017.
- Tramèr et al. (2018) Tramèr, Florian, Kurakin, Alexey, Papernot, Nicolas, Goodfellow, Ian, Boneh, Dan, and McDaniel, Patrick. Ensemble adversarial training: Attacks and defenses. International Conference on Learning Representations, 2018. URL https://openreview.net/forum?id=rkZvSe-RZ. accepted as poster.
- Varshney et al. (2017) Varshney, Kush, Weller, Adrian, Kim, Been, and Malioutov, Dmitry (eds.). Human Interpretability in Machine Learning, ICML 2017 Workshop, 2017.
- Vidovic et al. (2016) Vidovic, Marina M.-C., Görnitz, Nico, Müller, Klaus-Robert, and Kloft, Marius. Feature Importance Measure for Non-linear Learning Algorithms. arXiv:1611.07567 [cs, stat], November 2016. URL http://arxiv.org/abs/1611.07567. NIPS 2016 InterpretML.
- Whitmore et al. (2016) Whitmore, Leanne S., George, Anthe, and Hudson, Corey M. Mapping chemical performance on molecular structures using locally interpretable explanations. arXiv:1611.07443 [physics, stat], November 2016. URL http://arxiv.org/abs/1611.07443. NIPS 2016 InterpretML.
- Wilson et al. (2016) Wilson, Andrew Gordon, Kim, Been, and Herlands, William (eds.). Interpretable Machine Learning for Complex Systems, NIPS 2016 Workshop, 2016.
- Wilson et al. (2017) Wilson, Andrew Gordon, Yosinski, Jason, Simard, Patrice, Caruana, Rich, and Herlands, William (eds.). Interpretable ML Symposium, NIPS 2017 Workshop, 2017.
- Wisdom et al. (2016) Wisdom, Scott, Powers, Thomas, Pitton, James, and Atlas, Les. Interpretable Recurrent Neural Networks Using Sequential Sparse Recovery. arXiv:1611.07252 [cs, stat], November 2016. URL http://arxiv.org/abs/1611.07252. NIPS 2016 InterpretML.
- Yuan et al. (2018) Yuan, Xiaoyong, He, Pan, Zhu, Qile, Bhat, Rajendra Rana, and Li, Xiaolin. Adversarial examples: Attacks and defenses for deep learning. January 2018.
- Zhou et al. (2017) Zhou, Yiren, Song, Sibo, and Cheung, Ngai-Man. On classification of distorted images with deep convolutional neural networks. arXiv preprint, 2017.
- Zrihem et al. (2016) Zrihem, Nir Ben, Zahavy, Tom, and Mannor, Shie. Visualizing Dynamics: from t-SNE to SEMI-MDPs. arXiv:1606.07112 [cs, stat], June 2016. URL http://arxiv.org/abs/1606.07112. WHI 2016 (ICML Workshop).