1 Introduction
In cognitive science, researchers usually analyze recorded brain activity to determine where, when, and how a brain region participates in a particular cognitive process. To answer these key questions, scientists often employ mass-univariate hypothesis testing methods to test scientific hypotheses on a large set of independent variables Groppe et al. (2011). On the downside, the high dimensionality of neuroimaging data requires a large number of tests, which reduces the sensitivity of these methods after multiple comparison correction. The multivariate counterparts of mass-univariate analysis, known generally as multivariate pattern analysis (MVPA), have the potential to overcome this deficit.
Brain decoding Haynes & Rees (2006) is an MVPA technique that delivers a model to predict the mental state of a human subject based on the recorded brain signal. From the neuroscientific perspective, a brain map resulting from the weights of a linear brain decoding model is considered interpretable if it enables the scientist to answer the where, when, and how questions. But typically a classifier, taken alone, only answers the question of what the most likely label of a given unseen sample is. This fact is generally known as the knowledge extraction gap Vellido et al. (2012) in the classification context. Thus far, many efforts have been devoted to filling the knowledge extraction gap of linear and nonlinear data modeling methods in different areas such as computer vision Bach et al. (2015), signal processing Montavon et al. (2013), chemometrics Yu et al. (2015), bioinformatics Hansen et al. (2011), and neuroinformatics Haufe et al. (2013).
Despite the theoretical advantages of MVPA, its practical application to inferences regarding neuroimaging data is limited primarily due to the knowledge extraction gap Sabuncu (2014). Therefore, improving the interpretability of linear brain decoding and associated brain maps is a primary goal in the brain imaging literature Strother et al. (2014). The lack of interpretability of multivariate brain maps is a direct consequence of low signal-to-noise ratios (SNRs), high dimensionality of whole-scalp recordings, high correlations among different dimensions of data, and cross-subject variability. At present, two main approaches are proposed to enhance the interpretability of multivariate brain maps: 1) introducing new metrics, such as reproducibility of maps or stability of models, into the model selection procedure Rasmussen et al. (2012); Conroy et al. (2013); Yu (2013), and 2) introducing new hybrid penalty terms for regularization to incorporate spatio-temporal prior knowledge in the learning van Gerven et al. (2009); Michel et al. (2011); de Brecht & Yamagishi (2012); Grosenick et al. (2013).
In spite of the aforementioned efforts to improve the interpretability, there is still no formal definition for the interpretability of brain decoding in the literature. Therefore, the interpretability of different brain decoding methods is evaluated either qualitatively or indirectly. With the aim of filling this gap, our contribution is twofold: 1) assuming that the true solution of brain decoding is available, we present a simple definition of the interpretability in linear brain decoding; 2) we propose the combination of the interpretability and the performance of the brain decoding as a new Pareto optimal multi-objective criterion for model selection. We show experimentally, on a toy dataset, that incorporating the interpretability into the model selection procedure provides more interpretable models (for further experiments on a real dataset, see Kia (2016)).
2 Methods
2.1 Notation and Background
Let be a manifold in Euclidean space that represents the input space and be the output space, where . Then, let be a training set of independently and identically distributed (iid) samples drawn from the joint distribution of . In the neuroimaging context, X indicates the trials of brain recording, and Y represents the experimental conditions. The goal of brain decoding is to find the function as an estimation of the ideal function .
As is a common assumption in the neuroimaging context, we assume the true solution of a brain decoding problem is among the family of linear functions . Therefore, the aim of brain decoding reduces to finding an empirical approximation of , indicated by , among all . This approximation can be obtained by solving a risk minimization problem:
(1) 
where denotes the parameters of the linear model, is the loss function, is the regularization term, and is a hyperparameter that controls the amount of regularization. is generally decided using cross-validation or other data perturbation methods in the model selection procedure.
The estimated parameters of a linear decoding model can be used in the form of a brain map so as to visualize the discriminative neurophysiological effect. We refer to the normalized parameter vector of a linear brain decoder in the unit hypersphere as a multivariate brain map (MBM); we denote it by where ( is the 2-norm).
As shown in Eq. 1, learning occurs using the sampled data. In other words, in the learning paradigm, we attempt to minimize the loss function with respect to (and not ) Poggio & Shelton (2002). The irreducible error is the direct consequence of sampling; it sets a lower bound on the error, where we have:
(2) 
2.2 Theoretical Definition
In this section, we present a definition for the interpretability of linear brain decoding models and their associated MBMs. Our definition of interpretability is based on two main assumptions: 1) the brain decoding problem is linearly separable; 2) its unique and neurophysiologically plausible solution, i.e., , is available.
Consider a linearly separable brain decoding problem in an ideal scenario where and . In this case, is linear and its parameters are unique and plausible. The unique parameter vector can be computed by:
(3) 
represents the covariance of X. Using as the reference, we can define the strong-interpretability:
Definition 1.
An MBM associated with a linear function is “strongly-interpretable” if and only if .
In practice, the estimated solution of a linear brain decoding problem is not strongly-interpretable because of the inherent limitations of neuroimaging data, such as uncertainty Aggarwal & Yu (2009) in the input and output space (), the high dimensionality of data (), and the high correlation between predictors (). With these limitations in mind, even though the solution of linear brain decoding is not strongly-interpretable, one can argue that some solutions are more interpretable than others. For example, in the case in which , a linear classifier where can be considered more interpretable than a linear classifier where . This issue raises the following question:
Problem.
Let be perturbed training sets drawn from via a certain perturbation scheme such as bootstrapping or cross-validation. Assume are MBMs of a certain on the corresponding perturbed training sets. How can we quantify the proximity of to the strongly-interpretable solution of the brain decoding problem ?
Considering the uniqueness and the plausibility of as the two main characteristics that convey its strong-interpretability, we define the interpretability as follows:
Definition 2.
Let () be the angle between and . The “interpretability” () of the MBM derived from a linear function is defined as:
(4) 
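Definition 2 can be sketched in a few lines of Python. This is a minimal illustration rather than the author's implementation: the names `theta_star` (the reference solution) and `bootstrap_maps` (the weight vectors estimated on the perturbed training sets) are hypothetical, and returning a similarity of zero for an all-zero weight vector is an assumption made here to keep the function total.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors given as sequences of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Assumption: an all-zero map has no direction, so it scores zero.
        return 0.0
    return dot / (norm_u * norm_v)


def interpretability(theta_star, bootstrap_maps):
    """Average cosine similarity between the reference solution and each map."""
    similarities = [cosine_similarity(theta_star, w) for w in bootstrap_maps]
    return sum(similarities) / len(similarities)
```

Because the cosine is invariant to positive scaling, the weight vectors do not need to be normalized to unit length before they are passed in.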
In fact, the interpretability is the average cosine similarity between the strongly-interpretable solution and the MBMs derived from different samplings of the training set. Even though, in practice, the exact computation of the interpretability is unrealistic (as the true solution is not available), the interpretability of the decoding model can be approximated based on ad hoc heuristics (see Kia (2016) for an example in magnetoencephalography decoding). The approximated interpretability can be incorporated in the model selection procedure in order to find more reproducible and plausible decoding models.
2.3 Interpretability in Model Selection
The procedure for evaluating the performance of a model so as to choose the best values for hyperparameters is known as model selection Hastie et al. (2009). This procedure generally involves numerical optimization of the model selection criterion. The most common model selection criterion is based on an estimator of generalization performance. In the context of brain decoding, especially when the interpretability of brain maps matters, employing the predictive power as the only decisive criterion in model selection is problematic Rasmussen et al. (2012); Conroy et al. (2013). Here, we propose a multi-objective criterion for model selection that takes into account both prediction accuracy and MBM interpretability.
Let and be the interpretability and the generalization performance of a linear function , respectively. We propose the use of the scalarization technique Caramia & Dell'Olmo (2008) for combining and into one scalar as follows:
(5) 
where and are weights that specify the importance of the interpretability and the performance, respectively. is a threshold that filters out solutions with poor performance. In classification scenarios, can be set by adding a small safe interval to the chance level. It can be shown that the hyperparameters of a model that are optimized based on are Pareto optimal Marler & Arora (2004).
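The scalarization described above can be sketched as follows. The exact functional form is an assumption: a weighted sum normalized by the total weight, set to zero when the performance falls below the threshold. With equal weights this form reproduces the combined values reported in Table 1 as the mean of performance and interpretability. The names `combined_criterion` and `select_hyperparameter` and the default threshold of 0.6 (chance level of 0.5 plus a margin for a binary problem) are hypothetical.

```python
def combined_criterion(interpretability, performance,
                       w_interp=1.0, w_perf=1.0, threshold=0.6):
    """Scalarize interpretability and performance into one score.

    Assumption: models whose performance is below `threshold` score zero,
    which filters them out of the selection.
    """
    if performance < threshold:
        return 0.0
    weighted = w_interp * interpretability + w_perf * performance
    return weighted / (w_interp + w_perf)


def select_hyperparameter(candidates, interpretabilities, performances, **kwargs):
    """Return the candidate with the highest combined score, and that score.

    Ties are broken in favour of the earliest candidate in the list.
    """
    scores = [combined_criterion(eta, delta, **kwargs)
              for eta, delta in zip(interpretabilities, performances)]
    best = max(range(len(scores)), key=scores.__getitem__)
    return candidates[best], scores[best]


# Three of the regularization values and their scores from Table 1.
best_value, best_score = select_hyperparameter(
    [10, 100, 500], [0.4484, 0.5845, 1.0], [0.9884, 0.9840, 0.9292])
```

Selecting on performance alone would pick the first of these three candidates, whereas the combined score favours the last one.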
2.4 Classification and Evaluation
In our experiment, a least squares classifier with L1-penalization, i.e., Lasso Tibshirani (1996), is used for decoding. Lasso is a popular classification method in brain decoding, mainly because of its sparsity assumption. The choice of Lasso helps us to better illustrate the importance of including the interpretability in the model selection. Lasso solves the following optimization problem:
(6) 
where is the hyperparameter that specifies the level of regularization. Therefore, the aim of the model selection is to find the best value for . Here, we try to find the best regularization parameter value among .
We use the out-of-bag (OOB) Breiman (2001) method to compute , , and for different values of . In OOB, given a training set , replications of bootstrap are used to create perturbed training sets (we set ). We set and in the computation of . Furthermore, we set where EPE indicates the expected prediction error. The MATLAB code used for the experiments is available at https://github.com/smkia/interpretability/.
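The OOB procedure with Lasso can be sketched in plain Python as below. This is an illustrative sketch under several assumptions, not a port of the MATLAB code: the Lasso objective is taken as half the squared error plus the L1 penalty with no intercept, it is solved by cyclic coordinate descent, the expected prediction error is estimated as the mean misclassification rate on the out-of-bag samples, and the performance is taken as one minus that estimate. All function names and default values (`n_sweeps`, `n_replications`, `seed`) are hypothetical.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the Lasso proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_lasso(X, y, lam, n_sweeps=100):
    """Cyclic coordinate descent for 0.5*||y - Xw||^2 + lam*||w||_1."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            rho, z = 0.0, 0.0
            for i in range(n):
                # Prediction from all features except j.
                partial = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho, lam) / z if z > 0 else 0.0
    return w


def predict_label(w, x):
    """Classify by the sign of the linear score (labels are +1 and -1)."""
    score = sum(wk * xk for wk, xk in zip(w, x))
    return 1 if score >= 0 else -1


def oob_bootstrap(X, y, lam, n_replications=20, seed=0):
    """Fit Lasso on bootstrap replications; score each on its OOB samples."""
    rng = random.Random(seed)
    n = len(X)
    weight_vectors, oob_errors = [], []
    for _ in range(n_replications):
        bag = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        in_bag = set(bag)
        held_out = [i for i in range(n) if i not in in_bag]
        w = fit_lasso([X[i] for i in bag], [y[i] for i in bag], lam)
        weight_vectors.append(w)
        if held_out:
            wrong = sum(predict_label(w, X[i]) != y[i] for i in held_out)
            oob_errors.append(wrong / len(held_out))
    if not oob_errors:
        return weight_vectors, float("nan")
    epe = sum(oob_errors) / len(oob_errors)  # assumed EPE estimate
    return weight_vectors, 1.0 - epe
```

The returned weight vectors play the role of the maps estimated on the perturbed training sets, so they can be compared against a reference solution to obtain the interpretability, while the second return value serves as the performance.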
3 Experiment
3.1 Experimental Material
To illustrate the importance of integrating the interpretability of brain decoding with the model selection procedure, we use simple two-dimensional toy data presented in Haufe et al. (2013). Assume that the true underlying generative function is defined by:
where ; and and represent the first and the second dimension of the data, respectively. Furthermore, assume the data is contaminated by Gaussian noise with covariance . Figure 1 shows the distribution of the noisy data.
3.2 Results
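Table 1
Regularization parameter | 0 | 0.001 | 0.01 | 0.1 | 1 | 10 | 50 | 100 | 250 | 500 | 1000
Performance | 0.9883 | 0.9883 | 0.9883 | 0.9883 | 0.9883 | 0.9884 | 0.9880 | 0.9840 | 0.9310 | 0.9292 | 0.9292
Interpretability | 0.4391 | 0.4391 | 0.4391 | 0.4392 | 0.4400 | 0.4484 | 0.4921 | 0.5845 | 0.9968 | 1 | 1
Combined criterion (Eq. 5) | 0.7137 | 0.7137 | 0.7137 | 0.7137 | 0.7142 | 0.7184 | 0.7400 | 0.7842 | 0.9639 | 0.9646 | 0.9646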
In the definition of on the toy dataset, is the decisive variable and has no effect on the classification of the data into target classes. Therefore, excluding the effect of noise and based on the theory of the maximal margin classifier, is the true solution to the decoding problem. By accounting for the effect of noise and solving the decoding problem in space, we have as the parameter of the linear classifier. Although the estimated parameters on the noisy data yield the best generalization performance for the noisy samples, any attempt to interpret this solution fails, as it yields the wrong conclusion with respect to the ground truth (it says has twice the influence of on the results, whereas it has no effect). This simple experiment shows that the most accurate model is not always the most interpretable one, primarily because of the contribution of the noise to the decoding process Haufe et al. (2013). On the other hand, the true solution of the problem does not provide the best generalization performance for the noisy data.
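The effect described above can be reproduced with a small worked example. The numbers are illustrative assumptions, not values taken from the text: the class means are placed at plus and minus 1.5 on the first dimension, and the noise covariance is chosen so that the noise-aware weights come out in the 1:2 ratio mentioned above. The weights are computed as the inverse noise covariance times the difference of the class means (the linear discriminant direction), which is also an assumption about how the noisy problem is solved.

```python
import math


def solve_2x2(matrix, vector):
    """Solve matrix @ x = vector for a 2x2 matrix using the explicit inverse."""
    (a, b), (c, d) = matrix
    det = a * d - b * c
    return [(d * vector[0] - b * vector[1]) / det,
            (-c * vector[0] + a * vector[1]) / det]


# Hypothetical setup: only the first dimension separates the classes.
mean_difference = [3.0, 0.0]            # class means at +1.5 and -1.5 on x1
noise_covariance = [[1.02, -0.30],      # assumed correlated Gaussian noise
                    [-0.30, 0.15]]

# Noise-aware discriminant direction: inverse covariance times mean difference.
weights = solve_2x2(noise_covariance, mean_difference)

# Normalize to unit length and compare with the noise-free solution [1, 0].
norm = math.hypot(weights[0], weights[1])
brain_map = [w / norm for w in weights]
true_map = [1.0, 0.0]
cosine_to_truth = sum(a * b for a, b in zip(true_map, brain_map))
```

Under these assumed values the second weight is twice the first even though the second dimension carries no class information, and the cosine to the true solution is about 0.45, far from the value of 1 that a strongly-interpretable map would reach.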
To illustrate the effect of incorporating the interpretability in the model selection, a Lasso model with different values is used for classifying the toy data. In this case, because is known, the interpretability can be computed using Eq. 4. Table 1 compares the resultant performance and interpretability from Lasso. Lasso achieves its highest performance () at with (indicated by the magenta line in Figure 1). Despite having the highest performance, this solution suffers from a lack of interpretability (). By increasing , the interpretability improves so that for the classifier reaches its highest interpretability at the cost of of its performance. Our observation highlights two main points: 1) In the case of noisy data, the interpretability of a decoding model does not necessarily agree with its performance. Thus, optimizing the parameter of the model based on its performance does not necessarily improve its interpretability. This observation confirms the previous finding by Rasmussen et al. (2012) regarding the trade-off between the spatial reproducibility (as a measure for the interpretability) and the prediction accuracy in brain decoding; 2) if the right criterion is used in the model selection, employing a proper regularization technique (a sparsity prior, in this case) leads to more interpretable decoding models.
4 Discussions
In this study, our primary interest was to present a definition of the interpretability of linear brain decoding models. Our definition and quantification of interpretability remains theoretical, as we assume that the true solution of the brain decoding problem is available. Despite this limitation, we argue that the presented simple definition provides a concrete framework of a previously abstract concept and that it establishes a theoretical background to explain an ambiguous phenomenon in the brain decoding context.
Despite its ubiquitous use, the generalization performance of classifiers is not a reliable criterion for assessing the interpretability of brain decoding models Rasmussen et al. (2012). Therefore, considering extra criteria might be required. However, because of the lack of a formal definition for interpretability, different characteristics of brain decoding models are considered as the main objective in improving their interpretability. Our definition of interpretability helped us to fill this gap by introducing a new multi-objective criterion as a weighted compromise between interpretability and generalization performance. Furthermore, this work presents an effective approach for evaluating the quality of different regularization strategies for improving the interpretability of MBMs. Our findings provide a further step toward direct evaluation of interpretability of the currently proposed penalization strategies.
Despite its theoretical advantages, the proposed definition of interpretability suffers from some limitations. The presented concepts are defined for linear models, with the main assumption that (where is a class of linear functions). Extending the definition of interpretability to nonlinear models demands future research in visualization of nonlinear models in the form of brain maps.
References
 Aggarwal & Yu (2009) Aggarwal, Charu C and Yu, Philip S. A survey of uncertain data algorithms and applications. Knowledge and Data Engineering, IEEE Transactions on, 21(5):609–623, 2009.
 Bach et al. (2015) Bach, Sebastian, Binder, Alexander, Montavon, Grégoire, Klauschen, Frederick, Müller, Klaus-Robert, and Samek, Wojciech. On pixel-wise explanations for nonlinear classifier decisions by layer-wise relevance propagation. PloS one, 10(7), 2015.
 Breiman (2001) Breiman, Leo. Random forests. Machine learning, 45(1):5–32, 2001.
 Caramia & Dell'Olmo (2008) Caramia, Massimiliano and Dell'Olmo, Paolo. Multi-objective optimization. Multi-objective Management in Freight Logistics: Increasing Capacity, Service Level and Safety with Optimization Algorithms, pp. 11–36, 2008.
 Conroy et al. (2013) Conroy, Bryan R, Walz, Jennifer M, and Sajda, Paul. Fast bootstrapping and permutation testing for assessing reproducibility and interpretability of multivariate fMRI decoding models. PloS one, 8(11):e79271, 2013.
 de Brecht & Yamagishi (2012) de Brecht, Matthew and Yamagishi, Noriko. Combining sparseness and smoothness improves classification accuracy and interpretability. NeuroImage, 60(2):1550–1561, 2012.
 Groppe et al. (2011) Groppe, David M, Urbach, Thomas P, and Kutas, Marta. Mass univariate analysis of event-related brain potentials/fields I: A critical tutorial review. Psychophysiology, 48(12):1711–1725, 2011.
 Grosenick et al. (2013) Grosenick, Logan, Klingenberg, Brad, Katovich, Kiefer, Knutson, Brian, and Taylor, Jonathan E. Interpretable whole-brain prediction analysis with GraphNet. NeuroImage, 72:304–321, 2013.
 Hansen et al. (2011) Hansen, Katja, Baehrens, David, Schroeter, Timon, Rupp, Matthias, and Müller, Klaus-Robert. Visual interpretation of kernel-based prediction models. Molecular Informatics, 30(9):817–826, 2011.
 Hastie et al. (2009) Hastie, Trevor, Tibshirani, Robert, and Friedman, Jerome. The elements of statistical learning, volume 2. Springer, 2009.
 Haufe et al. (2013) Haufe, Stefan, Meinecke, Frank, Görgen, Kai, Dähne, Sven, Haynes, John-Dylan, Blankertz, Benjamin, and Bießmann, Felix. On the interpretation of weight vectors of linear models in multivariate neuroimaging. NeuroImage, 2013.
 Haynes & Rees (2006) Haynes, John-Dylan and Rees, Geraint. Decoding mental states from brain activity in humans. Nature Reviews Neuroscience, 7(7):523–534, July 2006. ISSN 1471-003X. doi: 10.1038/nrn1931.
 Kia (2016) Kia, Seyed Mostafa. Interpretability of multivariate brain maps in brain decoding: Definition and quantification. bioRxiv, 2016. doi: 10.1101/047522.
 Marler & Arora (2004) Marler, R Timothy and Arora, Jasbir S. Survey of multi-objective optimization methods for engineering. Structural and multidisciplinary optimization, 26(6):369–395, 2004.
 Michel et al. (2011) Michel, Vincent, Gramfort, Alexandre, Varoquaux, Gaël, Eger, Evelyn, and Thirion, Bertrand. Total variation regularization for fMRI-based prediction of behavior. Medical Imaging, IEEE Transactions on, 30(7):1328–1340, 2011.
 Montavon et al. (2013) Montavon, Gregoire, Braun, Martin, Krueger, Thomas, and Muller, Klaus-Robert. Analyzing local structure in kernel-based learning: Explanation, complexity, and reliability assessment. Signal Processing Magazine, IEEE, 30(4):62–74, 2013.
 Poggio & Shelton (2002) Poggio, T and Shelton, CR. On the mathematical foundations of learning. American Mathematical Society, 39(1):1–49, 2002.
 Rasmussen et al. (2012) Rasmussen, Peter M, Hansen, Lars K, Madsen, Kristoffer H, Churchill, Nathan W, and Strother, Stephen C. Model sparsity and brain pattern interpretation of classification models in neuroimaging. Pattern Recognition, 45(6):2085–2100, 2012.
 Sabuncu (2014) Sabuncu, Mert R. A universal and efficient method to compute maps from image-based prediction models. Medical Image Computing and Computer-Assisted Intervention–MICCAI 2014, pp. 353–360, 2014.
 Strother et al. (2014) Strother, Stephen C, Rasmussen, Peter M, Churchill, Nathan W, and Hansen, KL. Stability and Reproducibility in fMRI Analysis. New York: Springer-Verlag, 2014.
 Tibshirani (1996) Tibshirani, Robert. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society. Series B (Methodological), pp. 267–288, 1996.
 van Gerven et al. (2009) van Gerven, Marcel, Hesse, Christian, Jensen, Ole, and Heskes, Tom. Interpreting single trial data using groupwise regularisation. NeuroImage, 46(3):665–676, 2009.

 Vellido et al. (2012) Vellido, Alfredo, Martín-Guerrero, JD, and Lisboa, P. Making machine learning models interpretable. In Proceedings of the 20th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN). Bruges, Belgium, pp. 163–172, 2012.
 Yu (2013) Yu, Bin. Stability. Bernoulli, 19(4):1484–1500, 2013.
 Yu et al. (2015) Yu, Donghyeon, Lee, Seul Ji, Lee, Won Jun, Kim, Sang Cheol, Lim, Johan, and Kwon, Sung Won. Classification of spectral data using fused lasso logistic regression. Chemometrics and Intelligent Laboratory Systems, 142:70–77, 2015.