Machine Learning pipeline for discovering neuroimaging-based biomarkers in neurology and psychiatry

by Alexander Bernstein, et al.

We consider a problem of diagnostic pattern recognition/classification from neuroimaging data. We propose a common data analysis pipeline for neuroimaging-based diagnostic classification problems using various ML algorithms and processing toolboxes for brain imaging. We illustrate the pipeline application by discovering new biomarkers for diagnostics of epilepsy and depression based on clinical and MRI/fMRI data for patients and healthy volunteers.




1 Introduction

The human brain is a complex system of interconnected and specialized structures, the functioning of which is associated with the numerous ongoing biophysical and biochemical processes. These processes differ in healthy people and in patients with various pathologies. Nowadays, the normal and pathological processes related to the brain structure and functioning could be recognized by analyzing the results of medical examination with the use of in-vivo scanning devices.

In clinical practice, the neuroimaging data of each patient is considered individually, either visually by a doctor/neuroradiologist or by analyzing clinically meaningful features (cortical volumes, thicknesses, etc.). Nowadays, Artificial Intelligence (AI), Machine Learning (ML) and intelligent data analysis techniques are used in medical research for discovering diagnostic biomarkers and predicting treatment outcomes from neuroimaging data collected for targeted groups of patients or healthy volunteers Bruijne (2016).

In this article we confine ourselves to the problems of diagnostic pattern recognition/classification from neuroimaging data. The features (called biomarkers), which distinguish different groups of examined subjects, are extracted from neuroimaging data and further used in clinical practice for the diagnostic purposes. Biomarkers (characteristics which are objectively measured and evaluated as an indicator of normal biologic processes, pathogenic processes, or pharmacologic responses to a therapeutic intervention) are key components of modern medicine Fu and Costafreda (2013). There is an ever-growing number of ML studies for detecting new clinically meaningful biomarkers from large neuroimaging datasets Bruijne (2016).

An important feature of neuroimaging data is its high dimensionality. For example, MRI signals for a typical human brain with a volume of approximately cubic millimeters are represented by a 3D array with a total dimensionality of the order of , and fMRI images are represented by a 4D array of 3D images of lower resolution (about voxels) with a total dimensionality of the order of . Thus, the curse of dimensionality phenomenon is often an obstacle for using ML techniques. To avoid this phenomenon, various universal dimensionality reduction methods Burges (2010); Sorzano et al. (2014); Bernstein and Kuleshov (2014); Chernova and Burnaev (2015) and/or specific neuroimaging-oriented feature selection methods Mwangi et al. (2014) are used for extracting low-dimensional features from high-dimensional neuroimaging data. After that, ML algorithms are applied to these features. Such clinically meaningful features can be computed by brain image processing toolboxes Behroozi and Daliri (2012); Bernstein et al. (2018).

As a result of ML application we obtain not only a classifier to support medical diagnosis, but also after posterior analysis of the classifier properties we identify features, which have biomedical interpretation and can be used for medical conclusions. Therefore, ML-based neuroimaging data processing for medical diagnostics is a multistage iterative process, which uses various ML feature selection, extraction and classification algorithms, as well as domain-specific knowledge.

In this paper we propose a common data analysis pipeline for neuroimaging-based diagnostic classification problems using various ML algorithms and brain imaging toolboxes. The proposed pipeline consists of several stages, each of these stages can be executed several times in an iterative mode. Each stage, in turn, contains a number of different algorithms which can be split into levels, and each level consists of algorithms solving the same ML problem but based on different mathematical approaches. A composite algorithm, which performs (sequentially or in an iterative way) a data processing task using a particular set of algorithms from various stages and levels is called an algorithmic chain.

We illustrate the pipeline application by discovering new biomarkers for diagnostics of epilepsy and depression based on clinical and MRI/fMRI data for patients and healthy volunteers. This study was performed in collaboration with physicians from the Russian Scientific and Practical Psycho-Neurological Center named after Z.P. Solovyov (NPCPN, Skoltech's biomedical partner), which provided medical data and biomedical expertise.

The paper is organized as follows. Section 2 briefly describes the main properties of the processed MRI/fMRI data and provides some details of the medical diagnostic tasks; solutions of these tasks are considered later as examples. Section 3 describes the main structure of the proposed pipeline (stages and levels of each stage) and specifies the tasks solved at various levels. Section 4 describes the first, preprocessing stage, in which various data cleaning procedures, neuroimaging toolboxes and dimensionality reduction procedures are used to obtain domain-specific features, which are used as inputs for the further Exploratory ML/data analysis step. At this step, described in Section 5, various ML/data analysis techniques Vapnik ; Gareth et al. ; Hastie et al. ; Bishop ; Burges (2010); Sorzano et al. (2014); Bernstein and Kuleshov (2014) are applied to the neuroimaging data or features extracted from it to select important features providing high classification accuracy. In Section 6 conclusions are provided.

2 Processed MRI/fMRI data

2.1 Properties of processed MRI/fMRI data

Processed data consists of structural and functional MR images. The structural MRI protocols are aimed at providing the information concerning brain structure, enhancing required tissue patterns when using different acquisition modalities. Functional MRI (fMRI) scanning regime is based on measurement of the blood oxygen-level dependent contrast (BOLD) that is directly related to neuronal activity and thus reflects brain functioning.

Once acquired, MRI data should be cleaned to eliminate the noise associated with the scanning procedure (low-level hardware artefacts such as magnetic field inhomogeneity, radiofrequency noise, surface coil artefacts and others) and signal processing (chemical shift, partial volume, etc.); in addition, there are artefacts associated with the scanned patient (physiological noise such as blood flow, movements, etc.) Erasmus et al. (2004). Cleaning of MRI/fMRI signals is one of the tasks solved at the Preprocessing stage (see Section 3 below).

In addition to the MRI data cleaning problem, there is another common challenge in brain imaging analysis related to the high dimensionality of the data, which mostly depends on the resolution parameters of the scanner inductive detection coil. For instance, standard voxel sizes are within in case of structural imaging (resulting in voxels for the whole brain volume) and in a functional MRI series (resulting in voxels). Thus an MRI image composed of a huge number of small voxels has high spatial resolution and, hence, high dimensionality. To avoid the curse of dimensionality phenomenon, ML methods are usually applied to lower dimensional features extracted from the original scans by feature selection procedures. These procedures are also included in the Preprocessing stage.

2.2 Data used in illustrative examples

The data, provided by Skoltech’s biomedical partner (the NPCPN), consists of structural and functional MR images. The considered dataset contains structural MRI and resting state functional MRI images of patients: healthy volunteers and 25 patients with major depressive disorder in an acute depressive episode, as well as 25 epilepsy patients and 25 epilepsy patients with major depressive disorder. The dataset is enriched with clinical information including gender, age, disease duration, BDI (Beck Depression Inventory scaling) and other typical medical indicators. Some of the patients have temporal lobe epilepsy (TLE) with or without MRI evidence of a structural lesion (named the TLE MRI Positive/TLE MRI Negative groups).

In order to find functional biomarkers of depression and epilepsy we explored functional MRI EPI series ( voxels) repeated times with repetition time (TR) of seconds, and weighted MPR images ( voxels). The structural data was preprocessed in the FreeSurfer toolbox FreeSurfer (2018), resulting in a vector of morphological features with dimensionality . The functional data was preprocessed in the Nilearn toolbox Nilearn (2018), and functional connectivity graph-based features were retrieved using the Networkx library Networkx (2018), resulting in a vector of dimension .

We considered different diagnostic tasks, taking for each of them the data from different subgroups of patients as inputs to ML procedures:

  • Epilepsy (including patients with Depression) versus No Epilepsy (including patients with Depression) classification (EvsNE subjects);

  • Depression (including patients with Epilepsy) versus No Depression (including patients with Epilepsy) classification (DvsND subjects);

  • Epilepsy versus Healthy Control classification (EvsH subjects);

  • Depression versus Healthy Control classification (DvsH subjects);

  • Temporal Lobe Epilepsy (including patients with Depression) versus Healthy Control classification (TLEvsH subjects);

  • MRI Positive Temporal Lobe Epilepsy (including patients with Depression) versus Healthy Control classification (TLEPvsH subjects);

  • MRI Negative Temporal Lobe Epilepsy (including patients with Depression) versus Healthy Control classification (TLENvsH subjects);

  • Non Temporal Lobe Epilepsy (including patients with Depression) versus Healthy Control classification (NTLEvsH subjects);

and others.

3 Pipeline structure for neuroimaging-based Machine Learning diagnostics

Neuroimaging-based Machine Learning diagnostic task formulation consists of the following elements:

  • List of possible diagnostic inferences (diagnoses, hypotheses, etc.), which are tested for an individual/patient by analyzing his/her neuroimaging and clinical data. For example: a person has a specific disease, such as depression (D) or epilepsy (E), versus the person is healthy (H) (abbreviations DvsH and EvsH, respectively); or the presence versus absence of depression in epilepsy patients (abbreviation DEvsE);

  • Dataset with appropriate neuroimaging and clinical data, collected for target groups of subjects with known diagnostic inferences from the considered list (in case of examples, given above, this should be a group of patients with depression or epilepsy and a group of healthy volunteers; a group of patients with established depression and epilepsy diagnosis and a group of patients with epilepsy but without depression).

The task is to establish a diagnosis for previously unseen patient from his/her neuroimaging and clinical data. In Machine learning terms, this task is reduced to a supervised classification problem based on labeled data.

Proposed pipeline for the solution of this task consists of four stages, namely:

  • Preprocessing stage;

  • Exploratory Machine Learning/Data analysis stage;

  • Inference stage;

  • Quality assessment stage.

Each stage, in turn, contains a number of different algorithms which can be split into levels, and each level consists of algorithms solving the same ML problem but based on different mathematical approaches.

At the Preprocessing stage MRI/fMRI data is transformed into various representations which then will be used as inputs for chosen Machine Learning procedures. This stage has several goals:

  • neuroimaging data cleaning (denoising/noise reduction, removal of artefacts) using brain imaging software FreeSurfer (2018); SPM12 (2018); FSL (2018); Afni (2018); ArtRepair (2018); Behroozi and Daliri (2012); Bernstein et al. (2018) and other data analysis techniques Sharaev et al. (2018a);

  • transforming original high-dimensional data into biomedically motivated brain characteristics (clinically meaningful features) with lower dimensionality, such as vectors consisting of volumetric characteristics of chosen brain areas (Hippocampus, Lateralorbitofrontal, etc.), connectivity matrices, or a directed graph describing the brain connectome and preserving the directions of information transfer, also using brain imaging software Behroozi and Daliri (2012); Bernstein et al. (2018). We call such features a priori domain-specific features;

  • computing new mathematical characteristics of the constructed mathematical objects (vectors, matrices, graphs) which describe various clinically meaningful properties of these objects (for example, constructing directed flag complex from the directed graphs representing connectivity among brain areas and computing its persistent homology characteristics such as Betti numbers, Euler characteristic, etc. Buchstaber and T.E. (2015)). These topological features are now used in neuroimaging studies for discovering “deep” structure of the brain connectomes and are thought to be promising diagnostic biomarkers Bullmore and Sporns (2009); Reimann et al. (2015); Snasel et al. (2017); Garg (2017).

  • transforming original high-dimensional data or clinically meaningful features into their low-dimensional representations to avoid the curse of the dimensionality, by preserving clinically meaningful information using various feature selection/dimensionality reduction techniques Burges (2010); Sorzano et al. (2014); Bernstein and Kuleshov (2014); Mwangi et al. (2014); Thirion and Faugeras (2004); Shen and Meyer (2007).

The results of this stage are datasets consisting of objects such as vectors, matrices and graphs, with the common name “Machine Learning Input” (MLI) data. The details of this stage are given in Section 4 below.

In the Exploratory Machine Learning/Data analysis stage, various ML techniques are applied to the constructed MLI-datasets. The choice of the algorithm depends on the data structure: Support Vector Machine Classifier (SVC) Cortes and Vapnik (1995), Logistic Regression Classifier (LR) Hosmer and Lemeshow, Random Forest Classifier (RFC) Liaw and Wiener (2002), K Nearest Neighbors Classifier (KNN) Altman (1992), Extra Trees Classifier (ETC) Geurts et al. (2006), Neural Networks Erofeev and Burnaev (2016); Prikhod’ko and Burnaev (2013), including 3D deep convolutional neural networks Notchenko et al. (2018), as well as anomaly detection and imbalanced classification methods Smolyakov and Burnaev (2016); Smolyakov et al. ; Papanov et al. (2015) are applied to vectorized MLI-data; kernel-based classifiers Shawe-Taylor and Cristianini (2004), Wang et al. (2010) are applied to connectivity graphs with different graph kernels Ghosh et al. (2018).

Each of the performed Machine Learning experiments is defined by a chosen triplet (Classification task, MLI-datasets, Machine Learning algorithm). For example, the triplet (EvsH, DS name, RFC) means that dataset with specific name (see Section 4 for details) is used for establishing diagnosis Epilepsy using Random Forest Classifier algorithm.
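The triplet notation above can be sketched as a small data structure. This is only an illustration: the class name `MLExperiment` and the dataset names are hypothetical and do not come from the paper; the task and algorithm abbreviations are the ones used in the text.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class MLExperiment:
    """One experiment, identified by (classification task, MLI-dataset, algorithm)."""
    task: str       # e.g. "EvsH"
    dataset: str    # name of an MLI-dataset (hypothetical names below)
    algorithm: str  # e.g. "RFC"


tasks = ["EvsH", "DvsH"]
datasets = ["mri_morphometry", "fmri_graph_features"]  # hypothetical names
algorithms = ["SVC", "LR", "RFC", "KNN"]

# Enumerate every combination of task, dataset and algorithm.
experiments = [MLExperiment(t, d, a) for t, d, a in product(tasks, datasets, algorithms)]
```

Because the dataclass is frozen, each triplet is hashable and can serve as a key when results of different experiments are collected.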

Each of the used algorithms is defined by a number of “free parameters” and their “optimal” values are determined during experiments. Common multiple-fold cross-validation technique Lachenbruch and Mickey (1968) is usually used for this purpose.

The result of a performed Machine Learning experiment is the constructed classifier (with tuned parameters) and its quality characteristics estimated using a final cross-validation procedure (for example, leave-one-out cross-validation Wong (2015)). If possible, we also extract clinically meaningful features (called a posteriori task-specific features), based on which the classifier makes its decision.
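The combination of parameter tuning by multiple-fold cross-validation and a final leave-one-out quality estimate could look roughly as follows in scikit-learn. This is a minimal sketch on synthetic data; the classifier, the parameter grid and the number of folds are assumptions made for illustration, not the settings used in the study.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import (GridSearchCV, LeaveOneOut,
                                     StratifiedKFold, cross_val_predict)
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a small vectorized MLI-dataset.
X, y = make_classification(n_samples=40, n_features=20, n_informative=5,
                           random_state=0)

# Inner loop: tune the "free parameters" with stratified 3-fold CV.
inner_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
model = make_pipeline(StandardScaler(), SVC())
param_grid = {"svc__C": [0.1, 1.0, 10.0], "svc__kernel": ["linear", "rbf"]}
search = GridSearchCV(model, param_grid, cv=inner_cv)

# Outer loop: leave-one-out estimate of the tuned classifier's quality.
# The grid search is re-run inside every outer split, so the held-out
# subject never influences the parameter choice.
y_pred = cross_val_predict(search, X, y, cv=LeaveOneOut())
loo_accuracy = float(np.mean(y_pred == y))
```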

In addition, subject-oriented classification results are collected from all performed experiments and are saved in a specific Personal Classification Quality (PCQ) table. The table includes anonymized information about all subjects whose neuroimaging and clinical data is used in the study. Each row of this table corresponds to a particular subject; the columns of the table are split into groups. The zero group contains personal information about individuals (their personal identifiers, IDs) and, if it is convenient for subsequent analysis of the table, a part of their clinical data, for example the clinical status CSID, considered as labels in classification tasks. Each other group corresponds to the results of a particular performed ML experiment (MLE). The sub-table defined by an MLE contains the following elements: a symbol indicating whether the data of the subject ID is used in the experiment or not; the number of cross-validation (CV) experiments in which the subject “participated”; and the numbers of cross-validation experiments (among them) in which a particular classifier makes each possible decision.

Statistical analysis of this table allows making various conclusions. For example, let us assume that the clinical status is (healthy). Then averaged frequencies


where takes values and , are equal to True Positive (TP) and False Positive (FP) rates of the used classifier when and , respectively.

Also for the considered classifier we can perform statistical analysis of the set , consisting of frequencies


of TP decisions for individuals with clinical status , whose data is used in the experiment MLE. Here is the number of such individuals. These characteristics make it possible to understand if the classifier works “statistically equally” for all such individuals or not.
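To make the PCQ-based statistics more concrete, the sketch below computes per-subject decision frequencies and their average over subjects with a given clinical status. The row layout, field names and numbers are entirely hypothetical; they only mirror the elements of the sub-table listed above (usage flag, number of CV experiments, decision counts).

```python
from statistics import mean

# Hypothetical PCQ rows for one experiment: subject ID, clinical status,
# usage flag, number of CV experiments the subject took part in, and how
# many times the classifier returned each decision for that subject.
pcq_rows = [
    {"id": "S01", "status": "H", "used": 1, "n_cv": 10, "decisions": {"H": 9, "E": 1}},
    {"id": "S02", "status": "H", "used": 1, "n_cv": 10, "decisions": {"H": 7, "E": 3}},
    {"id": "S03", "status": "E", "used": 1, "n_cv": 10, "decisions": {"H": 2, "E": 8}},
    {"id": "S04", "status": "E", "used": 0, "n_cv": 0, "decisions": {"H": 0, "E": 0}},
]


def personal_frequencies(rows, status, decision):
    """Per-subject frequency of `decision` among subjects with `status`."""
    return [
        row["decisions"][decision] / row["n_cv"]
        for row in rows
        if row["used"] and row["status"] == status and row["n_cv"] > 0
    ]


def averaged_frequency(rows, status, decision):
    """Average of the personal frequencies over the selected subjects."""
    return mean(personal_frequencies(rows, status, decision))


# How often healthy subjects are classified as healthy, on average.
healthy_as_healthy = averaged_frequency(pcq_rows, "H", "H")
# Spread of the personal frequencies shows whether the classifier
# behaves similarly for all subjects with the same status.
healthy_personal = personal_frequencies(pcq_rows, "H", "H")
```

Subjects that were not used in the experiment are skipped, so they do not distort the averages.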

If based on (2) we make a positive conclusion about statistically comparable results, then we can use the dataset to estimate accuracy of the computed TP and FP rates (1) for the constructed classifier, as well as construct prediction regions for these rates for new individuals using conformal prediction framework Vovk et al. (2005); Glenn and Vovk (2008); Harris et al. (2011); Vovk and Burnaev (2014); Nazarov and Burnaev (2016).

If based on (2) we make a negative conclusion, then the individuals can be split into clusters with “approximately equal” personal qualities of classification. In the inference stage this allows

  • to find possible dependencies between personal quality of classification (for a considered classifier) of the individual and his/her clinical data,

  • to construct ensembles of classifiers (if the clusters for different classifiers differ between themselves) using personal clinical data of an individual as additional input parameters when the ensemble is applied to calculate predictions for this individual.

The details of this stage are given in Section 5.

The Inference stage takes as inputs the constructed classifiers, the discovered a posteriori task-specific clinically meaningful features (determined by specific classifiers), and the Personal Classification Quality (PCQ) table containing the results of all performed MLE.

In this stage, final composite classifiers for a specific classification task are constructed using

  • either known Machine Learning approaches (e.g. ensembles of selected “good” classifiers, constructed for the same task in the previous stage with taking into account results of statistical analysis of the PCQ-table),

  • or by performing new MLE with input features, selected among discovered task-specific features, for both the considered task and other “clinically related” tasks.
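The first option, an ensemble of previously selected classifiers, might be sketched with scikit-learn's `VotingClassifier`. The choice of base classifiers, soft voting and the synthetic data are assumptions for illustration; the paper does not state which ensembling scheme was used.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for a vectorized MLI-dataset.
X, y = make_classification(n_samples=60, n_features=15, n_informative=5,
                           random_state=0)

# Combine three classifiers by averaging their predicted probabilities.
ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("rfc", RandomForestClassifier(n_estimators=50, random_state=0)),
        ("knn", KNeighborsClassifier(n_neighbors=5)),
    ],
    voting="soft",
)

# 5-fold cross-validated accuracy of the ensemble.
scores = cross_val_score(ensemble, X, y, cv=5)
```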

For example, in case of the EvsDE classification task (diagnostics of depression for patients with epilepsy), task-specific features (including features from “clinically related” classification tasks) can be selected among features of

  • EvsDE classification task (based on MRI-data),

  • EvsH classification task (based on MRI-data),

  • DvsH classification task (based on fMRI-data),

and can be used as a set of new “combined” features to improve solution of the EvsDE classification task.

Classification results, corresponding to each particular subject, are saved in the PCQ table and can be used to estimate classification quality and to choose the most accurate classifier.

Note that using the same data in multiple successively executed steps can lead to over-fitting and therefore requires both a stratified division of samples into training/validation and testing sub-samples in cross-validation procedures and multiple cross-checks.
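One common way this kind of over-fitting arises is when features are selected on the full dataset before cross-validation. The sketch below contrasts that with selection performed inside each training fold. The data is pure noise with arbitrary labels, so any accuracy well above chance in the first variant is an artefact of reusing the data; all sizes and parameters are assumptions for illustration.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2000))  # noise features only
y = np.array([0, 1] * 25)        # labels unrelated to X

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Leaky variant: features are chosen using all labels, test folds included.
X_leaky = SelectKBest(f_classif, k=20).fit_transform(X, y)
leaky_acc = cross_val_score(LogisticRegression(max_iter=1000),
                            X_leaky, y, cv=cv).mean()

# Honest variant: selection is refitted on the training part of each fold.
honest_model = make_pipeline(SelectKBest(f_classif, k=20),
                             LogisticRegression(max_iter=1000))
honest_acc = cross_val_score(honest_model, X, y, cv=cv).mean()
```

Wrapping every data-dependent step in a single pipeline is the simplest guard, since cross-validation then refits all steps on the training folds only.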

The MLE performed with such combined features showed this approach to be promising in discovering neuroimaging-based biomarkers in neurology and psychiatry, see details in Section 5.

4 Preprocessing stage

Preprocessing stage has two main goals: MRI/fMRI data cleaning and avoiding the curse of the dimensionality phenomenon caused by high dimensionality of initial MRI/fMRI data. The latter goal can be achieved by constructing lower dimensional biomedically significant brain characteristics from the initial data.

4.1 MRI/fMRI data cleaning

Data cleaning includes procedures for denoising 3D MRI images and removing artefacts (caused by various reasons) from 4D fMRI signals.

MRI data cleaning. An artefact is a feature appearing in an image that is not present in the original object. Depending on their origin, artefacts are typically classified as patient-related (motion, blood flow), signal processing dependent (chemical shift, partial volume) and hardware-related (magnetic field inhomogeneity, radiofrequency noise, surface coil artefacts and others) Erasmus et al. (2004).

Some denoising procedures are performed in MRI scanner using specialized software, installed on the scanner, e.g. see Siemens BLADE and 3D PACE procedures Hirokawa et al. (2008); Pipe (1999). These methods could slightly vary from one manufacturer to another and in different software versions. Among them there are approaches to motion correction Pipe (1999), field inhomogeneity correction Simmons et al. (1994); Vovk et al. (2007) and phase error correction Sven et al. (2013).

The obtained images could be further preprocessed using neuroimaging software (FreeSurfer (2018), SPM12 (2018), FSL (2018), and other brain imagery processing software toolboxes Behroozi and Daliri (2012); Bernstein et al. (2018)) in order to perform other types of correction Klein et al. (2009), increase signal-to-noise ratio and exclude data artefacts, see, for example, Salimi-Khorshidi et al. (2014); Sladky et al. (2011); Thirion et al. (2006).

fMRI data cleaning. fMRI data is represented as a sequence of weighted (see Section 2) images with lower spatial resolution than structural MRI, usually sampled every seconds. These images should also be preprocessed in order to exclude different sources of noise/artefacts, both in the scanner during acquisition (to remove low-level hardware artefacts) and after scanning in neuroimaging software (SPM12 (2018), FSL (2018), Afni (2018), ArtRepair (2018) and other brain imaging toolboxes Behroozi and Daliri (2012); Bernstein et al. (2018)).

Initial fMRI data has a complex multidimensional spatiotemporal structure and consists of recorded multidimensional time series, each component of which characterizes brain activity associated with blood flow (hemodynamic response) related to energy consumption by active cell clusters at a specific brain voxel Huettel et al. . These measurements contain not only brain activity but also noise caused by various artefacts from physiological (cardiac and respiratory) and non-physiological (movement, scanning artefacts, etc.) sources. In order to remove noise, Independent Component Analysis (ICA) Hyvarinen (1999) is used; the extracted components are ordered according to the amount of explained variance, and many of the first components will not contain the signal of interest. Often there are more noise components than signal components, for example, of noise components Smith et al. (2013) and of noise components Griffanti et al. (2014) for multiband sequences; for an extended discussion see Sharaev et al. (2018a).
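The idea of ICA-based denoising can be sketched on synthetic signals: decompose the observations into independent components, zero out the component identified as noise, and reconstruct. In this toy example the noise component is found by correlation with a known artefact, which is only possible because the data is simulated; with real fMRI data the components are labelled by an expert or a trained classifier. All signal shapes and parameters are assumptions.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
t = np.linspace(0, 10, 500)
signal = np.sin(2 * np.pi * 0.3 * t)              # simulated signal of interest
artefact = np.sign(np.sin(2 * np.pi * 1.2 * t))   # simulated periodic artefact
sources = np.column_stack([signal, artefact])     # shape (500, 2)

# Mix the two sources into six observed channels and add sensor noise.
mixing = rng.normal(size=(2, 6))
observed = sources @ mixing + 0.01 * rng.normal(size=(500, 6))

ica = FastICA(n_components=2, random_state=0)
components = ica.fit_transform(observed)          # shape (500, 2)

# Pick the component most correlated with the known artefact
# (feasible here only because the artefact was simulated).
corr = [abs(np.corrcoef(components[:, i], artefact)[0, 1]) for i in range(2)]
noise_idx = int(np.argmax(corr))

# Remove that component and map back to the observation space.
cleaned_components = components.copy()
cleaned_components[:, noise_idx] = 0.0
cleaned = ica.inverse_transform(cleaned_components)
```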

4.2 Constructing the subject-oriented (a priori domain-specific) features

The goal of this sub-stage is to extract informative features (biomedically significant brain characteristics, clinically meaningful features) with lower dimensionality. The approach is typically realized in several steps:

  • selection of an appropriate brain atlas Jean and Tournoux (1988); Maldjian and Laurienti (2003); Klein (2009); Xu et al. (2017); Atlas ; Wiki which splits the brain into anatomical areas (e.g. hippocampi, cortical areas, etc.),

  • segmentation of 3D MRI/4D fMRI images into disjoint sets (sub-images) consisting of voxels corresponding to different brain regions (Regions of Interest, ROIs),

  • calculation of various characteristics for each ROI or of the interaction (connectivity) between ROIs.
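The segmentation and per-ROI calculation steps can be sketched with plain NumPy: given a 4D fMRI array and a 3D atlas of integer labels on the same grid, the mean time series of each ROI is computed. The function name, the convention that label 0 marks background, and the toy atlas are assumptions; real pipelines would use a neuroimaging toolbox for this.

```python
import numpy as np


def roi_mean_time_series(fmri, atlas):
    """Average the voxel time series inside every atlas region.

    fmri  : array of shape (X, Y, Z, T)
    atlas : integer array of shape (X, Y, Z); 0 is treated as background
    Returns (labels, series) where series has shape (T, n_rois).
    """
    if fmri.shape[:3] != atlas.shape:
        raise ValueError("fMRI volume and atlas must share the same grid")
    labels = np.unique(atlas)
    labels = labels[labels != 0]
    # fmri[atlas == lab] has shape (n_voxels_in_roi, T).
    series = np.column_stack([fmri[atlas == lab].mean(axis=0) for lab in labels])
    return labels, series


rng = np.random.default_rng(0)
fmri = rng.normal(size=(4, 4, 4, 20))  # toy 4D volume with 20 time points
atlas = np.zeros((4, 4, 4), dtype=int)
atlas[:2] = 1                          # toy ROI 1
atlas[2:, :2] = 2                      # toy ROI 2; the rest is background

labels, series = roi_mean_time_series(fmri, atlas)
```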

Examples of such characteristics:

  • structural morphometric parameters (volumes, thicknesses, curvatures) of the selected anatomical areas from the MRI-image, which together form a volumetric vector. For example, the MRI processing toolbox FreeSurfer (2018) parcels MRI images into regions corresponding to the chosen Desikan-Killiany atlas and calculates volumetric characteristics for each subcortical region (NVoxels, Volume_mm3, normMean, normStdDev, normMin, normMax, normRange) and geometric characteristics of cortical regions (NumVert, SurfArea, GrayVol, ThickAvg, ThickStd, MeanCurv, GausCurv, FoldInd, CurvInd);

  • functional connectivity parameters (see CONN (2018); Nilearn (2018)), which describe interactions between various functional areas and are based on measures of dependency such as Pearson correlation (or spectral coherence, mutual information) between the time series of fMRI signals, which measure brain activity in chosen voxels from the considered areas during resting-state fMRI. These parameters are described by symmetric functional connectivity matrices (or undirected graphs). Functional connectivity graphs are then analyzed with special Python software libraries such as Networkx (2018). Thus, the functional connectivity of each ROI can be represented via several basic graph features (clustering coefficient, local/global efficiency, degree/closeness/betweenness centrality, average neighbor degree, etc.);

  • effective connectivity parameters (under causation concept) describing “the influence one neural system exerts over another either directly or indirectly” Friston et al. (2003). This explicitly means that all links between brain areas have some direction, thus the brain connectome could be considered as a directed graph representing connectivity among neurons within the network and the information about the direction of information transfer is preserved. Methods of assessing effective connectivity are being developed nowadays, among them are model-based approaches, like dynamic causal modelling (DCM) Friston et al. (2003); Sharaev et al. (2016a); Ushakov et al. (2016) and model-free approaches based on information theory Montalto et al. (2014); Sharaev et al. (2016b, 2018b).
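For the functional connectivity case, the path from ROI time series to graph features can be sketched as follows: a Pearson correlation matrix is thresholded into an undirected graph, and basic node features are computed with NetworkX. The synthetic time series, the threshold value and the particular set of features are assumptions for illustration.

```python
import numpy as np
import networkx as nx

rng = np.random.default_rng(0)
n_timepoints, n_rois = 120, 8

# Synthetic ROI time series with a shared component, so that ROIs correlate.
shared = rng.normal(size=(n_timepoints, 1))
roi_series = 0.6 * shared + rng.normal(size=(n_timepoints, n_rois))

# Symmetric functional connectivity matrix (Pearson correlation).
conn = np.corrcoef(roi_series, rowvar=False)
np.fill_diagonal(conn, 0.0)

# Keep only sufficiently strong links; the threshold is a hypothetical choice.
threshold = 0.2
adjacency = (np.abs(conn) > threshold).astype(int)
graph = nx.from_numpy_array(adjacency)

# Basic per-node graph features named in the text.
node_features = {
    "clustering": nx.clustering(graph),
    "degree_centrality": nx.degree_centrality(graph),
    "closeness_centrality": nx.closeness_centrality(graph),
    "betweenness_centrality": nx.betweenness_centrality(graph),
    "average_neighbor_degree": nx.average_neighbor_degree(graph),
}

# One flat vector per subject: all node features plus one global feature.
feature_vector = np.array(
    [node_features[name][node] for name in node_features for node in graph.nodes]
    + [nx.global_efficiency(graph)]
)
```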

For constructed objects (brain areas, symmetric connectivity matrices/undirected graphs, causal directed graphs) different characteristics reflecting meaningful properties of these objects, can be computed for further use in Machine learning studies:

  • segments of MRI-image consisting of 3D MRI-voxels from chosen brain areas (to be used as inputs for deep learning procedures Suk et al. (2016); Plis et al. (2014); Ravi et al. (2017));

  • various vector characteristics of undirected graphs (see Bctnet) with components describing various graph properties such as global/local node efficiency, cost, betweenness centrality, etc. Wang et al. (2010);

  • vectors consisting of persistent homology characteristics (such as Betti numbers, Euler characteristics, etc.) of directed flag complex Edelsbrunner et al. (2002); Zomorodian and Carlsson (2005); Edelsbrunner and Harer (2008), which are computed from the directed connectivity graphs (such characteristics are used in analysis of brain connectomes Bullmore and Sporns (2009); Reimann et al. (2015); Snasel et al. (2017); Garg (2017)).
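As a much simplified illustration of such topological characteristics, the sketch below computes the Euler characteristic of the clique (flag) complex of an undirected graph, together with its number of connected components (the zeroth Betti number). The text refers to directed flag complexes and persistent homology, which require dedicated software; the undirected, non-persistent version here is a deliberate simplification.

```python
import networkx as nx


def clique_complex_euler_characteristic(graph):
    """Alternating sum of clique counts: vertices - edges + triangles - ..."""
    chi = 0
    for clique in nx.enumerate_all_cliques(graph):
        # A clique on k vertices is a simplex of dimension k - 1.
        chi += (-1) ** (len(clique) - 1)
    return chi


def zeroth_betti_number(graph):
    """Number of connected components of the graph."""
    return nx.number_connected_components(graph)


cycle = nx.cycle_graph(4)        # 4 vertices, 4 edges, no triangles
triangle = nx.complete_graph(3)  # 3 vertices, 3 edges, 1 triangle

chi_cycle = clique_complex_euler_characteristic(cycle)
chi_triangle = clique_complex_euler_characteristic(triangle)
```

The 4-cycle encloses a hole and the filled triangle does not, which is why their Euler characteristics differ even though both graphs are connected.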

Most often domain-specific lower dimensional features (morphometric or functional connectivity features) could be extracted from original data in specialized MRI processing toolboxes Behroozi and Daliri (2012); Bernstein et al. (2018).

4.3 Low-dimensional representations of domain-specific features

Although the dimensionality of constructed domain-specific features and corresponding characteristics can be low in comparison with the initial data, it can nevertheless be rather high. For example, volumetric vector, computed by toolbox FreeSurfer (2018), provides components. The size of connectivity matrix, computed by toolboxes CONN (2018); Nilearn (2018) is (in accordance with the chosen brain atlas); for each node graph characteristics (measures of nodes centrality, local efficiency and others) and two “global” graph characteristics (characteristic path length and global efficiency) are computed producing a vector of dimensionality Wang et al. (2017).

Luckily, these data, like most real-world high-dimensional data obtained from “natural” sources (including MRI and fMRI data), do not fill the whole full-dimensional space: due to dependencies between their components and various constraints on their values, they occupy only a very small domain with a smaller intrinsic dimension. Thus, such high-dimensional data can be transformed into lower-dimensional representations (or features) using various Feature extraction/Dimensionality reduction algorithms Sorzano et al. (2014); Bernstein and Kuleshov (2014).

If the data is concentrated near a low-dimensional affine subspace, various linear methods can be used, such as Principal Component Analysis (PCA) Jolliffe (2002), Independent Component Analysis (ICA) Hyvarinen (1999), Projection Pursuit Friedman and Tukey (1974), etc. But in many cases the “low-dimensional area” is essentially nonlinear and requires advanced nonlinear Feature extraction/Dimensionality reduction algorithms. The most popular model of high-dimensional data that occupy a small part of the observation space is the Manifold model, according to which the data is located near an unknown Data manifold of lower dimension embedded in an ambient high-dimensional input space Vapnik ; this manifold model can effectively represent the brain anatomy as well Gareth et al. . Dimensionality reduction algorithms under this model, called Manifold learning Bishop, are widely used for medical data preprocessing including MRI/fMRI data Liu et al. (2017); Shen and Meyer (2008).
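The linear case can be illustrated numerically: for data lying near a low-dimensional subspace, the number of principal components needed to explain almost all variance recovers the dimension of that subspace. The sketch uses synthetic data and a 99% variance threshold, both of which are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, ambient_dim, latent_dim = 200, 50, 3

# Points near a 3-dimensional linear subspace of a 50-dimensional space.
latent = rng.normal(size=(n_samples, latent_dim))
embedding = rng.normal(size=(latent_dim, ambient_dim))
X = latent @ embedding + 0.01 * rng.normal(size=(n_samples, ambient_dim))

# PCA via singular value decomposition of the centered data.
X_centered = X - X.mean(axis=0)
singular_values = np.linalg.svd(X_centered, compute_uv=False)
explained = singular_values**2 / np.sum(singular_values**2)
cumulative = np.cumsum(explained)

# Smallest number of components explaining at least 99% of the variance.
n_components = int(np.searchsorted(cumulative, 0.99) + 1)
```

For data near a curved manifold the same count would overestimate the intrinsic dimension, which is the motivation for the nonlinear methods mentioned above.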

5 Machine Learning/data analysis pipeline

Figure 1: Flow diagram illustrating the classification pipeline
Figure 2: Inner loop recursive diagram illustrating the classifier hyper-parameters grid search

According to the proposed pipeline schema (see figure 1 for details), the MRI data were cleaned, preprocessed and their features were extracted using MRI processing toolboxes Behroozi and Daliri (2012); Bernstein et al. (2018).

Structural morphometric features were calculated from images using FreeSurfer (2018); for more than brain regions corresponding features explaining brain structure (volumes, surface areas, thicknesses, etc.) were computed producing a vector with features for each subject.

Functional connectivity matrices were calculated from EPI MRI sequences using the Nilearn (2018) toolbox. Each functional connectivity matrix was treated as a graph whose nodes are the corresponding regions of interest (ROI); for each node, basic graph features (local/global efficiency, betweenness centrality, etc.) were calculated, thus producing a vector with features for each subject.
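The step from a connectivity matrix to a graph-based feature vector can be sketched as follows. This is only an illustration under stated assumptions: the ROI time series are random numbers, the correlation matrix is computed with NumPy as a stand-in for the toolbox output, and the binarization threshold, the number of ROIs and the particular set of node measures are hypothetical choices, not the settings used in the study.

```python
import numpy as np
import networkx as nx

rng = np.random.default_rng(0)
n_rois, n_timepoints = 10, 120  # hypothetical sizes

# Synthetic ROI time series standing in for preprocessed fMRI signals.
time_series = rng.normal(size=(n_timepoints, n_rois))

# Functional connectivity as the Pearson correlation between ROI signals.
connectivity = np.corrcoef(time_series, rowvar=False)
np.fill_diagonal(connectivity, 0.0)

# Binarize with a hypothetical threshold to obtain an undirected graph over ROIs.
threshold = 0.1
adjacency = (np.abs(connectivity) > threshold).astype(int)
graph = nx.from_numpy_array(adjacency)

# Node-level measures: betweenness centrality, degree centrality, local efficiency.
betweenness = nx.betweenness_centrality(graph)
degree = nx.degree_centrality(graph)
local_eff = {
    v: nx.global_efficiency(graph.subgraph(list(graph.neighbors(v))))
    for v in graph
}

node_features = np.array(
    [[betweenness[v], degree[v], local_eff[v]] for v in range(n_rois)]
)

# One "global" measure appended to the flattened node-level measures.
global_eff = nx.global_efficiency(graph)
feature_vector = np.concatenate([node_features.ravel(), [global_eff]])

print("feature vector length:", feature_vector.shape[0])
```

With three measures per node and one global measure, the vector has 3 × 10 + 1 = 31 components; the dimensionality grows linearly with the number of ROIs in the chosen atlas.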

These datasets were investigated separately in order to evaluate the informative content of each dataset. The Machine learning exploratory pipeline was implemented in IPython using the Sklearn library and organized as follows (see figure 2 for details):

  • We considered two geometrical methods for dimensionality reduction: 1) Locally Linear embedding; 2) Principal Component Analysis.

  • We considered two methods of feature selection: 1) Feature selection with SelectKBest() function, based on Pearson’s chi-squared test and ANOVA scoring; 2) Selection of relevant features based on a particular classification model via the Sklearn function SelectFromModel(), used with Logistic Regression (LR), K-Nearest Neighbors (KNN) and Random Forest Classifier (RFC).

  • We performed grid search for a number of selected features in the set and for a number of components in dimension reduction procedure in the set .
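The choices listed above can be sketched as a single Sklearn pipeline whose reduction step is swapped during grid search. The data set is synthetic and the grid values, the neighbourhood size for Locally Linear Embedding and the final Logistic Regression classifier are assumptions made for illustration only. ANOVA scoring (`f_classif`) is used for `SelectKBest()` here because the chi-squared score requires non-negative features, which standardized data does not satisfy.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectFromModel, SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.manifold import LocallyLinearEmbedding
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a subjects-by-features table (50 subjects, 100 features).
X, y = make_classification(
    n_samples=50, n_features=100, n_informative=10, random_state=0
)

pipe = Pipeline([
    ("scale", StandardScaler()),           # standardization as a simple stand-in for whitening
    ("reduce", SelectKBest(f_classif)),    # placeholder, replaced by the grid below
    ("clf", LogisticRegression(max_iter=1000)),
])

# Hypothetical grids: each dictionary swaps in one reduction method.
param_grid = [
    {"reduce": [SelectKBest(f_classif)], "reduce__k": [10, 20, 50]},
    {
        "reduce": [SelectFromModel(LogisticRegression(max_iter=1000),
                                   threshold=-np.inf)],
        "reduce__max_features": [10, 20],
    },
    {"reduce": [PCA()], "reduce__n_components": [5, 10, 15]},
    {
        "reduce": [LocallyLinearEmbedding(n_neighbors=10, random_state=0)],
        "reduce__n_components": [5, 10],
    },
]

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(pipe, param_grid, cv=cv, scoring="roc_auc")
search.fit(X, y)

print("best reduction step:", type(search.best_params_["reduce"]).__name__)
print(f"best cross-validated ROC AUC: {search.best_score_:.3f}")
```

Keeping the reduction step inside the pipeline means it is refitted on the training part of every fold, so the number of selected features or components is compared on equal terms across methods.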

Data was whitened before training. Feature reduction was performed without double-dipping Mwangi et al. (2014): training and testing datasets were separated before feature selection/dimensionality reduction. Hyper-parameters grid search was based on cross-validation with stratification, repeated times for each person being in test.
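One common way to avoid double-dipping is nested cross-validation, sketched below under stated assumptions: the data is synthetic, and the fold counts, the number of repetitions and the grid of `k` values are hypothetical. Scaling, feature selection and the classifier all sit inside one pipeline, so every step is fitted only on the training part of each outer split.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import (
    GridSearchCV,
    RepeatedStratifiedKFold,
    StratifiedKFold,
    cross_val_score,
)
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a small balanced cohort (50 subjects, 60 features).
X, y = make_classification(
    n_samples=50, n_features=60, n_informative=8, random_state=1
)

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("select", SelectKBest(f_classif)),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Inner loop: hyper-parameter search using training data only.
inner_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
search = GridSearchCV(pipe, {"select__k": [5, 10, 20]}, cv=inner_cv,
                      scoring="roc_auc")

# Outer loop: repeated stratified splits give an estimate of performance
# on subjects that took no part in feature selection or tuning.
outer_cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=3, random_state=0)
outer_scores = cross_val_score(search, X, y, cv=outer_cv, scoring="roc_auc")

print(f"outer ROC AUC: {outer_scores.mean():.3f} +/- {outer_scores.std():.3f}")
```

If feature selection were instead run once on the full data set before splitting, the test subjects would influence which features are kept, and the reported accuracy would be optimistically biased.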

5.1 Classification of the Epilepsy

In table 1 we provide results for EvsH classification task.

False Positive Rate | True Positive Rate, functional connectivity graph based features (fMRI) | True Positive Rate, brain morphometry features (MRI)
10% | 16% | 35%
15% | 24% | 44%
20% | 36% | 55%
30% | 52% | 71%
Table 1: Classification of Epilepsy versus Healthy Control (EvsH, persons) based on the structural and functional MRI features
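Tables of this form report points of the ROC curve: the true positive rate (sensitivity) reached at a fixed false positive rate (one minus specificity). A minimal sketch of how such operating points can be read off classifier scores is given below; the helper `tpr_at_fpr` is a hypothetical name introduced here, and the labels and scores are synthetic, so the printed numbers are unrelated to the results in the tables.

```python
import numpy as np
from sklearn.metrics import roc_curve


def tpr_at_fpr(y_true, scores, fpr_levels=(0.10, 0.15, 0.20, 0.30)):
    """Return the highest TPR achievable with FPR not exceeding each level."""
    fpr, tpr, _ = roc_curve(y_true, scores)
    # roc_curve always includes the point (0, 0), so the selection is never empty.
    return {level: float(tpr[fpr <= level].max()) for level in fpr_levels}


# Synthetic example: 25 controls (label 0) and 25 patients (label 1).
rng = np.random.default_rng(0)
y_true = np.repeat([0, 1], 25)
scores = y_true + rng.normal(scale=1.0, size=y_true.size)

operating_points = tpr_at_fpr(y_true, scores)
for level, tpr_value in operating_points.items():
    print(f"FPR <= {level:.0%}: TPR = {tpr_value:.0%}")
```

Because the empirical ROC curve is a step function, the sketch takes the best achievable point at or below each false positive rate rather than interpolating between thresholds.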

We obtain the most accurate results when using MRI structural features. Now let us consider classification of patients with temporal lobe epilepsy (TLE) with and without magnetic resonance imaging (MRI) evidence for structural lesion (TLE MRI Positive/ TLE MRI Negative). In table 2 we provide results for classification of MRI positive TLE versus Healthy Control (TLEvsHC, persons).

False Positive Rate | True Positive Rate, Epilepsy versus Healthy Control (EvsH, 25/25 person) | True Positive Rate, Temporal Lobe Epilepsy versus Healthy Control (TLEvsHC, 30/25 persons)
10% | 35% | 50%
15% | 44% | 57%
20% | 55% | 87%
30% | 71% | 87%
Table 2: Classification of MRI positive TLE versus Healthy Control using structural MRI

In table 3 we consider results of classification of MRI Negative TLE versus Healthy Control (TLENvsH, persons).

False Positive Rate | True Positive Rate, MRI Positive TLE versus Healthy Control (TLEPvsH, person) | True Positive Rate, MRI Negative TLE versus Healthy Control (TLENvsH, person)
10% | 12% | 93%
15% | 50% | 93%
20% | 63% | 93%
30% | 75% | 93%
Table 3: Classification of MRI negative TLE versus Healthy Control using structural MRI

Thus, dividing the TLE group into MRI Positive and MRI Negative subsets shows that MRI Negative TLE classification achieves considerably higher sensitivity and specificity than MRI Positive TLE classification. Further investigation of the extracted features can shed light on the differences between the subsets and explain these findings.

The most important features for MRI Positive classification are Right Cerebellum, Precuneus, Left Accumbens and Right Putamen. The most important features for MRI Negative TLE are Right and Left Amygdala, Frontal pole, Insula, Left Cerebellum, Parsorbitalis and Isthmus cingulate. Thus, the best classifier was constructed for TLE Negative Epilepsy classification. Its sensitivity is equal to and specificity is equal to .

5.2 Classification of Depression

Several papers indicate that depression is associated with recognizable patterns in brain structure Wise and et al. (2016); Kim and Na (2018). In table 4 we provide results of depression classification using either structural MRI or functional MRI (fMRI) data. At false positive rates of 15% and above, fMRI data yields a higher true positive rate, so it provides more informative biomarkers of depressive disorders.

False Positive Rate | True Positive Rate, brain morphometry features (MRI) | True Positive Rate, functional connectivity graph based features (fMRI)
10% | 22% | 12%
15% | 32% | 40%
20% | 47% | 64%
30% | 65% | 80%
Table 4: Classification of Depression versus Healthy Control (DvsH, 25/25 person)

The most important features for fMRI-based depression classification are Left Caudate, Left Temporal Pole, Right Insula and Right Superior Occipital gyrus. The best accuracy, achieved for depression classification, has sensitivity and specificity.

6 Conclusions

In this paper we proposed a data analysis pipeline for processing MRI/fMRI data and for diagnostic classification on its basis. We verified the pipeline by identifying biomarkers relevant for detection of epilepsy and depression with sufficiently high accuracy. Future research will develop non-parametric algorithms for classification quality assessment (accuracy evaluation) based on the conformal prediction framework and will consider various topological features of MRI/fMRI data as biomarkers.

This study was performed in the scope of the Project “Machine Learning and Pattern Recognition for the development of diagnostic and clinical prognostic prediction tools in psychiatry, borderline mental disorders, and neurology” (a part of the Skoltech Biomedical Initiative program).

References


  • Afni (2018) Afni. Afni toolbox. 2018.
  • Altman (1992) N. S. Altman. An introduction to kernel and nearest-neighbor nonparametric regression. The American Statistician, 46(3):175–185, 1992. ISSN 00031305.
  • ArtRepair (2018) ArtRepair. ArtRepair software. Center for Interdisciplinary Brain Sciences Research, Stanford Medicine. 2018.
  • (4) Harvard-Oxford Atlas. Harvard-Oxford atlas – Brede wiki.
  • (5) Bctnet. Brain connectivity toolbox.
  • Behroozi and Daliri (2012) M. Behroozi and M.R. Daliri. Software tools for the analysis of functional magnetic resonance imaging. Basic Clin. Neurosci., 3(5):71–83, 2012.
  • Bernstein and Kuleshov (2014) A. Bernstein and A. Kuleshov. Low-dimensional data representation in data analysis. In N. El Gayar, F. Schwenker, and C. Suen, editors, Artificial Neural Networks in Pattern Recognition (ANNPR-2014). Lecture Notes in Computer Science, volume 8774, pages 47–58. Springer, 2014.
  • Bernstein et al. (2018) A. Bernstein, R. Akzhigitov, E. Kondrateva, S. Sushchinskaya, I. Samotaeva, and V. Gaskin. MRI brain imagery processing software in data analysis. In Petra Perner, editor, Advances in Mass Data Analysis of Images and Signals in Medicine, Biotechnology, Chemistry and Food Industry. Proceedings of 13th International Conference on Mass Data Analysis of Images and Signals (MDA 2018). Springer, 2018.
  • (9) C.M. Bishop. Pattern Recognition and Machine Learning. Heidelberg, Springer.
  • Bruijne (2016) M. de Bruijne. Machine learning approaches in medical image analysis: From detection to diagnosis. Med. Image Anal., 33:94–97, 2016.
  • Buchstaber and T.E. (2015) V.M. Buchstaber and T.E. Panov. Toric topology. Mathematical Surveys and Monographs. Amer. Mathematical Society, 205, 2015.
  • Bullmore and Sporns (2009) E. Bullmore and O. Sporns. Complex brain networks: graph theoretical analysis of structural and functional systems. Nat. Rev. Neurosci., 10:186–198, 2009.
  • Burges (2010) Christopher J. C. Burges. Dimension reduction: A guided tour. Foundations and Trends® in Machine Learning, 2(4):275–365, 2010. ISSN 1935-8237. doi: 10.1561/2200000002.
  • Chernova and Burnaev (2015) S.S. Chernova and E.V. Burnaev. On an iterative algorithm for calculating weighted principal components. Journal of Communications Technology and Electronics, 60(6):619–624, Jun 2015. ISSN 1555-6557. doi: 10.1134/S1064226915060042.
  • CONN (2018) CONN. Conn toolbox – functional connectivity toolbox. 2018.
  • Cortes and Vapnik (1995) Corinna Cortes and Vladimir Vapnik. Support-vector networks. Mach. Learn., 20(3):273–297, September 1995. ISSN 0885-6125. doi: 10.1023/A:1022627411411.
  • Edelsbrunner and Harer (2008) H. Edelsbrunner and J. Harer. Persistent homology – a survey. Surveys on Discrete and Computational Geometry. Contemporary Mathematics (eds. Goodman, J.E., Pach, J., Pollack R.), 453(5):257–282, 2008.
  • Edelsbrunner et al. (2002) H. Edelsbrunner, D. Letscher, and A. Zomorodian. Topological persistence and simplification. Discrete Comput. Geom., 28:511–533, 2002.
  • Erasmus et al. (2004) L.J. Erasmus, D. Hurter, M. Naude, H.G. Kritzinger, and S. Acho. A short overview of mri artefacts. SA J. Radiol., 8:13–17, 2004.
  • Erofeev and Burnaev (2016) P. D. Erofeev and E. V. Burnaev. The influence of parameter initialization on the training time and accuracy of a nonlinear regression model. Journal of Communications Technology and Electronics, 61(6):646–660, Jun 2016. ISSN 1555-6557. doi: 10.1134/S106422691606005X.
  • FreeSurfer (2018) FreeSurfer. FreeSurfer toolbox – an open source software suite for processing and analyzing (human) brain MRI images. 2018.
  • Friedman and Tukey (1974) J.H. Friedman and J.W. Tukey. A projection pursuit algorithm for exploratory data analysis. IEEE Trans. of Computers, 23(9):881–890, 1974.
  • Friston et al. (2003) K.J. Friston, L. Harrison, and W. Penny. Dynamic causal modeling. Neuroimage, 19:1273–1302, 2003.
  • FSL (2018) FSL. FSL toolbox. Neuroimaging. 2018.
  • Fu and Costafreda (2013) Cynthia H Y Fu and Sergi G Costafreda. Neuroimaging-based biomarkers in psychiatry: clinical opportunities of a paradigm shift. Canadian Journal of Psychiatry, 58(9):499–508, 9 2013. ISSN 0706-7437.
  • (26) Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani. An Introduction to Statistical Learning with Applications in R. New York, Springer Texts in Statistics.
  • Garg (2017) A. Garg. Novel geometry and function based topological data analysis in neuroimaging data. In Engineering Science - Theses, Dissertations, and other Required Graduate Degree Essays, 2017.
  • Geurts et al. (2006) Pierre Geurts, Damien Ernst, and Louis Wehenkel. Extremely randomized trees. Mach. Learn., 63(1):3–42, April 2006. ISSN 0885-6125. doi: 10.1007/s10994-006-6226-1.
  • Ghosh et al. (2018) Swarnendu Ghosh, Nibaran Das, Teresa Goncalves, Paulo Quaresma, and Mahantapas Kundu. The journey of graph kernels through two decades. Computer Science Review, 27:88–111, 2018.
  • Glenn and Vovk (2008) Glenn Shafer and Vladimir Vovk. A tutorial on conformal prediction. Journal of Machine Learning Research, 9:371–421, 2008.
  • Griffanti and et al. (2014) L. Griffanti et al. ICA-based artefact removal and accelerated fMRI acquisition for improved resting state network imaging. Neuroimage, 95:232–247, 2014.
  • Harris et al. (2011) Harris Papadopoulos, Vladimir Vovk, and Alexander Gammerman. Regression conformal prediction with nearest neighbours. Journal of Artificial Intelligence Research, 40:815–840, 2011.
  • (33) Trevor Hastie, Robert Tibshirani, and Jerome Friedman. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer.
  • Hirokawa et al. (2008) Y. Hirokawa, H. Isoda, Y.S. Maetani, S. Arizono, K. Shimada, and K. Togashi. Mri artifact reduction and quality improvement in the upper abdomen with propeller and prospective acquisition correction (pace) technique. Am. J. Roentgenol., 191:1154–1158, 2008.
  • (35) D.W. Hosmer and Stanley Lemeshow. Applied Logistic Regression. New York: Wiley.
  • (36) S.A. Huettel, A.W. Song, and G. McCarthy. Functional magnetic resonance imaging, Magnetic Resonance Imaging. MA: Sinauer Associates.
  • Hyvarinen (1999) A. Hyvarinen. Survey on independent component analysis. Neural Computing Surveys, 2:94–128, 1999.
  • Jean and Tournoux (1988) J. Talairach and P. Tournoux. Co-Planar Stereotaxic Atlas of the Human Brain: 3-D Proportional System: An Approach to Cerebral Imaging. 1988.
  • Jollie (2002) I.T. Jolliffe. Principal Component Analysis. Springer, New York, 2002.
  • Kim and Na (2018) Y.K. Kim and K.S. Na. Application of machine learning classification for structural brain mri in mood disorders: Critical review from a clinical perspective. Prog. Neuro-Psychopharmacology Biol. Psychiatry, 80:71–80, 2018.
  • Klein et al. (2009) A. Klein, J. Andersson, B.A. Ardekani, J. Ashburner, B. Avants, M.C. Chiang, G.E. Christensen, D.L. Collins, J. Gee, P. Hellier, J.H. Song, M. Jenkinson, C. Lepage, D. Rueckert, P. Thompson, T. Vercauteren, R.P. Woods, J.J. Mann, and R. V. Parsey. Evaluation of 14 nonlinear deformation algorithms applied to human brain mri registration. Neuroimage, 46:786–802, 2009.
  • Klein (2009) A. et al. Klein. Evaluation of 14 nonlinear deformation algorithms applied to human brain mri registration. NeuroImage, 46(3):786–802, 2009.
  • Lachenbruch and Mickey (1968) P.A. Lachenbruch and M.R. Mickey. Estimation of error rates in discriminant analysis. Technometrics, 10(1):1–11, 1968.
  • Liaw and Wiener (2002) Andy Liaw and Matthew Wiener. Classification and Regression by randomForest. R News, 2(3):18–22, 2002.
  • Liu et al. (2017) C. Liu, J. JaJa, and L. Pessoa. Leica: Laplacian eigenmaps for group ica decomposition of fmri data. NeuroImage, 12, 2017.
  • Maldjian and Laurienti (2003) J.A. Maldjian and P.J. Laurienti. An automated method for neuroanatomic and cytoarchitectonic atlas-based interrogation of fmri data sets. NeuroImage, 19(3):1233–1239, 2003.
  • Montalto et al. (2014) A. Montalto, L. Faes, and D. Marinazzo. Mute: a matlab toolbox to compare established and novel estimators of the multivariate transfer entropy. PLoS One, 9:1–18, 2014.
  • Mwangi et al. (2014) Benson Mwangi, Tian Siva Tian, and Jair C. Soares. A review of feature reduction techniques in neuroimaging. Neuroinformatics, 12(2):229–244, 2014.
  • Nazarov and Burnaev (2016) I. Nazarov and E. Burnaev. Conformalized kernel ridge regression. In 15th IEEE International Conference on Machine Learning and Applications, ICMLA 2016, Anaheim, CA, USA, December 18-20, 2016, pages 45–52. IEEE, 2016. doi: 10.1109/ICMLA.2016.0017.
  • Networkx (2018) Networkx. Networkx – software for complex networks. 2018.
  • Nilearn (2018) Nilearn. Nilearn toolbox – machine learning for neuroimaging in python. 2018.
  • Notchenko et al. (2018) Alexandr Notchenko, Yermek Kapushev, and Evgeny Burnaev. Large-scale shape retrieval with sparse 3d convolutional neural networks. In Wil M.P. van der Aalst, Dmitry I. Ignatov, Michael Khachay, Sergei O. Kuznetsov, Victor Lempitsky, Irina A. Lomazova, Natalia Loukachevitch, Amedeo Napoli, Alexander Panchenko, Panos M. Pardalos, Andrey V. Savchenko, and Stanley Wasserman, editors, Analysis of Images, Social Networks and Texts, pages 245–254, Cham, 2018. Springer International Publishing. ISBN 978-3-319-73013-4.
  • Papanov et al. (2015) A. Papanov, P. Erofeev, and E. Burnaev. Influence of resampling on accuracy of imbalanced classification. In A. Verikas, P. Radeva, and D. Nikolaev, editors, Proc. SPIE 9875, Eighth International Conference on Machine Vision, Barcelona, Spain (December 8, 2015), volume 9875. SPIE, 2015.
  • Pipe (1999) J.G. Pipe. Motion correction with propeller mri: Application to head motion and free-breathing cardiac imaging. Magn. Reson. Med., 42:963–969, 1999.
  • Plis et al. (2014) S.M. Plis, D.R. Hjelm, et al. Deep learning for neuroimaging: A validation study. Frontiers in Neuroscience, 12, 2014.
  • Prikhod’ko and Burnaev (2013) P. V. Prikhod’ko and E. V. Burnaev. On a method for constructing ensembles of regression models. Automation and Remote Control, 74(10):1630–1644, Oct 2013. ISSN 1608-3032. doi: 10.1134/S0005117913100044.
  • Ravi et al. (2017) D. Ravi, C. Wong, et al. Deep learning for health informatics. IEEE Journal of Biomedical and Health Informatics, 21(1):4–21, 2017.
  • Reimann et al. (2015) M. W. Reimann, J. G. King, E. B. Muller, S. Ramaswamy, and H. Markram. An algorithm to predict the connectome of neural microcircuits. Front. Comput. Neurosci., 9(120):186–198, 2015.
  • Salimi-Khorshidi et al. (2014) G. Salimi-Khorshidi, G. Douaud, C.F. Beckmann, M.F. Glasser, L. Griffanti, and S.M. Smith. Automatic denoising of functional mri data: Combining independent component analysis and hierarchical fusion of classifiers. Neuroimage, 90:449–468, 2014.
  • Sharaev et al. (2018a) M. Sharaev, A. Andreev, A. Artemov, A. Bernstein, E. Burnaev, E. Kondratyeva, S. Sushchinskaya, and R. Akzhigitov. fmri: preprocessing, classification and pattern recognition. In Conformal and Probabilistic Prediction and Applications. Springer, 2018a.
  • Sharaev et al. (2018b) M. Sharaev, V. Orlov, et al. Information transfer between rich-club structures in the human brain. Procedia Comput. Sci., 123:440–445, 2018b.
  • Sharaev et al. (2016a) Maksim Sharaev, Vadim Ushakov, and Boris Velichkovsky. Causal interactions within the default mode network as revealed by low-frequency brain fluctuations and information transfer entropy. In Alexei V. Samsonovich, Valentin V. Klimov, and Galina V. Rybina, editors, Biologically Inspired Cognitive Architectures (BICA) for Young Scientists, pages 213–218, Cham, 2016a. Springer International Publishing. ISBN 978-3-319-32554-5.
  • Sharaev et al. (2016b) M.G. Sharaev, V. Zavyalova, et al. Effective connectivity within the default mode network: dynamic causal modeling of resting-state fMRI data. Front. Hum. Neurosci., 10:14, 2016b.
  • Shawe-Taylor and Cristianini (2004) John Shawe-Taylor and Nello Cristianini. Kernel Methods for Pattern Analysis. Cambridge University Press, New York, NY, USA, 2004. ISBN 0521813972.
  • Shen and Meyer (2008) X. Shen and F. Meyer. Low-dimensional embedding of fmri datasets. NeuroImage, 41(3):886–902, 2008.
  • Shen and Meyer (2007) X. Shen and F. G. Meyer. Low Dimensional Embedding of fMRI datasets. ArXiv e-prints, September 2007.
  • Simmons et al. (1994) A. Simmons, P.S. Tofts, G.J. Barker, and S.R. Arridge. Sources of intensity nonuniformity in spin echo images at 1.5 t. Magn. Reson. Med., 32:121–128, 1994.
  • Sladky et al. (2011) R. Sladky, K.J. Friston, J. Trostl, R. Cunnington, E. Moser, and C. Windischberger. Slice-timing effects and their correction in functional mri. Neuroimage, 58:588–594, 2011.
  • Smith and et al. (2013) S.M. Smith et al. Functional connectomics from resting-state fMRI. Trends Cogn. Sci., 2013.
  • Smolyakov and Burnaev (2016) D. Smolyakov and E. Burnaev. One-class SVM with privileged information and its application to malware detection. In Carlotta Domeniconi, Francesco Gullo, Francesco Bonchi, Josep Domingo-Ferrer, Ricardo A. Baeza-Yates, Zhi-Hua Zhou, and Xindong Wu, editors, IEEE International Conference on Data Mining Workshops, ICDM Workshops 2016, December 12-15, 2016, Barcelona, Spain, pages 273–280. IEEE Computer Society, 2016. doi: 10.1109/ICDMW.2016.0046.
  • (71) D. Smolyakov, P. Erofeev, and E. Burnaev. Model selection for anomaly detection. In A. Verikas, P. Radeva, and D. Nikolaev, editors, Proc. SPIE 9875, Eighth International Conference on Machine Vision, Barcelona, Spain (December 8, 2015), volume 9875. SPIE, 2015.
  • Snasel et al. (2017) V. Snasel, J. Nowakov, F. Xhafa, and L. Barolli. Geometrical and topological approaches to big data. Future Generation Computer Systems, 67:286–296, 2017.
  • Sorzano et al. (2014) C. O. S. Sorzano, J. Vargas, and A. P. Montano. A survey of dimensionality reduction techniques. ArXiv e-prints, March 2014.
  • SPM12 (2018) SPM12. SPM12 – statistical parametric mapping toolbox. 2018.
  • Suk et al. (2016) H.I. Suk, C.Y. Wee, et al. State-space model with deep learning for functional dynamics estimation in resting-state fMRI. NeuroImage, 129:292–307, 2016.
  • Sven et al. (2013) J. Sven, R. Heidemann, and A. Petrovic. A retrospective, fully automated and fast method for intensity inhomogeneity correction in 7t mri. Proc. Intl. Soc. Mag. Reson. Med., 21:3788, 2013.
  • Thirion and Faugeras (2004) B. Thirion and O. Faugeras. Nonlinear dimension reduction of fmri data: the laplacian embedding approach. In 2004 2nd IEEE International Symposium on Biomedical Imaging: Nano to Macro (IEEE Cat No. 04EX821), pages 372–375 Vol. 1, April 2004. doi: 10.1109/ISBI.2004.1398552.
  • Thirion et al. (2006) B. Thirion, G. Flandin, P. Pinel, A. Roche, P. Ciuciu, and J.B. Poline. Dealing with the shortcomings of spatial normalization: Multi-subject parcellation of fmri datasets. Hum. Brain Mapp., 27:678–693, 2006.
  • Ushakov et al. (2016) Vadim Ushakov, Maksim G. Sharaev, Sergey I. Kartashov, Viktoria V. Zavyalova, Vitaliy M. Verkhlyutov, and Boris M. Velichkovsky. Dynamic causal modeling of hippocampal links within the human default mode network: Lateralization and computational stability of effective connections. Frontiers in Human Neuroscience, 10:528, 2016. ISSN 1662-5161. doi: 10.3389/fnhum.2016.00528.
  • (80) Vladimir Vapnik. Statistical Learning Theory. New-York, John Wiley.
  • Vovk et al. (2007) U. Vovk, F. Pernus, and B. Likar. A review of methods for correction of intensity inhomogeneity in mri. IEEE Trans. Med. Imaging., 26(3):405–421, 2007.
  • Vovk and Burnaev (2014) V. Vovk and E. Burnaev. Efficiency of conformalized ridge regression. CoRR, abs/1404.2083, 2014.
  • Vovk et al. (2005) Vladimir Vovk, Alex Gammerman, and Glenn Shafer. Algorithmic Learning in a Random World. Springer, New York, 2005.
  • Wang et al. (2010) Jinhui Wang, Xinian Zuo, and Yong He. Graph-based network analysis of resting-state functional mri. In Front. Syst. Neurosci., 2010.
  • Wang et al. (2017) X. Wang, Y. Ren, and W. Zhang. Depression disorder classification of fmri data using sparse low-rank functional brain network and graph-based features. Comput. Math. Methods Med., 2017.
  • (86) Free Surfer Wiki. Fstutorial/anatomicalroi - free surfer wiki.
  • Wise and et al. (2016) T. Wise et al. Common and distinct patterns of grey-matter volume alteration in major depression and bipolar disorder: evidence from voxel-based meta-analysis. Mol. Psychiatry, pages 1–9, 2016.
  • Wong (2015) T. Wong. Performance evaluation of classification algorithms by k-fold and leave-one-out cross validation. Pattern Recognition, 48(9):2839–2846, 2015.
  • Xu et al. (2017) N. Xu, R.N. Spreng, et al. Initial validation for the estimation of resting-state fMRI effective connectivity by a generalization of the correlation approach. Frontiers in Neuroscience, 11:271, 2017.
  • Zomorodian and Carlsson (2005) A. Zomorodian and G. Carlsson. Computing persistent homology. Discrete Comput. Geom., 33:249–274, 2005.