Deep Learning in current Neuroimaging: a multivariate approach with power and type I error control but arguable generalization ability

by   Carmen Jiménez-Mesa, et al.

Discriminative analysis in neuroimaging by means of deep/machine learning techniques is usually assessed with validation techniques, whereas the associated statistical significance remains largely underexplored due to its computational complexity. In this work, a non-parametric framework is proposed that estimates the statistical significance of classifications using deep learning architectures. In particular, a combination of autoencoders (AE) and support vector machines (SVM) is applied to: (i) one-condition, within-group designs, often of normal controls (NC), and (ii) two-condition, between-group designs that contrast, for example, Alzheimer's disease (AD) patients with NC (the extension to multi-class analyses is also included). A random-effects inference based on a label permutation test is proposed for both studies, using cross-validation (CV) and resubstitution with upper bound correction (RUB) as validation methods. This allows both false positives and classifier overfitting to be detected, as well as the statistical power of the test to be estimated. Several experiments were carried out using the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset, the Dominantly Inherited Alzheimer Network (DIAN) dataset, and an MCI prediction dataset. In the permutation test we found that the CV and RUB methods offer a false positive rate close to the significance level and acceptable statistical power (although lower with cross-validation). A large gap between training and test accuracies was observed with CV, especially in one-condition designs. This implies a low generalization ability, as the model fitted in training is not informative with respect to the test set. As a solution we propose applying RUB, which yields results similar to those of the CV test set but uses the whole sample and has a lower computational cost per iteration.
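The core idea of the label permutation test can be sketched briefly. This is a minimal illustration, not the paper's AE+SVM pipeline: synthetic data stands in for neuroimaging features, a linear SVM stands in for the full architecture, and scikit-learn's `permutation_test_score` provides the permutation loop with cross-validation. The accuracy on the true labels is compared against the null distribution of accuracies obtained on permuted labels.

```python
# Sketch of a label-permutation significance test for a classifier,
# assuming a linear SVM on synthetic data in place of the AE+SVM pipeline.
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, permutation_test_score
from sklearn.svm import LinearSVC

# Toy two-condition, between-group design (e.g. patients vs. controls).
X, y = make_classification(n_samples=120, n_features=50, n_informative=10,
                           random_state=0)

clf = LinearSVC(dual=False, max_iter=10000)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Shuffle the labels n_permutations times; the p-value is the fraction of
# permuted-label accuracies that reach the true-label accuracy.
score, perm_scores, p_value = permutation_test_score(
    clf, X, y, cv=cv, n_permutations=200, random_state=0)

print(f"accuracy={score:.3f}, p={p_value:.3f}")
```

If the true labels carry signal, the observed accuracy should sit in the upper tail of the permutation distribution, giving a small p-value; with uninformative labels, the false positive rate stays near the chosen significance level.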




