1 Introduction
Riemannian manifolds of symmetric positive-definite (SPD) matrices have been applied successfully for years in domains such as computer vision [30], radar signal processing [5], and diffusion tensor imaging [23]. Their introduction to brain signal analysis represents a powerful alternative to more traditional information extraction protocols. Only recently has it been shown that Riemannian Brain-Computer Interface (BCI) methods outperform state-of-the-art Euclidean spatial filtering and machine learning techniques [9]. Five recent international BCI competitions, including last year’s Microsoft Cortana brain decoding challenge, have been won using Riemannian geometry [9, 3].
Several reasons for this success have been proposed in the literature. First, in the form of covariance matrices, SPD matrices are understood to be excellent representations of the raw electrophysiological brain signal, while reducing its unwanted variations [20]. They have therefore become fundamental elements in methods such as common spatial pattern and canonical correlation analysis. Second, SPD matrices are traditionally treated within Euclidean frameworks, ignoring their intrinsic non-Euclidean structure. Neglecting this fundamental characteristic may lead to deficient results [1].
These points not only have been found advantageous in BCI design but also make a strong case for the use of a Riemannian SPD matrix manifold in the assessment of neuronal degeneration as can be found in Alzheimer’s disease (AD).
AD is the most common form of dementia and is ultimately fatal. The combination of its severity and looming global epidemic scale, caused by the ageing of our society, makes AD a major public health concern [2]. Due to its degenerative nature, early accurate diagnosis and effective clinical monitoring are crucial. However, in routine clinical practice, AD assessment is most commonly based on subjective clinical interpretation at an already advanced stage of the disease. So far, no cost-effective, widely used biomarkers have been established to facilitate the objectivization of diagnosis and disease progression assessment. To promote the screening and monitoring of as many individuals as possible, such markers should not depend on costly equipment such as MRI or PET scanners. Therefore, we focus on inexpensive apparatuses, namely electroencephalography (EEG) devices. Their non-invasiveness and low noise level add to their suitability for large-scale use in irritable patients such as those found within the spectrum of AD. Additionally, research suggests that quantitative electroencephalography (QEEG) reflects neurodegenerative processes in AD (for reviews, see [32, 12, 10]).
Therefore, we aim to develop a Riemannian framework for QEEG markers of neuronal degeneration in AD and empirically investigate its usefulness. To be able to combine the merits of Riemannian geometry with the advantages of sophisticated Euclidean regularization and variable selection techniques like the elastic net (see 2.3), we map SPD matrices into the tangent space (see 2.2). Preserving the distance relationships between elements, this projection and a subsequent vectorization allow SPD matrices to be treated as Euclidean entities.
All existing Riemannian brain signal analysis methods use covariance as the measure of dependence, thereby implicitly assuming a multivariate Gaussian distribution of the data and linear associations between the activities of brain regions. However, both properties might not be fulfilled [29, 33]. Hence, we examine the usefulness of rank correlation for constructing SPD matrices, which captures nonlinear relationships in data that are often far from normally distributed. For model training and testing, we use one of the largest AD EEG data sets ever collected in a prospective manner, including MRI biomarkers of brain atrophy. As frequency-specific QEEG information has been proven useful in the AD domain [33], we furthermore analyze the effectiveness of a special type of spatio-frequential SPD matrix. Finally, to evaluate the real added value of tangent space mapping, we compare results achieved by this method with those achieved by using regular Euclidean procedures.
2 Materials and Methods
2.1 Experimental data
AD patients were prospectively recruited at four tertiary memory clinics (Medical Universities of Graz, Innsbruck, Vienna, and the General Hospital Linz, PRODEM cohort study by the Austrian Alzheimer Society [27], supported by the Austrian Research Promotion Agency FFG, project no. 827462).
We exclusively analyzed participants (N = 110) who had a structural MRI scan within 60 days of the baseline EEG measurement, a maximal MMSE score [14] of 28, and a CDR [19] from 0.5 to 1.
Acquisition of structural T1-weighted images was accomplished on 1.5 and 3 Tesla MR scanners (Siemens). We used FreeSurfer volumetric analyses [13] to build two MRI biomarkers of brain atrophy, i.e., cerebral volume and hippocampal volume divided by the total intracranial volume (ratios referred to as BrainVol and HippVol).
Continuous EEG (alpha trace EEG recorder, 10-20 electrode placement) was analyzed for an eyes-closed resting condition (EC, 180 sec; prediction of BrainVol) and the encoding period of a paired-associate word list task (WLT, adapted version of [25], 140 sec; prediction of HippVol). Research has repeatedly shown (i) the importance of hippocampal activity during the WLT [7, 8], and (ii) the sensitivity of paired-associative memory to early AD-related changes [15, 24].
For details on the entire PRODEM experimental protocol and preprocessing pipeline, see [33, 17, 16].
2.2 Feature generation
To optimize the number of time points available, we determined the maximal signal length with guaranteed quasi-stationary properties using an augmented Dickey-Fuller test [11]. The de-artifacted signal was partitioned into 4-sec segments accordingly. SPD matrices were estimated using sample covariance (SCM, denoted COV in the remainder; see (1)) and Kendall rank correlation (KEN, see [22]). In the following, all procedures are described using sample covariance as the example.
EEG segments were considered as $N \times T$ matrices $X$, $N$ being the number of electrodes and $T$ the number of time samples. For each segment, a spatial covariance matrix $S$ was estimated:
$$S = \frac{1}{T-1} X X^{\top} \tag{1}$$
To take distinct frequential aspects into account, we created a special type of matrix by bandpass filtering $X$ multiple times ($\delta$ = 2 to 4 Hz, $\theta$ = 4 to 8 Hz, $\alpha$ = 8 to 13 Hz, $\beta$ = 13 to 15 Hz) and vertically concatenating the resulting signals to $X^{SF} \in \mathbb{R}^{4N \times T}$. Then, the sample covariance matrix $S^{SF}$ was estimated:
$$S^{SF} = \frac{1}{T-1} X^{SF} (X^{SF})^{\top} \tag{2}$$
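As a minimal sketch of (1) and (2), the following NumPy code estimates both matrix types for one synthetic segment. The sampling rate `fs`, the random data, and the FFT-based brick-wall filter `fft_bandpass` are assumptions made for illustration only; the filter design actually used in the study is not stated here.

```python
import numpy as np

# Band edges in Hz, taken from the text above.
BANDS = [(2, 4), (4, 8), (8, 13), (13, 15)]

def sample_covariance(X):
    """Spatial sample covariance of an (N x T) segment, as in (1).

    Like (1), this assumes each channel is already (approximately) zero-mean.
    """
    T = X.shape[1]
    return X @ X.T / (T - 1)

def fft_bandpass(X, low, high, fs):
    """Brick-wall band-pass by zeroing FFT bins (illustrative stand-in only)."""
    freqs = np.fft.rfftfreq(X.shape[1], d=1.0 / fs)
    spectrum = np.fft.rfft(X, axis=1)
    spectrum[:, (freqs < low) | (freqs >= high)] = 0.0
    return np.fft.irfft(spectrum, n=X.shape[1], axis=1)

def spatiofrequential_covariance(X, fs):
    """Stack the band-filtered copies of X vertically (4N x T), then apply (2)."""
    X_sf = np.vstack([fft_bandpass(X, lo, hi, fs) for lo, hi in BANDS])
    return sample_covariance(X_sf)

# Synthetic stand-in for one 4-sec, 19-channel segment (fs is hypothetical).
fs = 256
rng = np.random.default_rng(0)
X = rng.standard_normal((19, 4 * fs))
S = sample_covariance(X)                     # 19 x 19
S_sf = spatiofrequential_covariance(X, fs)   # 76 x 76
print(S.shape, S_sf.shape)
```

With an ideal brick-wall filter and a short segment, a narrow band contains few frequency bins, so a single-segment spatio-frequential matrix from this sketch can be rank-deficient; a practical filter and the averaging over segments would be expected to change that.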
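The common Euclidean distance ($\delta_E$) between two matrices and the corresponding mean ($M_E$) of several matrices can be defined as
$$\delta_E(S_1, S_2) = \left\| S_1 - S_2 \right\|_F \tag{3}$$
$$M_E = \frac{1}{I} \sum_{i=1}^{I} S_i \tag{4}$$
where $\| \cdot \|_F$ denotes the Frobenius norm and $I$ the number of matrices. However, the Euclidean space suffers from several disadvantages; for instance, the averaging of SPD matrices may lead to a swelling effect (the determinant of the Euclidean mean can be strictly larger than the original determinants [1]). To avoid such artifacts from geometry, a more natural metric for SPD matrices, the Log-Euclidean distance $\delta_{LE}$, with the corresponding mean $M_{LE}$, can be used (e.g., [34]):
$$\delta_{LE}(S_1, S_2) = \left\| \log(S_1) - \log(S_2) \right\|_F \tag{5}$$
$$M_{LE} = \exp\left( \frac{1}{I} \sum_{i=1}^{I} \log(S_i) \right) \tag{6}$$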
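The swelling effect can be made concrete with a small numerical sketch of (3) to (6). The helper `spd_fun` and the two diagonal example matrices are illustrative assumptions, not part of the original pipeline.

```python
import numpy as np

def spd_fun(S, fun):
    """Apply a scalar function to the eigenvalues of a symmetric matrix."""
    w, V = np.linalg.eigh(S)
    return (V * fun(w)) @ V.T

def dist_euclid(A, B):
    """Euclidean distance (3)."""
    return np.linalg.norm(A - B, "fro")

def mean_euclid(mats):
    """Euclidean (arithmetic) mean (4)."""
    return np.mean(mats, axis=0)

def dist_logeuclid(A, B):
    """Log-Euclidean distance (5)."""
    return np.linalg.norm(spd_fun(A, np.log) - spd_fun(B, np.log), "fro")

def mean_logeuclid(mats):
    """Log-Euclidean mean (6): matrix exponential of the averaged matrix logs."""
    return spd_fun(np.mean([spd_fun(S, np.log) for S in mats], axis=0), np.exp)

# Two SPD matrices with the same determinant (4).
A = np.diag([4.0, 1.0])
B = np.diag([1.0, 4.0])

M_e = mean_euclid([A, B])      # diag(2.5, 2.5)
M_le = mean_logeuclid([A, B])  # diag(2, 2)
print(np.linalg.det(M_e))      # 6.25: larger than both original determinants
print(np.linalg.det(M_le))     # 4.0 up to rounding: no swelling
```

In this example the arithmetic mean inflates the determinant from 4 to 6.25, whereas the Log-Euclidean mean keeps it at 4.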
Further, SPD matrices can be treated in their native Riemannian space using the geodesic distance $\delta_R$ and the Riemannian geometric mean, often referred to as Karcher mean [21], $M_R$, which minimizes the sum of squared $\delta_R$:
$$\delta_R(S_1, S_2) = \left\| \log\left( S_1^{-1/2} S_2 S_1^{-1/2} \right) \right\|_F = \left[ \sum_{n=1}^{N} \log^2 \lambda_n \right]^{1/2} \tag{7}$$
$$M_R = \arg\min_{M} \sum_{i=1}^{I} \delta_R^2(M, S_i) \tag{8}$$
where $\lambda_n$ are the eigenvalues of $S_1^{-1} S_2$. As $M_R$ has no closed-form solution for $I > 2$, we optimized it using the relaxed Richardson iteration [6]. For a review on the advantages of Riemannian geometry in brain signal processing and detailed formal definitions, see [9, 30].
Sets of $S$ and $S^{SF}$ were separately averaged on patient level using the aforementioned mean calculation methods (4, 6, 8) for both EEG paradigms (EC and WLT) and both measures of dependence (COV and KEN), resulting in subject-specific matrices $\bar{S}_p$ ($p$ being a patient) for all variants. For Riemannian (TAN$_{R}$) and Log-Euclidean (TAN$_{LE}$) tangent space-based features, $\bar{S}_p$ of the corresponding mean type was mapped into the tangent space
$$\tilde{S}_p = \log\left( M^{-1/2} \bar{S}_p M^{-1/2} \right) \tag{9}$$
where the reference point $M$ was computed alternatively using (8) for TAN$_{R}$ or (6) for TAN$_{LE}$.
For a formal definition of the Riemannian tangent space, see [4, 30]. For the Euclidean control condition (EUC), averaging on both subject and group level was done using (4).
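The geodesic distance (7) and the Karcher mean (8) can be sketched as follows. Note that this sketch uses a plain fixed-point iteration (averaging the matrix logarithms at the current estimate), not the relaxed Richardson iteration of [6]; the function names, tolerance, and iteration cap are assumptions for illustration.

```python
import numpy as np

def spd_fun(S, fun):
    """Apply a scalar function to the eigenvalues of a symmetric matrix."""
    w, V = np.linalg.eigh(S)
    return (V * fun(w)) @ V.T

def dist_riemann(A, B):
    """Geodesic distance (7) between SPD matrices A and B."""
    A_isqrt = spd_fun(A, lambda w: w ** -0.5)
    return np.linalg.norm(spd_fun(A_isqrt @ B @ A_isqrt, np.log), "fro")

def mean_riemann(mats, tol=1e-10, max_iter=200):
    """Karcher mean (8) via a simple fixed-point iteration (stand-in solver)."""
    M = np.mean(mats, axis=0)  # start from the arithmetic mean
    for _ in range(max_iter):
        M_sqrt = spd_fun(M, np.sqrt)
        M_isqrt = spd_fun(M, lambda w: w ** -0.5)
        # Average of the data after mapping to the tangent space at M.
        G = np.mean([spd_fun(M_isqrt @ S @ M_isqrt, np.log) for S in mats], axis=0)
        # Move M along the geodesic in direction G and re-symmetrize.
        M = M_sqrt @ spd_fun(G, np.exp) @ M_sqrt
        M = (M + M.T) / 2
        if np.linalg.norm(G, "fro") < tol:
            break
    return M

A = np.diag([4.0, 1.0])
B = np.diag([1.0, 4.0])
M_r = mean_riemann([A, B])
print(M_r)                  # approximately 2 * identity
print(dist_riemann(A, B))   # sqrt(2) * log(4) for this commuting pair
```

For two matrices the result can be checked against the closed-form geometric mean $A^{1/2} (A^{-1/2} B A^{-1/2})^{1/2} A^{1/2}$; for more than two matrices an iterative scheme such as the one above is required.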
Applying upper(.) as an operator vectorizing the upper triangular part of an SPD matrix, feature vectors were created for the spatial matrices $S$, given $N = 19$ resulting in $N(N+1)/2 = 190$ dimensions, and for the spatio-frequential matrices $S^{SF}$, given $N = 19$ resulting in $4N(4N+1)/2 = 2926$ dimensions.
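The tangent space mapping (9) and the subsequent vectorization can be sketched as below. The optional $\sqrt{2}$ weighting of off-diagonal elements follows a convention common in the tangent space literature (it makes the Euclidean norm of the vector equal the Frobenius norm of the matrix); whether the present study applied it is an assumption. The random matrices and the helper `random_spd` are purely illustrative.

```python
import numpy as np

def spd_fun(S, fun):
    """Apply a scalar function to the eigenvalues of a symmetric matrix."""
    w, V = np.linalg.eigh(S)
    return (V * fun(w)) @ V.T

def tangent_map(S, M):
    """Map S into the tangent space at reference point M, as in (9)."""
    M_isqrt = spd_fun(M, lambda w: w ** -0.5)
    return spd_fun(M_isqrt @ S @ M_isqrt, np.log)

def upper(S, weight_offdiag=True):
    """Vectorize the upper triangle (including the diagonal) of a symmetric matrix."""
    rows, cols = np.triu_indices(S.shape[0])
    v = S[rows, cols].astype(float)
    if weight_offdiag:
        v[rows != cols] *= np.sqrt(2.0)  # assumed norm-preserving convention
    return v

rng = np.random.default_rng(1)

def random_spd(n):
    """Hypothetical helper: a well-conditioned random SPD matrix."""
    A = rng.standard_normal((n, 2 * n))
    return A @ A.T / (2 * n)

M = random_spd(19)      # stands in for the reference point
S_p = random_spd(19)    # stands in for one subject-specific matrix
features = upper(tangent_map(S_p, M))
print(features.shape)            # (190,)
print(upper(np.eye(76)).shape)   # (2926,)
```

The two printed shapes reproduce the feature dimensions stated above for the spatial ($19 \times 19$) and spatio-frequential ($76 \times 76$) matrices.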
2.3 Elastic net regression and repeated nested cross-validation
The elastic net [35] was used as a regularization and variable selection technique to estimate a sparse regression model based on the feature vectors. It imposes a combination of the $\ell_1$ (lasso, [28]) and $\ell_2$ (ridge, [18]) penalties on the regression coefficients. While enjoying a similar sparsity of representation as the lasso, the elastic net encourages a grouping effect, where strongly correlated predictors (as presumably present in our data set due to spatially adjacent electrode placement) tend to be in or out of the model together [35].
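For orientation, one widely used way of writing the elastic net objective is given below. The scaling of the loss and of the ridge term follows the glmnet-style convention and is an assumption; here $\lambda \ge 0$ is the overall regularization strength and $\alpha \in [0, 1]$ the mixing weight, so that $\alpha = 1$ yields the lasso and $\alpha = 0$ yields ridge regression.

```latex
(\hat{\beta}_0, \hat{\beta}) = \arg\min_{\beta_0,\,\beta}\;
  \frac{1}{2n} \sum_{i=1}^{n} \left( y_i - \beta_0 - x_i^{\top}\beta \right)^2
  + \lambda \left[ \alpha \lVert \beta \rVert_1
  + \frac{1-\alpha}{2} \lVert \beta \rVert_2^2 \right]
```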
We used a two-level nested cross-validation to determine generalization performance. The inner loop was included to sensibly choose a value for the regularization parameter $\lambda$ with minimal expected generalization error [31]. The $\lambda$ value that resulted in the lowest mean squared error (MSE) in the inner loop was used to fit models in the outer loop.
The parameter $\alpha$, representing the weight of lasso ($\alpha = 1$) versus ridge ($\alpha = 0$) optimization, was set at 0.5. Age and gender were introduced in BrainVol models, whereas the magnetic field strength (varying values of Tesla between centers might influence the analysis of smaller structures) was additionally introduced in the HippVol models. All variables were normalized before model fitting. To reduce the variability of prediction outcome resulting from random training-test set splitting, we repeated the entire nested cross-validation procedure multiple times. This allowed us to average out variability and report the range of results of multiple permutations [26].
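The procedure described above can be sketched with NumPy alone. Everything specific in this sketch is an assumption for illustration: the coordinate descent solver (a stand-in for whatever implementation was used), the number of folds, the $\lambda$ grid, the number of repetitions, and the synthetic data. Only the mixing weight of 0.5 and the rule of choosing $\lambda$ by the lowest inner-loop MSE are taken from the text.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator used by the lasso part of the penalty."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def elastic_net(X, y, lam, alpha=0.5, n_iter=200, tol=1e-8):
    """Coordinate descent for 1/(2n)||y-Xb||^2 + lam*(alpha*|b|_1 + (1-alpha)/2*|b|_2^2)."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y.astype(float).copy()            # residual y - X @ beta for beta = 0
    for _ in range(n_iter):
        max_step = 0.0
        for j in range(p):
            old = beta[j]
            rho = X[:, j] @ resid / n + col_sq[j] * old   # uses the partial residual
            beta[j] = soft_threshold(rho, lam * alpha) / (col_sq[j] + lam * (1 - alpha))
            resid -= X[:, j] * (beta[j] - old)
            max_step = max(max_step, abs(beta[j] - old))
        if max_step < tol:
            break
    return beta

def fit_predict(X_tr, y_tr, X_te, lam, alpha):
    """Standardize with training statistics only, fit, and predict the test rows."""
    mu, sd = X_tr.mean(axis=0), X_tr.std(axis=0) + 1e-12
    beta = elastic_net((X_tr - mu) / sd, y_tr - y_tr.mean(), lam, alpha)
    return ((X_te - mu) / sd) @ beta + y_tr.mean()

def split(idx, k, rng):
    """Return k (train, test) index pairs from a shuffled copy of idx."""
    folds = np.array_split(rng.permutation(idx), k)
    return [(np.concatenate(folds[:i] + folds[i + 1:]), folds[i]) for i in range(k)]

def nested_cv_rmse(X, y, lambdas, alpha=0.5, k_outer=5, k_inner=5, seed=0):
    """One run of two-level nested cross-validation; returns the outer-loop RMSE."""
    rng = np.random.default_rng(seed)
    sq_err = []
    for train, test in split(np.arange(len(y)), k_outer, rng):
        inner = split(train, k_inner, rng)    # same inner folds for every lambda
        inner_mse = [np.mean([np.mean((y[te] - fit_predict(X[tr], y[tr], X[te], lam, alpha)) ** 2)
                              for tr, te in inner]) for lam in lambdas]
        best_lam = lambdas[int(np.argmin(inner_mse))]   # lowest inner-loop MSE
        pred = fit_predict(X[train], y[train], X[test], best_lam, alpha)
        sq_err.extend((y[test] - pred) ** 2)
    return float(np.sqrt(np.mean(sq_err)))

def repeated_nested_cv(X, y, lambdas, n_repeats=3, **kwargs):
    """Repeat the nested procedure with different splits; return mean, min, max RMSE."""
    rmse = [nested_cv_rmse(X, y, lambdas, seed=r, **kwargs) for r in range(n_repeats)]
    return float(np.mean(rmse)), min(rmse), max(rmse)

# Synthetic regression problem (hypothetical data, two informative predictors).
rng = np.random.default_rng(42)
X = rng.standard_normal((60, 10))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.5 * rng.standard_normal(60)
mean_rmse, min_rmse, max_rmse = repeated_nested_cv(X, y, lambdas=[0.01, 0.1, 1.0], n_repeats=2)
print(mean_rmse, min_rmse, max_rmse)
```

Standardizing inside `fit_predict` with training statistics only is a design choice of this sketch that keeps test data out of the normalization step; the returned triple mirrors the RMSE, Min, and Max columns reported in the results table.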
3 Results and Discussion
Results are depicted in Table 1.
For both prediction problems (BrainVol, HippVol), the best models were of spatio-frequential nature (indicated in bold in the table), highlighting the importance of frequency-specific information for QEEG AD markers. When comparing spatio-frequential models, the best tangent space mapping models significantly outperformed the Euclidean reference models (BrainVol, $p$ = 0.003; HippVol, $p$ = 0.030). Differences between model performances were assessed by statistically comparing squared errors of test set predictions and averaging $p$ values across repeated cross-validation. Further, we calculated test statistics for evaluating the standalone performance of the best models (BrainVol, $p$ = 0.003; HippVol, $p$ = 0.011).
For the prediction of BrainVol (measured during EC resting state), COV yielded lower root-mean-square errors (RMSE) than KEN. Information on the magnitude of the signal at certain sites, which is present in the diagonal elements of COV but not KEN matrices, seems to be essential. For hippocampus-mediated memory encoding during the WLT, by contrast, the interaction between brain regions (neuronal networks), as measured by off-diagonal matrix elements, seems to be of predominant importance, explaining the superior results for KEN. Further, the EEG signal during an active eyes-open task is presumably less normally distributed (even after sophisticated preprocessing) than during a resting EC period, additionally explaining deviating results for COV and KEN.
Interestingly, TAN$_{LE}$ achieved better results than TAN$_{R}$. Barachant [3] also used, inter alia, tangent space mapping with a Log-Euclidean reference point (the Log-Euclidean mean, see (6)) for winning Microsoft’s ’mind reading’ challenge. Should future studies support the superiority, or at least equality, of TAN$_{LE}$, computational cost could be dramatically decreased due to the algorithmic simplicity of Log-Euclidean as compared to Riemannian mean calculation.
To the best of our knowledge, this is the first article reporting a Riemannian approach for building QEEG markers of neuronal degeneration.
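Table 1: Prediction errors for BrainVol and HippVol. RMSE with minimum (Min) and maximum (Max) across repetitions of the nested cross-validation. Dep: measure of dependence; Design: SF = spatio-frequential, S = spatial. Best models in bold.

| Dep | Approach | Design | BrainVol RMSE | BrainVol Min | BrainVol Max | HippVol RMSE | HippVol Min | HippVol Max |
|---|---|---|---|---|---|---|---|---|
| COV | EUC | SF | 1.70E-03 | 1.57E-03 | 2.53E-03 | 2.09E-07 | 1.63E-07 | 4.26E-07 |
| COV | TAN$_{LE}$ | SF | **1.23E-03** | 1.10E-03 | 1.37E-03 | 1.86E-07 | 1.71E-07 | 2.04E-07 |
| COV | TAN$_{R}$ | SF | 1.43E-03 | 1.27E-03 | 1.63E-03 | 1.80E-07 | 1.67E-07 | 2.04E-07 |
| KEN | EUC | SF | 1.75E-03 | 1.61E-03 | 2.11E-03 | 1.78E-07 | 1.65E-07 | 2.05E-07 |
| KEN | TAN$_{LE}$ | SF | 1.56E-03 | 1.43E-03 | 1.77E-03 | **1.44E-07** | 1.30E-07 | 1.44E-07 |
| KEN | TAN$_{R}$ | SF | 1.58E-03 | 1.41E-03 | 1.91E-03 | 1.47E-07 | 1.31E-07 | 1.65E-07 |
| COV | EUC | S | 2.02E-03 | 1.57E-03 | 3.80E-03 | 1.69E-07 | 1.55E-07 | 2.17E-07 |
| COV | TAN$_{LE}$ | S | 1.34E-03 | 1.25E-03 | 1.50E-03 | 1.77E-07 | 1.62E-07 | 2.01E-07 |
| COV | TAN$_{R}$ | S | 1.35E-03 | 1.24E-03 | 1.46E-03 | 1.75E-07 | 1.56E-07 | 2.10E-07 |
| KEN | EUC | S | 1.73E-03 | 1.59E-03 | 1.90E-03 | 1.68E-07 | 1.51E-07 | 1.95E-07 |
| KEN | TAN$_{LE}$ | S | 1.56E-03 | 1.43E-03 | 1.79E-03 | 1.79E-07 | 1.63E-07 | 2.18E-07 |
| KEN | TAN$_{R}$ | S | 1.55E-03 | 1.44E-03 | 1.73E-03 | 1.81E-07 | 1.63E-07 | 2.56E-07 |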
References
[1] Arsigny, V., Fillard, P., Pennec, X., & Ayache, N. 2007. Geometric Means in a Novel Vector Space Structure on Symmetric Positive-Definite Matrices. SIAM Journal on Matrix Analysis and Applications, 29(1), 328–347.
[2] Alzheimer’s Association. 2014. 2014 Alzheimer’s disease facts and figures. Alzheimer’s & Dementia: The Journal of the Alzheimer’s Association, 10(2), e47–e92.
[3] Barachant, A. 30.10.2017. Decoding Brain Signals 2016 – Microsoft Cortana Challenge. Retrieved from https://gallery.cortanaintelligence.com/Competition/DecodingBrainSignals2. Microsoft Cortana Website.
[4] Barachant, A., Bonnet, S., Congedo, M., & Jutten, C. 2012. Multiclass Brain-Computer Interface Classification by Riemannian Geometry. Biomedical Engineering, IEEE Transactions on, 59(4), 920–928.
[5] Barbaresco, F. 2008. Innovative tools for radar signal processing Based on Cartan’s geometry of SPD matrices & Information Geometry. Pages 1–6 of: 2008 IEEE Radar Conference.
[6] Bini, D. A., & Iannazzo, B. 2013. Computing the Karcher mean of symmetric positive definite matrices. Linear Algebra and its Applications, 438(4), 1700–1710.
[7] Cameron, K. A., Yashar, S., Wilson, C. L., & Fried, I. 2001. Human Hippocampal Neurons Predict How Well Word Pairs Will Be Remembered.
[8] Clark, I. A., Kim, M., & Maguire, E. A. 2017. Confronting the elephant in the room - verbal paired associates and the hippocampus. bioRxiv.
[9] Congedo, M., Barachant, A., & Bhatia, R. 2017. Riemannian geometry for EEG-based brain-computer interfaces; a primer and a review. Brain-Computer Interfaces, 4(3), 155–174.
[10] Dauwels, J., Srinivasan, K., Ramasubba Reddy, M., Musha, T., Vialatte, F.-B., Latchoumane, C., Jeong, J., & Cichocki, A. 2011. Slowing and loss of complexity in Alzheimer’s EEG: two sides of the same coin? International journal of Alzheimer’s disease, 2011.
[11] Dickey, D. A., & Fuller, W. A. 1979. Distribution of the Estimators for Autoregressive Time Series with a Unit Root. Journal of the American Statistical Association, 74(366a), 427–431.
[12] Drago, V., Babiloni, C., Bartres-Faz, D., Caroli, A., Bosch, B., Hensch, T., Didic, M., Klafki, H. W., Pievani, M., Jovicich, J., Venturi, L., Spitzer, P., Vecchio, F., Schoenknecht, P., Wiltfang, J., Redolfi, A., Forloni, G., Blin, O., Irving, E., Davis, C., Hardemark, H. G., & Frisoni, G. B. 2011. Disease tracking markers for Alzheimer’s disease at the prodromal (MCI) stage. J Alzheimers Dis, 26 Suppl 3, 159–99.
[13] Fischl, B. 2012. FreeSurfer. Neuroimage, 62(2), 774–781.
[14] Folstein, M. F., Folstein, S. E., & McHugh, P. R. 1975. Mini-mental state. Journal of Psychiatric Research, 12(3), 189–198.
[15] Fowler, K. S., Saling, M. M., Conway, E. L., Semple, J. M., & Louis, W. J. 2002. Paired associate performance in the early detection of DAT. Journal of the International Neuropsychological Society, 8(1), 58–71.
[16] Fruehwirt, W., Zhang, P., Gerstgrasser, M., Grossegger, D., Schmidt, R., Benke, T., Dal-Bianco, P., Ransmayr, G., Weydemann, L., Garn, H., Waser, M., Osborne, M., & Dorffner, G. 2017. Bayesian Gaussian Process Classification from Event-Related Brain Potentials in Alzheimer’s Disease. Cham: Springer International Publishing. Pages 65–75.
[17] Garn, H., Waser, M., Deistler, M., Schmidt, R., Dal-Bianco, P., Ransmayr, G., Zeitlhofer, J., Schmidt, H., Seiler, S., Sanin, G., Caravias, G., Santer, P., Grossegger, D., Fruehwirt, W., & Benke, T. 2014. Quantitative EEG in Alzheimer’s disease: cognitive state, resting state and association with disease severity. International Journal of Psychophysiology, 93(3), 390–7.
[18] Hoerl, A. E., & Kennard, R. W. 1970. Ridge Regression: Biased Estimation for Nonorthogonal Problems. Technometrics, 12(1), 55–67.
[19] Hughes, C. P., Berg, L., Danziger, W. L., Coben, L. A., & Martin, R. L. 1982. A new clinical scale for the staging of dementia. The British Journal of Psychiatry, 140(6), 566–572.
[20] Kalunga, E. K., Chevallier, S., Barthélemy, Q., Djouani, K., Monacelli, E., & Hamam, Y. 2016. Online SSVEP-based BCI using Riemannian geometry. Neurocomputing, 191, 55–68.
[21] Karcher, H. 1977. Riemannian center of mass and mollifier smoothing. Communications on pure and applied mathematics, 30(5), 509–541.
[22] Kendall, M. G. 1938. A new measure of rank correlation. Biometrika, 30(1/2), 81–93.
[23] Pennec, X., Fillard, P., & Ayache, N. 2006. A Riemannian Framework for Tensor Computing. International Journal of Computer Vision, 66(1), 41–66.
[24] Pike, K. E., Rowe, C. C., Moss, S. A., & Savage, G. 2008. Memory profiling with paired associate learning in Alzheimer’s disease, mild cognitive impairment, and healthy aging. Neuropsychology, 22(6), 718–728.
[25] Plihal, W., & Born, J. 1997. Effects of early and late nocturnal sleep on declarative and procedural memory. Journal of cognitive neuroscience, 9(4), 534–547.
[26] Schouten, T. M., Koini, M., de Vos, F., Seiler, S., van der Grond, J., Lechner, A., Hafkemeijer, A., Möller, C., Schmidt, R., de Rooij, M., & Rombouts, S. A. R. B. 2016. Combining anatomical, diffusion, and resting state functional magnetic resonance imaging for individual classification of mild and moderate Alzheimer’s disease.
[27] Seiler, S., Schmidt, H., Lechner, A., Benke, T., Sanin, G., Ransmayr, G., Lehner, R., Dal-Bianco, P., Santer, P., Linortner, P., Eggers, C., Haider, B., Uranues, M., Marksteiner, J., Leblhuber, F., Kapeller, P., Bancher, C., Schmidt, R., & Group, P. S. 2012. Driving Cessation and Dementia: Results of the Prospective Registry on Dementia in Austria (PRODEM). PLoS ONE, 7(12), e52710.
[28] Tibshirani, R. 1996. Regression shrinkage and selection via the lasso.
[29] Tong, S., & Thakor, N. V. 2009. Quantitative EEG analysis methods and clinical applications. Artech House.
[30] Tuzel, O., Porikli, F., & Meer, P. 2007. Human Detection via Classification on Riemannian Manifolds. Pages 1–8 of: 2007 IEEE Conference on Computer Vision and Pattern Recognition.
[31] Varma, S., & Simon, R. 2006. Bias in error estimation when using cross-validation for model selection.
[32] Vecchio, F., Babiloni, C., Lizio, R., Fallani Fde, V., Blinowska, K., Verrienti, G., Frisoni, G., & Rossini, P. M. 2013. Resting state cortical EEG rhythms in Alzheimer’s disease: toward EEG markers for clinical applications: a review. Suppl Clin Neurophysiol, 62, 223–36.
[33] Waser, M., Garn, H., Schmidt, R., Benke, T., Dal-Bianco, P., Ransmayr, G., Schmidt, H., Seiler, S., Sanin, G., Mayer, F., Caravias, G., Grossegger, D., Fruhwirt, W., & Deistler, M. 2016. Quantifying synchrony patterns in the EEG of Alzheimer’s patients with linear and nonlinear connectivity markers. Journal of Neural Engineering, 123(3), 297–316.
[34] Yger, F., Lotte, F., & Sugiyama, M. 2015. Averaging covariance matrices for EEG signal classification based on the CSP: an empirical study. Pages 2721–2725 of: 23rd European Signal Processing Conference (EUSIPCO). IEEE.
[35] Zou, H., & Hastie, T. 2005. Regularization and variable selection via the elastic net.