1 Introduction
We consider the problem of uncovering the intrinsic low-rank component of a high-dimensional dataset. We further focus on data that resides on a graph and changes smoothly between connected vertices [1]. In many problems, the underlying graph is unknown or inexact [2, 3, 4]; for example, the graph may be estimated from input data that is grossly corrupted. We propose an algorithm to estimate the low-rank component of the data, using the graph smoothness assumption to assist the estimation. Our algorithm also simultaneously and iteratively refines the graph to improve the effectiveness of the graph smoothness constraint, thereby increasing the quality of the low-rank component estimation.
High-dimensional data is common in many engineering areas such as image/video processing, biomedical imaging, computer networks, and transportation networks. We are specifically interested in automatic analysis of brain imaging data. In many cases, the task is to find the spatiotemporal neural signature of a task by performing classification on cortical activations evoked by different stimuli [5, 6]. Common brain imaging techniques are electroencephalography (EEG) and magnetoencephalography (MEG). These measurements are high-dimensional spatiotemporal data. For instance, in our experiments, we use a recumbent Elekta MEG scanner with 306 sensors to record brain activity for 1100 milliseconds. Furthermore, the measurements are degraded by various types of noise (e.g., sensor noise, ambient magnetic field noise) whose model is complicated and potentially non-Gaussian. The high dimensionality and noise limit both the speed and accuracy of the signal analysis, which may result in unreliable signature modeling for classification. The high dimensionality of these signals also increases the complexity of the classifier.
It has been recognized that there are patterns of anatomical links, statistical dependencies, and causal interactions between distinct units within a nervous system [7]. Techniques have been developed to estimate this brain connectivity graph [8, 9]. However, this task is difficult, and in many cases the estimated graph may not be accurate. Our contribution is a robust algorithm to determine the reduced-dimensionality components that carry task-related information, under the assumption that the brain imaging data is graph-smooth but the knowledge of the graph is imperfect. Specifically, our contributions are: (i) based on a model with a low-rank component plus a sparse perturbation, and an initial graph estimate, we propose an algorithm to simultaneously learn the low-rank component and the graph; (ii) we derive the learning steps using ADMM [10]; (iii) we evaluate the algorithm on synthetic and real brain imaging data in a supervised classification task.
1.1 Related Work
This work is inspired by [2]. While the focus of [2] is to learn the connectivity graph topology, their algorithm also estimates a noise-free version of the input data as a byproduct. A Gaussian noise model and Frobenius-norm optimization are employed in [2]; their approach is therefore suitable for problems where the noise is small. In our work, starting from a model with a low-rank component plus a sparse perturbation, and an initial graph estimate, we adopt the idea of [2] to incrementally refine the underlying connectivity graph, so that a better low-rank estimate of the data can be obtained. As our experiments show, our method performs better on high-dimensional graph data grossly corrupted by complicated noise, such as brain imaging signals. Beyond [2], learning a graph from smooth signals has attracted a fair amount of interest recently [3, 4]; these works focus on learning the graph, and advanced formulations (e.g., matrix optimization problems) have been derived. Estimation of the brain connectivity graph using a Gaussian noise model has been proposed in [11]. On the other hand, focusing on low-rank estimation, some works have proposed to incorporate spectral graph regularization [12, 13, 14]. Their graphs are fixed in their algorithms, precomputed from the noisy input data. On the contrary, our algorithm uses the improved low-rank estimate to refine the graph, which in turn is used to improve the quality of the low-rank estimation. Graph signal processing has also been applied to several brain imaging tasks. In [15], the graph Fourier transform is applied to decompose brain signals into low-, medium-, and high-frequency components for analysis of functional brain network properties. [16] uses the eigenvectors of the graph Laplacian to approximate the intrinsic subspace of high-dimensional brain imaging data; they experimented with different brain connectivity estimates to compute the graph. [17] presents a graph-based framework for fMRI brain activation mapping. Graph signal processing has also been shown to be useful for image compression [18], temperature data [2], and wireless sensor data [19]. A few signal features motivated by graph signal processing have also been proposed [20, 21]. Moreover, several linear/nonlinear dimensionality reduction methods have been proposed that make use of the graph Laplacian of sample affinity graphs [22, 23, 24]. These methods are geometrically motivated, aim to preserve the local structures of the data, and involve different algorithms compared to our work.

2 Simultaneous Low-Rank and Graph Estimation
We consider the high-dimensional data matrix $X \in \mathbb{R}^{p \times n}$ that consists of $n$ $p$-dimensional data points. For our brain imaging data, the rows of $X$ are the measurements by the $p$ sensors at $n$ time instants, i.e., $p$ time series of length $n$. We assume the data points have low intrinsic dimensionality and lie near some low-dimensional subspace. We assume the following mathematical model for the data:
$$X = L + S \qquad (1)$$
$L$ is the low-rank component of the data matrix, which is of primary interest in this paper, and $S$ is a perturbation matrix. We assume that the entries of $S$ can have arbitrarily large magnitude but that its support is sparse.
Principal component analysis (PCA) is the most popular technique for determining the low-rank component, with application domains ranging from image, video, signal, and web content to networks. Classical PCA finds the projection of $X$ onto a $d$-dimensional ($d < p$) linear space characterized by an orthonormal basis $U \in \mathbb{R}^{p \times d}$, by solving the following optimization:
$$\min_{U, Y} \; \|X - UY\|_F^2 \quad \text{subject to} \quad U^T U = I \qquad (2)$$
The $U$ and $Y \in \mathbb{R}^{d \times n}$ matrices are known as the principal components and projected data points, respectively, and $UY$ is the approximation of the low-rank component. Classical PCA suffers from a few disadvantages. First, it is susceptible to grossly corrupted data in $X$. Second, it does not consider the implicit data manifold information.
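As a concrete illustration (a sketch, not part of the original paper), the classical PCA solution of (2) can be computed via a truncated SVD; the matrix sizes below are arbitrary:

```python
import numpy as np

def classical_pca(X, d):
    """Rank-d PCA of X (features x samples) via truncated SVD.
    Returns an orthonormal basis U (p x d) and projected data Y (d x n),
    so that U @ Y is the best rank-d Frobenius approximation of X."""
    U_full, s, Vt = np.linalg.svd(X, full_matrices=False)
    U = U_full[:, :d]            # principal components (orthonormal columns)
    Y = np.diag(s[:d]) @ Vt[:d]  # projected data points
    return U, Y

rng = np.random.default_rng(0)
X = rng.standard_normal((10, 3)) @ rng.standard_normal((3, 50))  # exactly rank-3 data
U, Y = classical_pca(X, 3)
print(np.allclose(U @ Y, X))  # reconstruction is exact for rank-3 data
```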
Candès et al. [25] addressed the first issue by designing Robust PCA, which is robust to outliers and directly recovers the low-rank matrix $L$ from the grossly corrupted $X$:
$$\min_{L, S} \; \|L\|_* + \lambda \|S\|_1 \quad \text{subject to} \quad X = L + S \qquad (3)$$
where $\|L\|_*$ denotes the nuclear norm, used as a convex surrogate of rank, and $\|S\|_1$ is the entrywise $\ell_1$ norm.
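The building blocks of standard solvers for (3) are the proximal operators of the two norms. A minimal numpy sketch (illustrative, not the paper's exact solver):

```python
import numpy as np

def soft_threshold(M, tau):
    """Proximal operator of tau*||.||_1 (entrywise shrinkage)."""
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)

def svt(M, tau):
    """Singular value thresholding: proximal operator of tau*||.||_*."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

rng = np.random.default_rng(1)
M = rng.standard_normal((8, 6))
L = svt(M, 1.0)
# thresholding can only shrink singular values, never increase rank
print(np.linalg.matrix_rank(L) <= np.linalg.matrix_rank(M))
```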
In this work we propose to extend (3) with an additional graph smoothness regularization, while the underlying graph topology that captures the data correlation may be unknown or inexact (thus some refinement is needed):
$$\min_{L, S, \Phi} \; \|L\|_* + \lambda \|S\|_1 + \gamma \, \mathrm{tr}(L^T \Phi L) + \beta \|\Phi\|_F^2 \quad \text{subject to} \quad X = L + S, \;\; \Phi \in \mathcal{L} \qquad (4)$$
Here $\Phi$ is the graph Laplacian of the feature graph $G$ describing the correlation between individual features: $G$ consists of a finite set of vertices $V$ with $|V| = p$, a set of edges $E$, and a weighted adjacency matrix $W$, with $W_{ij}$ quantifying the similarity between the $i$th and $j$th features of the $p$-dimensional measurement vectors. $\Phi = D - W$, with $D$ being the diagonal degree matrix, $D_{ii} = \sum_j W_{ij}$. $\mathcal{L}$ is the set of all valid graph Laplacians $\Phi$:
$$\mathcal{L} = \{\Phi \in \mathbb{R}^{p \times p} : \Phi = \Phi^T, \; \Phi_{ij} \le 0 \; (i \ne j), \; \Phi \cdot \mathbf{1} = \mathbf{0}, \; \mathrm{tr}(\Phi) = p\} \qquad (5)$$
where the trace constraint fixes the scale of the graph and excludes the trivial solution $\Phi = 0$.
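For concreteness, the construction $\Phi = D - W$ and a membership check for the set in (5) can be sketched as follows (a sketch: the trace normalization is omitted here, and only the symmetry, sign, and row-sum conditions are checked):

```python
import numpy as np

def laplacian(W):
    """Combinatorial graph Laplacian Phi = D - W from a symmetric adjacency W."""
    return np.diag(W.sum(axis=1)) - W

def is_valid_laplacian(Phi, tol=1e-10):
    """Check: symmetric, non-positive off-diagonal, rows summing to zero."""
    sym = np.allclose(Phi, Phi.T, atol=tol)
    off = Phi - np.diag(np.diag(Phi))
    nonpos_off = np.all(off <= tol)
    zero_rowsum = np.allclose(Phi @ np.ones(Phi.shape[0]), 0.0, atol=tol)
    return sym and nonpos_off and zero_rowsum

W = np.array([[0., 1., 0.],
              [1., 0., 2.],
              [0., 2., 0.]])
Phi = laplacian(W)
print(is_valid_laplacian(Phi))  # True
```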
As will be further discussed in Section 3, we solve (4) iteratively using alternating minimization with the following justifications:

Given $\Phi$: for a given $\Phi$ (even a rough estimate), $\mathrm{tr}(L^T \Phi L)$ imposes an additional constraint on the underlying (unknown) low-rank data $L$. Specifically,
$$\mathrm{tr}(L^T \Phi L) = \frac{1}{2} \sum_{i,j} W_{ij} \, \|l^i - l^j\|_2^2 \qquad (6)$$
where $l^i$ is the $i$th row vector of $L$. Therefore, $\mathrm{tr}(L^T \Phi L)$ in (4) forces rows $i$ and $j$ of $L$ to have similar values if $W_{ij}$ is large. Note that in our brain imaging data, individual rows represent the time series captured by the sensors. Thus, the regularizer forces the low-rank representations of the time series to be similar for highly correlated sensors. Prior information regarding measurement correlation (such as the physical distance between the capturing sensors) can be incorporated as the initial $\Phi$ to bootstrap the estimation of $L$.
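The identity in (6) can be verified numerically; a small sketch with an arbitrary weighted graph:

```python
import numpy as np

rng = np.random.default_rng(2)
p, n = 5, 7
W = rng.random((p, p)); W = (W + W.T) / 2; np.fill_diagonal(W, 0.0)
Phi = np.diag(W.sum(axis=1)) - W          # Laplacian of the feature graph
L = rng.standard_normal((p, n))           # candidate low-rank data

quad = np.trace(L.T @ Phi @ L)
pairwise = 0.5 * sum(W[i, j] * np.sum((L[i] - L[j])**2)
                     for i in range(p) for j in range(p))
print(np.isclose(quad, pairwise))  # True: the two expressions agree
```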

Given $L$: on the other hand, for a given estimate of the low-rank data $L$, $\mathrm{tr}(L^T \Phi L)$ guides the refinement of $\Phi$ and hence of the underlying connectivity graph $G$. In particular, a graph that is consistent with the signal variation in $L$ is favored: $W_{ij}$ is large only if rows $i$ and $j$ of $L$ have similar values. In many problems, the given graph can be noisy (e.g., the graph is estimated from the noisy data itself [13, 14]). The proposed formulation iteratively improves $\Phi$ using the refined low-rank data. The improved $\Phi$ in turn facilitates the low-rank data estimation.
3 Learning Algorithm
We propose to solve the problem in Eq. (4) with an alternating minimization scheme where, at each step, we fix some of the variables and update the others.
In the first step, for a given $\Phi$, we solve the following optimization problem with respect to $L$ and $S$ using ADMM [10]; that is, given a graph, we estimate the low-rank matrix:
$$\min_{L, S} \; \|L\|_* + \lambda \|S\|_1 + \gamma \, \mathrm{tr}(L^T \Phi L) \quad \text{subject to} \quad X = L + S \qquad (7)$$
In the second step, $L$ and $S$ are fixed and we solve the following optimization problem with respect to $\Phi$; that is, based on the low-rank matrix obtained in the previous step, we update the graph:
$$\min_{\Phi} \; \gamma \, \mathrm{tr}(L^T \Phi L) + \beta \|\Phi\|_F^2 \quad \text{subject to} \quad \Phi \in \mathcal{L} \qquad (8)$$
Introducing an auxiliary variable $Z$ and the indicator function $\iota_{\mathcal{L}}$ of the set $\mathcal{L}$, (8) can be written as:
$$\min_{\Phi, Z} \; \gamma \, \mathrm{tr}(L^T \Phi L) + \beta \|\Phi\|_F^2 + \iota_{\mathcal{L}}(Z) \quad \text{subject to} \quad \Phi = Z \qquad (9)$$
We can form the augmented Lagrangian of (9), in scaled form with dual variable $U$, as:
$$\mathcal{L}_\rho(\Phi, Z, U) = \gamma \, \mathrm{tr}(L^T \Phi L) + \beta \|\Phi\|_F^2 + \iota_{\mathcal{L}}(Z) + \frac{\rho}{2} \|\Phi - Z + U\|_F^2 - \frac{\rho}{2} \|U\|_F^2 \qquad (10)$$
This yields the following updates for $\Phi$, $Z$, and $U$ at iteration $k$:
$$\Phi^{k+1} = \frac{\rho\,(Z^k - U^k) - \gamma\, L L^T}{2\beta + \rho}, \quad Z^{k+1} = \Pi_{\mathcal{L}}\!\left(\Phi^{k+1} + U^k\right), \quad U^{k+1} = U^k + \Phi^{k+1} - Z^{k+1} \qquad (11)$$
where $\rho$ is the Lagrangian parameter and $\Pi_{\mathcal{L}}$ is the Euclidean projection onto the set $\mathcal{L}$.
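A sketch of one plausible implementation of the updates in (11): the closed-form $\Phi$-step follows from setting the gradient of the smooth terms of (10) to zero, while `project_valid_laplacian` is only a heuristic stand-in for the exact Euclidean projection $\Pi_{\mathcal{L}}$ (which requires solving a small quadratic program). All names and parameter values here are illustrative assumptions:

```python
import numpy as np

def project_valid_laplacian(M, p_trace):
    """Heuristic stand-in for the projection onto the valid Laplacian set:
    symmetrize, clip off-diagonals to be non-positive, rebuild the diagonal
    from row sums, and rescale the trace. Not the exact Euclidean projection."""
    M = (M + M.T) / 2
    off = np.minimum(M - np.diag(np.diag(M)), 0.0)
    Phi = off - np.diag(off.sum(axis=1))          # rows now sum to zero
    t = np.trace(Phi)
    return Phi * (p_trace / t) if t > 0 else np.eye(len(M)) - 1.0 / len(M)

def graph_update(L, gamma=1.0, beta=1.0, rho=1.0, iters=200):
    """ADMM sketch for the graph step: minimize
    gamma*tr(L^T Phi L) + beta*||Phi||_F^2 over valid Laplacians,
    via the splitting Phi = Z with scaled dual U."""
    p = L.shape[0]
    Z = np.eye(p) - 1.0 / p
    U = np.zeros((p, p))
    G = gamma * (L @ L.T)
    for _ in range(iters):
        Phi = (rho * (Z - U) - G) / (2 * beta + rho)  # closed-form Phi-step
        Z = project_valid_laplacian(Phi + U, p)       # (approximate) projection
        U = U + Phi - Z                               # dual update
    return Z

rng = np.random.default_rng(3)
L = rng.standard_normal((4, 20))
Phi = graph_update(L)
print(np.allclose(Phi @ np.ones(4), 0.0, atol=1e-8))  # output is a valid Laplacian
```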
4 Experiment
4.1 Synthetic Experiment
In this synthetic experiment, we generate low-rank, graph-smooth, and grossly corrupted data with the following model:
$$X = L_0 + S_0 \qquad (12)$$
where $L_0$ is a low-rank matrix with rank $r$ and $S_0$ is the sparse perturbation matrix.
We generate $L_0$ as a product $U V^T$, where $U \in \mathbb{R}^{p \times r}$ and $V \in \mathbb{R}^{n \times r}$. $L_0$ is also graph-smooth and generated as follows. The (ground-truth) graph consists of $p$ nodes, with each pair of nodes connected with a fixed probability (an Erdős–Rényi model). The edge weights between connected nodes are drawn uniformly from 0 to 1 and collected in a symmetric adjacency matrix $W$. We calculate the Laplacian matrix $\Phi_0$ from $W$ and compute its eigenvectors and eigenvalues. The $r$ eigenvectors corresponding to the smallest eigenvalues (the smoothest graph frequencies) are selected as the columns of $U$. The entries of $V$ are sampled i.i.d. from a fixed distribution. Therefore, $L_0$ is low-rank and graph-smooth. We introduce gross errors with support drawn from an i.i.d. Bernoulli distribution: each entry is corrupted, with a large-magnitude value, with a fixed probability. We compare the proposed method, GL-SigRep [26], RPCAG [13], RPCA [25], and PCA on the data, estimating the low-rank matrix and the graph matrix. The estimation accuracies are evaluated by the relative reconstruction errors $\|L - L_0\|_F / \|L_0\|_F$ and $\|\Phi - \Phi_0\|_F / \|\Phi_0\|_F$. All the methods are initialized with the same feature similarity graph (treating each row of $X$ as a node), computed using the procedure in [14] with a K-nearest-neighbor strategy.
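The generation procedure above can be sketched in numpy as follows; the graph edge probability (0.2), corruption probability (0.1), and corruption magnitude (±5) are illustrative placeholders, since the exact values are not stated here:

```python
import numpy as np

rng = np.random.default_rng(4)
p, n, r = 50, 80, 5

# Erdos-Renyi ground-truth graph with uniform edge weights
A = (rng.random((p, p)) < 0.2) * rng.random((p, p))
W = np.triu(A, 1); W = W + W.T                       # symmetric adjacency
Phi0 = np.diag(W.sum(axis=1)) - W                    # ground-truth Laplacian

# graph-smooth low-rank component: span of the r smoothest eigenvectors
evals, evecs = np.linalg.eigh(Phi0)                  # eigenvalues in ascending order
U = evecs[:, :r]                                     # smallest eigenvalues
V = rng.standard_normal((n, r))
L0 = U @ V.T                                         # rank-r, graph-smooth

# sparse gross corruption: Bernoulli support, large magnitude
S0 = (rng.random((p, n)) < 0.1) * rng.choice([-5.0, 5.0], size=(p, n))
X = L0 + S0

print(np.linalg.matrix_rank(L0) <= r)  # True
```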
Table 1 shows the results on synthetic data generated by Eq. (12); the parameters $\lambda$, $\gamma$, and $\beta$ are set by cross-validation. The low-rank approximation of the proposed method achieves the smallest error. For the estimated graph matrix, the proposed method also achieves the smallest estimation error. The synthetic results show that the proposed method achieves good performance in extracting both the low-rank approximation and the underlying graph from non-Gaussian noisy data.
Table 1: Relative reconstruction errors on synthetic data.

Methods          | Low-Rank Matrix | Graph Matrix
Proposed Method  | 0.4278          | 0.3325
GL-SigRep        | 7.3263          | 1.4408
RPCAG            | 0.4657          | --
RPCA             | 3.5883          | --
PCA              | 7.2942          | --
4.2 Brain Imaging Data Experiment
We also apply the proposed method to a high-dimensional brain imaging dataset to extract the brain connectivity graph and a low-rank approximation of the measurements. This is practically useful for brain imaging studies: due to the high dimensionality of the data, the low signal-to-noise ratio, and the small number of available samples, it is challenging to estimate the low-rank approximation in these studies.
The brain imaging dataset used here is a set of magnetoencephalography (MEG) recordings of brain activity in response to two categories of visual stimuli: 320 face images and 192 non-face images. These images were randomly selected and displayed passively with no task; 16 subjects were asked to simply fixate at the center of the screen. All images were normalized for size and brightness among other factors, and each was displayed once for 300 ms with random inter-stimulus delay intervals of 1000 ms to 1500 ms. We used a recumbent Elekta MEG scanner with 306 sensors to record the brain activity for 1100 milliseconds (100 milliseconds before and 1000 milliseconds after the presentation) for each stimulus. The classification task in this experiment is to distinguish signals evoked by face images from signals evoked by non-face images.
4.2.1 Initial graph matrix
In the proposed method, a suitable starting point is important for solving the optimization problem. We therefore initialize $\Phi$ with the brain connectivity matrix generated from the resting-state measurements. The resting state in our experiment is the 100 ms of signal recorded before stimulus presentation. Note that our method and all other methods are initialized with the same connectivity matrix.
Three different types of brain connectivity graphs are commonly used in the literature: structural connectivity, functional connectivity and effective connectivity. Structural connectivity shows the anatomical structure in the brain; functional connectivity quantifies functional dependencies between different brain regions; and effective connectivity shows directed or causal relationship between different brain regions [7].
In this paper we use coherence connectivity, a functional connectivity measure quantifying oscillatory interdependency between different brain regions [16]. It is the frequency-domain analog of the cross-correlation coefficient. Given two signals $x_i(t)$ and $x_j(t)$ and a frequency of interest $f$, the first step is to spectrally decompose each signal at $f$ to obtain the instantaneous phase at each time point [27]. After bandpass filtering each signal around $f$, the convolution of the signal with a Morlet wavelet centered at frequency $f$ provides the instantaneous amplitude and phase at each time $t$. Thus, the two signals can be represented as $x_i(t) = a_i(t) \cos(\phi_i(t))$ and $x_j(t) = a_j(t) \cos(\phi_j(t))$, where $a_i$ and $a_j$ are amplitudes, and $\phi_i$ and $\phi_j$ are the phases of $x_i$ and $x_j$ at time $t$. The coherence connectivity edge is then calculated as:
$$W_{ij} = \frac{\left| \sum_t a_i(t)\, a_j(t)\, e^{\mathrm{i}\,(\phi_i(t) - \phi_j(t))} \right|}{\sqrt{\sum_t a_i(t)^2 \; \sum_t a_j(t)^2}} \qquad (13)$$
Once we obtain the adjacency matrix $W$, we compute the Laplacian matrix $\Phi = D - W$ as the initialization.
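A sketch of this coherence computation for one sensor pair; the Morlet parameters (7 cycles) and the toy sampling rate and frequency are assumptions, and the exact estimator of [27] may differ in detail:

```python
import numpy as np

def morlet(f, fs, n_cycles=7):
    """Complex Morlet wavelet at centre frequency f (Hz), sampling rate fs."""
    sigma_t = n_cycles / (2 * np.pi * f)
    t = np.arange(-3 * sigma_t, 3 * sigma_t, 1.0 / fs)
    return np.exp(2j * np.pi * f * t) * np.exp(-t**2 / (2 * sigma_t**2))

def coherence_edge(x, y, f, fs):
    """Coherence between two sensor time series at frequency f:
    |<a_x a_y e^{i(phi_x - phi_y)}>| / sqrt(<a_x^2><a_y^2>),
    with amplitude/phase taken from the Morlet-wavelet analytic signal."""
    w = morlet(f, fs)
    zx = np.convolve(x, w, mode="same")   # complex analytic signal of x at f
    zy = np.convolve(y, w, mode="same")
    num = np.abs(np.mean(zx * np.conj(zy)))
    den = np.sqrt(np.mean(np.abs(zx)**2) * np.mean(np.abs(zy)**2))
    return num / den

fs, f = 250.0, 10.0
t = np.arange(0, 2.0, 1.0 / fs)
x = np.sin(2 * np.pi * f * t)
y = np.sin(2 * np.pi * f * t + 0.5)       # same oscillation, phase-shifted
print(round(coherence_edge(x, y, f, fs), 2))  # phase-locked pair: coherence near 1
```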
4.2.2 Brain Imaging Data Experiment Results
To evaluate the performance of the proposed method, we use a supervised classifier (SVM) to classify the low-rank outputs into face and non-face classes. We compare the methods based on their classification accuracies, as well as on the compatibility of their connectivity graph matrices with neuroscience findings on cortical regions involved in face processing.
MEG and EEG components corresponding to the face/non-face distinction have been reported at latencies of approximately 100 ms and, more reliably, at 170 ms after visual stimulus onset (the N170 marker, reported at about 145 ms in MEG studies; see, e.g., [28, 29, 30, 31]). In this experiment, we therefore choose the data from two time slots, 96 ms to 105 ms and 141 ms to 150 ms after stimulus presentation, so that our automated assessment can be compared with the related neuroscience literature.
We applied the commonly employed mean subtraction to the time-locked data. Beyond this step, we refrained from providing any prior information to our method. Because the goal of this paper is to showcase the capabilities of the proposed method, we chose a purely data-driven approach to the estimation of the underlying connectivity graph. For example, we did not enforce constraints in the optimization on sensor sensitivity to field spread (i.e., nearby sensors record similar brain activities). Similarly, we did not differentiate between the magnetometers and gradiometers. With no prior information fed to the method alongside the data, the estimated underlying graph can be directly compared with the results of other methods; more specific experiments can be conducted in the future to reveal the effects of prior information on the results.
Given the input data and the initial graph, our proposed method outputs a low-rank estimate $L$. We decompose $L$ using the SVD, select the components according to the rank of $L$, project the data onto these components to obtain low-dimensional representations, and classify the reduced-dimension data. We compare the proposed method with GL-SigRep, RPCAG, RPCA, and PCA. The parameters $\lambda$, $\gamma$, and $\beta$ are set by cross-validation separately for each time slot (for step 2, only the ratio of $\gamma$ and $\beta$ affects the results). Table 2 shows the classification results for the different methods on the two time slots. The proposed method gives the best results for both time slots.
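The projection step described above (SVD of the low-rank estimate, then projection of the data onto its leading components) can be sketched as follows; the matrix sizes and the synthetic $L$ are illustrative, and the SVM training itself is omitted:

```python
import numpy as np

rng = np.random.default_rng(5)
p, n = 306, 60                                   # sensors x trials (illustrative)
L = rng.standard_normal((p, 4)) @ rng.standard_normal((4, n))  # toy low-rank estimate

# decompose L and keep components up to its numerical rank
U, s, Vt = np.linalg.svd(L, full_matrices=False)
r = int(np.sum(s > 1e-8 * s[0]))                 # numerical rank (here 4)
X_reduced = U[:, :r].T @ L                       # r x n low-dimensional features

print(X_reduced.shape)  # these features would then be fed to the classifier
```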
For another comparison, in Figure 1 we visualize the graph matrix estimated by our method and by the GL-SigRep method, by projecting the graph connectivity weights onto the MEG sensor locations. The first row of Figure 1 shows the initialization graph used for both methods, the coherence connectivity graph obtained from the resting-state data. The second row visualizes the outputs of the two methods for the data at the 100 ms and 145 ms time points.
Examining the graph visualization results from GL-SigRep (Figure 1, b and d), one can see that, using the data from 96–105 ms, GL-SigRep indicates connectivity at the left temporal and the middle and inferior frontal gyri. The graph estimated by GL-SigRep does not change significantly when using the data from 141–150 ms either. None of the regions highlighted by GL-SigRep has an apparent link to early visual processing as described in the neuroscientific literature (note that GL-SigRep assumes Gaussian noise and so may not be appropriate for MEG data). At 100 ms after presentation of a visual stimulus such as a face image, the neuroscientific literature reports that visual information is still being processed in early visual cortex at occipital and occipito-temporal regions (see, e.g., [29, 31]). Unlike the GL-SigRep results, at 96–105 ms the graph connectivity estimated by our method (Figure 1, a) spans more of the occipital and left occipito-temporal regions, and therefore appears to reach a more successful estimate of the true underlying connectivity at this time point.
The connectivity graph estimated by our method during 141 ms to 150 ms (Figure 1, c) converges on connections in the right occipito-temporal region. This connectivity is comparable to neuroscientific findings on face perception, and specifically the N170 marker. In several studies, such as [28, 30], the fusiform gyrus (in the occipito-temporal region) is implicated in face perception at about 145 ms after presentation of a face image stimulus, known as the N170 marker (named after its first discovery at 170 ms in EEG studies). Our technique reveals almost the same regions as important graph connections for face perception. The compatibility of our estimated graph connectivity with the neuroscientific literature further supports our proposed method over the others.
Table 2: Classification accuracies (SVM) and selected ranks for the two time slots.

                 | 96 ms–105 ms     | 141 ms–150 ms
Methods          | SVM     | rank   | SVM     | rank
Proposed Method  | 66.65%  | 21     | 81.91%  | 25
GL-SigRep        | 64.97%  | 21     | 78.93%  | 25
RPCAG            | 64.21%  | 23     | 81.15%  | 29
RPCA             | 62.89%  | 32     | 80.15%  | 33
PCA              | 64.22%  | 32     | 78.81%  | 33
Figure 1: Estimated graph connectivity projected onto the MEG sensor locations: (a) proposed method, 96–105 ms; (b) GL-SigRep, 96–105 ms; (c) proposed method, 141–150 ms; (d) GL-SigRep, 141–150 ms.
5 Conclusion
We propose an algorithm for learning the low-rank component and the graph simultaneously. It is suitable for cases where the perturbations on the low-rank component are gross but sparse. We showed that the proposed method is competitive on both synthetic data and brain imaging data, achieving good performance in both low-rank approximation and graph estimation. In addition, when applied to the brain imaging data, our method recovers a connectivity graph that is more compatible with the neuroscientific literature, indicating a better estimate of the true underlying graph. Future work will apply the proposed algorithm to other high-dimensional data [32].
References
 [1] David Shuman, Sunil K Narang, Pascal Frossard, Antonio Ortega, Pierre Vandergheynst, et al., “The emerging field of signal processing on graphs: Extending high-dimensional data analysis to networks and other irregular domains,” Signal Processing Magazine, IEEE, vol. 30, no. 3, pp. 83–98, 2013.
 [2] Xiaowen Dong, Dorina Thanou, Pascal Frossard, and Pierre Vandergheynst, “Laplacian matrix learning for smooth graph signal representation,” in Proc. ICASSP, 2015.
 [3] Eduardo Pavez and Antonio Ortega, “Generalized laplacian precision matrix estimation for graph signal processing,” in Proc. ICASSP, 2016.
 [4] Vassilis Kalofolias, “How to learn a graph from smooth signals,” in AISTATS 2016, 2016.
 [5] Kleovoulos Tsourides, Shahriar Shariat, Hossein Nejati, Tapan K Gandhi, Annie Cardinaux, Christopher T Simons, Ngai-Man Cheung, Vladimir Pavlovic, and Pawan Sinha, “Neural correlates of the food/non-food visual distinction,” Biological Psychology, 2016.
 [6] H. Nejati, K. Tsourides, V. Pomponiu, E.C. Ehrenberg, Ngai-Man Cheung, and P. Sinha, “Towards perception awareness: Perceptual event detection for brain computer interfaces,” in EMBC 2015, Aug 2015, pp. 1480–1483.
 [7] Ed Bullmore and Olaf Sporns, “Complex brain networks: graph theoretical analysis of structural and functional systems,” Nature Reviews Neuroscience, vol. 10, no. 3, pp. 186–198, 2009.
 [8] James S Hyde and Andrzej Jesmanowicz, “Crosscorrelation: an fmri signalprocessing strategy,” NeuroImage, vol. 62, no. 2, pp. 848–851, 2012.
 [9] Andrea Brovelli, Mingzhou Ding, Anders Ledberg, Yonghong Chen, Richard Nakamura, and Steven L Bressler, “Beta oscillations in a largescale sensorimotor cortical network: directional influences revealed by granger causality,” Proceedings of the National Academy of Sciences of the United States of America, vol. 101, no. 26, pp. 9849–9854, 2004.

 [10] Stephen Boyd, Neal Parikh, Eric Chu, Borja Peleato, and Jonathan Eckstein, “Distributed optimization and statistical learning via the alternating direction method of multipliers,” Foundations and Trends in Machine Learning, vol. 3, no. 1, pp. 1–122, 2011.
 [11] Chenhui Hu, Lin Cheng, Jorge Sepulcre, Georges El Fakhri, Yue M. Lu, and Quanzheng Li, “A graph theoretical regression model for brain connectivity learning of Alzheimer’s disease,” in ISBI 2013, San Francisco, CA, 7–11 Apr. 2013.
 [12] Bo Jiang, Chris Ding, Bin Luo, and Jin Tang, “Graph-Laplacian PCA: Closed-form solution and robustness,” in CVPR, 2013, pp. 3490–3496.

 [13] Nauman Shahid, Vassilis Kalofolias, Xavier Bresson, Michael Bronstein, and Pierre Vandergheynst, “Robust principal component analysis on graphs,” in Proceedings of International Conference on Computer Vision, Santiago, Chile, 2015, pp. 2812–2820.
 [14] Nauman Shahid, Nathanael Perraudin, Vassilis Kalofolias, Gilles Puy, and Pierre Vandergheynst, “Fast robust PCA on graphs,” arXiv:1507.08173v2 [cs.CV], 2016, pp. 1–17.
 [15] Weiyu Huang, Leah Goldsberry, Nicholas F Wymbs, Scott T Grafton, Danielle S Bassett, and Alejandro Ribeiro, “Graph frequency analysis of brain signals,” arXiv preprint arXiv:1512.00037v2, 2016.
 [16] Liu Rui, Hossein Nejati, and Ngai-Man Cheung, “Dimensionality reduction of brain imaging data using graph signal processing,” in ICIP. IEEE, 2016, pp. 1329–1333.
 [17] Hamid Behjat, Nora Leonardi, Leif Sörnmo, and Dimitri Van De Ville, “Anatomicallyadapted graph wavelets for improved grouplevel fmri activation mapping,” NeuroImage, vol. 123, pp. 185–199, 2015.
 [18] Wei Hu, Gene Cheung, Antonio Ortega, and Oscar C Au, “Multiresolution graph fourier transform for compression of piecewise smooth images,” Image Processing, IEEE Transactions on, vol. 24, no. 1, pp. 419–433, 2015.

 [19] Hilmi E Egilmez and Antonio Ortega, “Spectral anomaly detection using graph-based filtering for wireless sensor networks,” in ICASSP. IEEE, 2014, pp. 1085–1089.
 [20] Xiaowen Dong, Antonio Ortega, Pascal Frossard, and Pierre Vandergheynst, “Inference of mobility patterns via spectral graph wavelets,” in ICASSP. IEEE, 2013, pp. 3118–3122.

 [21] Jieqi Kang, Shan Lu, Weibo Gong, and Patrick A Kelly, “A complex network based feature extraction for image retrieval,” in ICIP. IEEE, 2014, pp. 2051–2055.
 [22] M. Belkin and P. Niyogi, “Laplacian eigenmaps for dimensionality reduction and data representation,” Neural Computation, vol. 15, no. 6, pp. 1373–1396, 2003.
 [23] Xiaofei He and Partha Niyogi, “Locality preserving projection,” in Proceedings of NIPS, 2003.
 [24] Shuicheng Yan, Dong Xu, Benyu Zhang, and Stephen Lin, “Graph embedding and extensions: a general framework for dimensionality reduction,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 2007.
 [25] Emmanuel J. Candès, Xiaodong Li, Yi Ma, and John Wright, “Robust principal component analysis?,” Journal of the ACM, vol. 58, no. 3, pp. 11:1–11:37, May 2011.
 [26] Xiaowen Dong, Dorina Thanou, Pascal Frossard, and Pierre Vandergheynst, “Learning laplacian matrix in smooth graph signal representations,” arXiv preprint arXiv:1406.7842v3, 2014.
 [27] Benjamin T Schmidt, Avniel S Ghuman, and Theodore J Huppert, “Whole brain functional connectivity using phase locking measures of resting state magnetoencephalography,” Front. Neurosci., vol. 8, no. 141, 2014.

 [28] David I. Perrett, Amanda J. Mistlin, and Andrew J. Chitty, “Visual neurones responsive to faces,” Trends in Neurosciences, vol. 10, no. 9, pp. 358–364, 1987.
 [29] S. Thorpe, D. Fize, and C. Marlot, “Speed of processing in the human visual system,” Nature, vol. 381, no. 6582, pp. 520–522, 1996.
 [30] Jia Liu, Alison Harris, and Nancy Kanwisher, “Stages of processing in face perception: an MEG study.,” Nature neuroscience, vol. 5, no. 9, pp. 910–916, Sept. 2002.
 [31] R. Desimone, “Face-selective cells in the temporal cortex of monkeys,” Journal of Cognitive Neuroscience, vol. 3, no. 1, pp. 1–8, 1991.
 [32] Lu Fang, Ngai-Man Cheung, D. Tian, A. Vetro, H. Sun, and O. Au, “An analytical model for synthesis distortion estimation in 3D video,” IEEE Transactions on Image Processing, vol. 23, no. 1, pp. 185–199, 2014.