The iterative reweighted Mixed-Norm Estimate for spatio-temporal MEG/EEG source reconstruction

by   Daniel Strohmeier, et al.
Télécom ParisTech

Source imaging based on magnetoencephalography (MEG) and electroencephalography (EEG) allows for the non-invasive analysis of brain activity with high temporal and good spatial resolution. As the bioelectromagnetic inverse problem is ill-posed, constraints are required. For the analysis of evoked brain activity, spatial sparsity of the neuronal activation is a common assumption. It is often taken into account using convex constraints based on the l1-norm. The resulting source estimates are however biased in amplitude and often suboptimal in terms of source selection due to high correlations in the forward model. In this work, we demonstrate that an inverse solver based on a block-separable penalty with a Frobenius norm per block and a l0.5-quasinorm over blocks addresses both of these issues. For solving the resulting non-convex optimization problem, we propose the iterative reweighted Mixed Norm Estimate (irMxNE), an optimization scheme based on iterative reweighted convex surrogate optimization problems, which are solved efficiently using a block coordinate descent scheme and an active set strategy. We compare the proposed sparse imaging method to the dSPM and the RAP-MUSIC approach based on two MEG data sets. We provide empirical evidence based on simulations and analysis of MEG data that the proposed method improves on the standard Mixed Norm Estimate (MxNE) in terms of amplitude bias, support recovery, and stability.



There are no comments yet.


page 8

page 10


Sparse and Smooth Signal Estimation: Convexification of L0 Formulations

Signal estimation problems with smoothness and sparsity priors can be na...

Data-driven cortical clustering to provide a family of plausible solutions to M/EEG inverse problem

The M/EEG inverse problem is ill-posed. Thus additional hypotheses are n...

Improving EEG Source Localization through Spatio-temporal Sparse Bayesian Learning

Sparse Bayesian Learning (SBL) approaches to the EEG inverse problem suc...

Non-convex optimization via strongly convex majoirziation-minimization

In this paper, we introduce a class of nonsmooth nonconvex least square ...

Electromagnetic Brain Imaging using Sparse Bayesian Learning – Noise Learning and Model Selection

Brain source reconstruction from electro- or magnetoencephalographic (EE...

Group level MEG/EEG source imaging via optimal transport: minimum Wasserstein estimates

Magnetoencephalography (MEG) and electroencephalogra-phy (EEG) are non-i...

Contextual Minimum-Norm Estimates (CMNE): A Deep Learning Method for Source Estimation in Neuronal Networks

Magnetoencephalography (MEG) and Electroencephalography (EEG) source est...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

I Introduction

Source imaging with magnetoencephalography (MEG) and electroencephalography (EEG) delivers insights into the active brain with high temporal and good spatial resolution in a non-invasive way [baillet-etal:2001]. It is based on solving the bioelectromagnetic inverse problem, which is a high dimensional ill-posed regression problem. In order to render its solution unique, constraints have to be imposed reflecting a priori assumptions on the neuronal sources. In the past, several source reconstruction techniques have been proposed, which are based on the assumption that only a few focal brain regions are involved in a specific cognitive task. Inverse methods favoring sparse focal source configurations to explain the MEG/EEG signals include parametric [scherg-etal:85], scanning [mosher-etal:1992, mosher-leahy:1998, mosher-leahy:1999, xu-etal:2004], and imaging approaches [Matsuura-Okabe:1995, Ou-etal:2009, gramfort-etal:2012, Lucka-etal:2012, Wipf-Nagarajan:2009, Sorrentino-etal:2009]. These techniques, which are partly used in clinical routine, are suitable e.g. for analyzing evoked responses or epileptic spike activity. Classic MEG/EEG source imaging technique using sparsity-inducing penalties are the Selective Minimum Norm Method [Matsuura-Okabe:1995] or Minimum Current Estimate (MCE) [Uutela-etal:1999]. Both approaches are based on the Lasso [Tibshirani:1994], i.e., regularized regression with an -norm penalty, which is a convex surrogate for the optimal, but NP hard -norm regularized regression problem. To reduce the sensitivity to noise and avoid discontinuous, scattered source activations [Ou-etal:2009], mixed norms such as the -mixed-norm used in Group Lasso [Yuan06GroupLasso] or Group Basis Pursuit[liao-etal:2014] can be applied. The idea is to take the spatio-temporal characteristics of neuronal activity into account by imposing structured sparsity in space or time [haufe-etal:2008, bolstad-etal:2009, Ou-etal:2009, gramfort-etal:2012]. We refer to [huang-etal:2012] for a general review on group selection in high-dimensional models. A prominent example is the Mixed-Norm Estimate (MxNE) proposed in [gramfort-etal:2012]

, which extends the MCE to multiple measurement vector problems by applying a block-separable convex penalty. Each block represents the source activation over time of a dipole with free orientation at a specific source location. Spatial sparsity is promoted by an

-norm penalty over blocks, whereas a Frobenius norm per block promotes stationary source estimates, i.e., a source with a non-zero amplitude at one time instant has a non-zero amplitude during the full time window of interest [gramfort-etal:2012, gramfort-etal:2013]. The Frobenius norm also prevents the orientations of the free orientation dipoles from being biased towards the coordinate axes [chang-etal:2013]. These convex approaches allow for fast algorithms with guaranteed global convergence. However, the resulting source estimates are biased in amplitude and often suboptimal in terms of support recovery [Candes], which is impaired by the high spatial correlation of the MEG/EEG forward model. As shown e.g. in the field of compressed sensing, promoting sparsity by applying non-convex penalties, such as logarithmic or -quasinorm penalties with

, improves support reconstruction in terms of feature selection, amplitude bias, and stability 

[Candes, chartrand:2007a, saab-etal:2008]. Several approaches for solving the resulting non-convex optimization problem have been proposed including generalized shrinkage[woodworth-chartrand:2015], iterative reweighted [Candes, Gasso, Rakotomamonjy, zhang-rao:2011], or iterative reweighted optimization [gorodnitsky-rao:1997, rao-etal:1999, chartrand-yin:2008, cotter-etal:2005, zhang-rao:2011c]. See [wipf-nagarajan:2010, Rakotomamonjy] for a review of these approaches for single and multiple measurement vector problems. Several MEG/EEG sparse source imaging techniques based on iterative reweighted optimization have been proposed [gorodnitsky-etal:1995, gorodnitsky-rao:1997, portniaguine-zhdanov:1999, nagarajan-etal:2006, liu-etal:2005]. An iterative reweighted optimization technique for EEG source imaging was proposed in [xu-etal:2007], which however does not impose structured sparsity and applies a fixed orientation constraint [Lin-etal:2006]. In this paper, we propose the iterative reweighted Mixed-Norm Estimate (irMxNE), a novel MEG/EEG sparse source imaging approach based on the framework of iterative reweighted , which promotes structured sparsity to improve MEG/EEG source reconstruction. A preliminary version of this method was presented in [Strohmeier-etal:2014]. Similar approaches have recently been proposed in other fields of research [wipf-nagarajan:2010, zhang-rao:2011b]. The irMxNE is based on a non-convex block-separable penalty, which combines a Frobenius norm per block and an -quasinorm over blocks. The non-convex objective function is minimized iteratively by computing a sequence of weighted MxNE problems. For solving the convex surrogate problems, we propose a new computationally efficient strategy, which combines block coordinate descent [tseng, Rakotomamonjy, BCD_MMV] and a forward active set strategy with convergence controlled by means of the duality gap, which converges significantly faster than the original MxNE algorithm proposed in [gramfort-etal:2012]. We provide information on the integration of different source orientation constraints [Lin-etal:2006] and discuss specific problems of MEG/EEG source imaging such as depth bias compensation and amplitude bias correction. We present empirical evidence using simulations and analysis of two experimental MEG data sets that the proposed method outperforms MCE and MxNE in terms of amplitude bias, active source identification, and stability. Finally, we compare the proposed approach with the dSPM [Dale-etal:2000] and RAP-MUSIC method [mosher-leahy:1999] based on two MEG data sets.

Notation: We mark vectors with bold letters, , and matrices with capital bold letters, . The transpose of a vector or matrix is denoted by and . The scalar is the i element of . corresponds to the i row and to the j column of . indicates the Frobenius norm, and the spectral norm of a matrix.

Ii Materials and Methods

Ii-a The inverse problem

The MEG/EEG forward problem describes the linear relationship between the MEG/EEG measurements ( number of sensors, number of time instants) and the source activation ( number of source locations, number of orthogonal dipoles per source location with if source orientation is postulated, e.g. using the cortical constraint [Dale:1993], and typically otherwise). The model then reads:


where is the gain or leadfield matrix, a known instantaneous mixing matrix, which links source and sensor signals. is the measurement noise, which is assumed to be additive, white, and Gaussian, for all . This assumption is acceptable on the basis of a proper spatial whitening of the data using an estimate of the noise covariance [engemann-etal:14]. As , the MEG/EEG inverse problem is ill-posed and constraints have to be imposed on the source activation matrix to render the solution unique. By partitioning into S blocks , where each represents the source activation at a specific source location over time and across orthogonal current dipoles, we can apply a penalty term promoting block sparsity by combining a Frobenius norm per block and a -quasinorm penalty over blocks. The optimization problem reads:


where is the regularization parameter balancing the data fit and penalty term. Similar to the constraint applied in MxNE [gramfort-etal:2012], promotes source estimates with only a few focal sources that have non-zero activations during the entire time interval of interest. The Frobenius norm per block , which combines -norm penalties over time and orientation as proposed in [Uutela-etal:1999, Ou-etal:2009, gramfort-etal:2013], imposes stationarity of the source estimate and prevents the source orientations from being biased towards the coordinate axes [chang-etal:2013]. The -quasinorm penalty promotes spatial sparsity.

Ii-B Iterative reweighted Mixed Norm Estimate

The proposed block-separable regularization functional is an extension of the -quasinorm penalty with used in [Gasso, Candes, Rakotomamonjy, cotter-etal:2005]. These works showed, based on the framework of Difference of Convex functions programming or Majorization-Minimization algorithms, that the resulting non-convex optimization problem can be solved by iteratively solving a sequence of weighted convex surrogate optimization problems with weights being defined based on the previous estimate. The convex surrogate problem is obtained by replacing the non-decreasing concave function with a convex upper bound using a local linear approximation at the current estimate. By solving this sequence of surrogate problems, the value of the non-convex objective function decreases, but without guarantee for convergence to a global minimum. The cost function in Eq. (2) can thus be minimized by computing the sequence of convex problems given in Eq. (3). The weights for the k iteration are obtained from the previous source estimate . Intuitively, sources with high amplitudes in the (k-1) iteration will be less penalized in the k iteration and therefore further promoted.


As each iteration is equivalent to solving a weighted MxNE problem, we call this optimization scheme the iterative reweighted MxNE (irMxNE). Due to the non-convexity of the optimization problem in Eq. (2), the procedure is sensitive to the initialization of . In this paper, we use for all as proposed in [Gasso]. Consequently, the first iteration of irMxNE is equivalent to solving a standard MxNE problem. As each iteration of the iterative scheme in Eq. (3) solves a convex problem with guaranteed global convergence, the initialization of has no influence on the final solution. can thus be chosen arbitrarily and we use warm starts for improving the computation time. For sources with , Eq. (3) has an infinite regularization term. Typically, a smoothing parameter is added to avoid weights to become zero [Gasso, Candes, chartrand-yin:2008]. Here, we reformulate the weighted MxNE subproblem and apply the weights without epsilon smoothing by scaling the gain matrix with a weighting matrix as given in Eq. (4). After convergence, we reapply the scaling to to obtain the final estimate .


with being a diagonal matrix, which is computed according to Eq. (5):


where is a vector of ones and is the Kronecker product. In each MxNE iteration, we restrict the source space to source locations with to reduce the computation time.

We control the global convergence of each weighted MxNE subproblems in Eq. (4) by monitoring the duality gap. For details on convex duality in the context of optimization with sparsity-inducing penalties, we refer to [bach-etal:2012]. In the following, we summarize the rationale for this stopping criterion. For a general minimization problem, the minimum of the primal objective function is bounded below by the maximum of the associated dual objective function , i.e., , where and are the optimal solutions of the primal and dual problem. The duality gap , where and are the current values of the primal and dual variable, is thus non-negative and provides an upper bound on the difference between and . If strong duality holds, the duality gap at the optimum is zero. To use this stopping criterion in practice, we need to derive the dual problem and choose a good feasible dual variable given a value of , which allows for at the optimum.

Due to Slater’s conditions [Boyd_Vandenberghe04], strong duality holds for the MxNE subproblem and we can check convergence of an iterative optimization scheme solving Eq. (4) by computing the current duality gap . Based on the Fenchel-Rockafellar duality theorem [rockafellar:1997], the dual objective function associated to the primal objective function

is given in Eq. (6). For a detailed derivation, we refer to [gramfort-etal:2012].


where indicates the trace of a square matrix, and the Fenchel conjugate of , which is the indicator function of the associated dual norm. As shown in [gramfort-etal:2012], a natural mapping from the primal to the dual space is given by a scaling of the residual . This is motivated by the fact that the solution of the dual problem at the optimum is proportional to the residual, which follows from the associated KKT conditions [gramfort-etal:2012]. The scaling is done according to Eq. (7) such that the dual variable satisfies the constraint of .


In practice, we terminate the iterative optimization scheme for solving MxNE, when the estimate at the i inner iteration is -optimal with , i.e., . According to [gramfort-etal:2012], this is a conservative choice provided that the data is scaled by spatial pre-whitening.

For solving the weighted MxNE subproblems, we propose a block coordinate descent (BCD) scheme [tseng], which, for the problem at hand, converges faster than the Fast Iterative Shrinkage-Thresholding algorithm (FISTA) proposed earlier in [gramfort-etal:2012] (cf. section III-B). A BCD scheme for solving the Group LASSO was proposed in [Rakotomamonjy, BCD_MMV]. The subproblem per block has a closed form solution, which involves applying the group soft-thresholding operator, the proximity operator associated to the -mixed-norm [gramfort-etal:2012]. Accordingly, the closed form solution for the BCD subproblems solving the MxNE problem can be derived, which is given in Eq. (8).


The step length for each BCD subproblem is determined by with being the Lipschitz constant of the data-fit restricted to the s source location. This step length is typically larger than the step length applicable in iterative proximal gradient methods, which is upper-bounded by the inverse of . Pseudo code for the BCD scheme is shown in Algorithm 1.

0:  , , , , , , and .
1:  Initialization:
2:  while  do
3:     for  to  do
4:         Solve Eq. (8) with , , and
5:     end for
7:  end while
Algorithm 1 MxNE with BCD

The BCD scheme is typically applied using a cyclic sweep pattern, i.e., all blocks are updated in a cyclic order in each iteration. However, as the penalty term in Eq. (2) promotes spatial sparsity, most of the blocks of are zero. We can thus reduce the computation time by primarily updating blocks, that are likely to be non-zero, while keeping the remaining blocks at zero. For this purpose, data-dependent sweep patterns (such as greedy approaches based on steepest descent [li-osher:2009, wei-etal:2012]) or active set strategies [friedman-etal:2010, roth-etal:08] can be applied. In this paper, we combine BCD with a forward active set strategy proposed in [roth-etal:08, gramfort-etal:2012]. Pseudo code for the proposed MxNE solver is provided in Algorithm 2.

0:  , , , , and .
1:  Initialization: , ,
2:  for  to  do
4:  end for
5:  while  do
8:     Define and by restricting and to
9:      Solve Algorithm 1 with , and
12:  end while
Algorithm 2 MxNE with BCD and active set strategy

We start by estimating an initial active set of sources by evaluating the Karush-Kuhn-Tucker (KKT) optimality conditions, which state that if [gramfort-etal:2012]. We select the N sources as the initial active set, which violate this condition the most (we use in practice). Subsequently, we restrict the source space to the sources in and estimate by solving Eq. (4) with convergence controlled by the duality gap. After convergence of this restricted optimization problem, we check whether is also an -optimal solution for the original optimization problem (without restricting the source space to ) by computing the corresponding duality gap assuming that all sources, which are not part of the active set, have zero activation. If is not an -optimal solution indicated by , we re-evaluate the KKT optimality conditions and update the active set by adding the N sources with the highest score. We then repeat the procedure with warm start.

We terminate irMxNE when with a user specified threshold , which we set to in practice. The proposed optimization algorithm for irMxNE is fast enough to allow its usage in practical MEG/EEG applications. Full pseudo code for irMxNE is provided in Algorithm 3.

0:  , , , , , and .
1:  Initialization: ,
2:  for  to  do
4:      Solve Algorithm 2 with and
6:     if  then
7:        break
8:     end if
9:      Solve Eq. 5 with
10:  end for
Algorithm 3 Iterative reweighted MxNE

Ii-C Source constraints and bias

Ii-C1 Source orientation

The proposed BCD scheme is applicable for MEG/EEG inverse problems without and with orientation constraint. For imposing a loose orientation constraint [Lin-etal:2006], we apply a weighting matrix to each block of the gain matrix with corresponding to the dipole orientated normally to the cortical surface, and and to the two tangential dipoles. The weighting parameter controls up to which angle the rotating dipole may deviate from the normal direction [Lin-etal:2006, gramfort-etal:2013]. The orientation-weighted gain matrix is hence defined as , where

is the identity matrix. Since the penalty in Eq. (

8) does not promote sparsity along orientations, the irMxNE result is not biased towards the coordinate axes [chang-etal:2013]. When the source orientation is postulated a priori (e.g. normal to the cortical surface), each block corresponds to the activation of a fixed dipole. Consequently, the Frobenius norm per block can be replaced by the -norm of the source activation of the corresponding fixed dipole.

Ii-C2 Depth bias compensation

Due to the attenuation of the bioelectromagnetic field with increasing distance between source and sensor, deep sources require higher source amplitudes to generate sensor signals of equal strength compared to superficial sources. Consequently, inverse methods, which are based on constraints penalizing the source amplitudes, have a bias towards superficial sources. In order to compensate this bias, each block of the gain matrix is weighted a priori. Here, we apply the depth bias compensation proposed in [huang-etal:2014], which computes the weights used for depth bias compensation based on the SVD of the gain matrix.

Ii-C3 Amplitude bias compensation

Source activation estimated with source reconstruction approaches based on -quasinorms with , such as MxNE and irMxNE, show a varying degree of amplitude bias due to the inherent shrinkage. The standard practice for compensating the amplitude bias consists in computing the least squares fit after restricting the source space to the support of , which is typically an over-determined optimization problem. In contrast, we apply the debiasing approach proposed in [gramfort-etal:2013], which preserves the source characteristics and orientations estimated with irMxNE by estimating a scaling factor for each source, which is constrained to be above 1 and constant over orientation and time. The bias corrected source estimate is computed using as , where the diagonal scaling matrix is estimated based on the convex problem:

Ii-D Simulation setup

We compare MCE, MxNE and irMxNE in terms of amplitude bias, support recovery, and stability using simulated auditory evoked fields. The simulation, which was repeated 100 times, is based on a real gain matrix computed with a three-shell boundary element model using 4699 cortical sources with fixed orientation (normal to the cortical surface), and a 306-channels Elekta Neuromag Vectorview system (Elekta Neuromag Oy, Helsinki, Finland) with 102 magnetometers and 204 gradiometers. The sampling rate was set to 1 kHz and we restricted the analysis to the time window from 60 ms to 150 ms. We generated single trials by activating two dipolar sources, one in each transverse temporal gyrus, with Gaussian functions peaking at 100 ms and 110 ms with a peak amplitude of 55 nAm and 45 nAm,

. Background activity was generated by ten dipolar sources placed randomly on the cortical surface. Each dipole was activated with filtered white noise with a peak amplitude of 100 nAm. The filter coefficients were determined by fitting an auto-regressive process of order 5 to real baseline MEG data 

[gramfort-etal:2013]. By averaging 100 single trials, the SNR of the evoked response, which we compute using spatial whitened data as , was set to . Source reconstruction was computed without orientation constraint, where none of the dipoles used to generate the gain matrix was oriented perpendicularly to the cortical surface. All methods were applied with different regularization parameters given as a percentage of the respective , which is the smallest regularization parameter leading to an empty active set [gramfort-etal:2012]. We evaluate the source reconstruction performance by means of the true and false positives counts. We consider a source to be a true positive, if its geodesic distance along the cortical surface from the true source location is less than 1 cm. A value of 1 cm is what would be considered an acceptable localization error for most neuroscience applications. Moreover, we present the active set size and the root mean square error in the sensor space, RMSE = . To evaluate the stability of the reconstructed support, we compute Krippendorff’s alpha [hayes-krippendorff:2007].

Ii-E Experimental MEG data

We evaluate the performance of MxNE and irMxNE using data from the MIND multi-site MEG study [aine-etal:2012, weisend-etal:2007, Ou-etal:2007]. We use two different data sets from one exemplary subject, auditory evoked fields (AEF) and somatosensory evoked fields (SEF), recorded using the 306-channels Elekta Neuromag Vectorview system. A detailed description of the data and paradigms can be found in [aine-etal:2012, weisend-etal:2007, Ou-etal:2007]. For the AEF data set, we report results for AEFs evoked by left auditory stimulation with pure tones of 500 Hz. The analysis window for source estimation was chosen from 50 ms to 200 ms based on visual inspection of the evoked data to capture the dominant N100m component. For the SEF data set, we analyzed SEFs evoked by bipolar electrical stimulation (0.2 ms in duration) of the left median nerve. To capture the main peaks of the evoked response and to exclude the strong stimulus artifact, the analysis window was chosen from 18 ms to 200 ms based on visual inspection. Following the standard pipeline from the MNE software [mne]

, signal preprocessing for both data sets consisted of signal-space projection for suppressing environmental noise, and baseline correction using pre-stimulus data (from -200 ms to -20 ms). Epochs with peak-to-peak amplitudes exceeding predefined rejection parameters (3 pT for magnetometers, 400 pT/m for gradiometers, and 150 μV for EOG) were assumed to be affected by artifacts and discarded. This resulted in 96 (AEF) and 294 (SEF) artifact-free epochs, which were resampled to 500 Hz. The gain matrix was computed using a set of 7498 cortical locations, and a three-layer boundary element model. The stability of the source reconstruction was tested using a resampling technique. For each data set, we generated 100 random sets of epochs by randomly selecting 80% of all available epochs without replacement. The noise covariance matrix for spatial whitening was estimated for each subsample using pre-stimulus data (from -200 ms to -20 ms). We applied both MxNE and irMxNE on the average of each random set without orientation constraint. Due to the lack of a ground truth, the source reconstruction performance is evaluated by means of the Goodness of Fit (GOF) and the active set size. To compare results with well established source reconstruction techniques, we compute the dSPM solution 

[Dale-etal:2000] without orientation constraint and the RAP-MUSIC estimate [mosher-leahy:1999] for both data sets using the MNE-Python software [mne-python]. For RAP-MUSIC, we use single-dipole and two-dipole independent topographies to address the problem of correlated sources [mosher-leahy:1998]. A similar idea is pursued e.g. by dual core beamformers [diwakar-etal:2011]. The correlation threshold was set to 0.95 as proposed by Mosher et al. [mosher-leahy:1998]

. The rank of the signal subspace was determined by thresholding the eigenvalues of the data covariance based on an estimate of the noise variance. As we apply a spatial whitening, the threshold was set to 1.

Iii Results

Iii-a Simulation study

The results of the simulation study (100 repetitions) for different regularization parameters (from 5 to 100 % of ) are presented in Fig. 1.

Fig. 1: Results of the simulation study based on simulated AEFs for MCE, MxNE and irMxNE. The source space contained a total of 4699 sources and the simulation was repeated 100 times: (a) mean true positive count (left A1), (b) mean true positive count (right A1), (c) mean false positive count, (d) mean active set size, (e) mean RMSE without (solid) and with (dashed) debiasing, and (f) Krippendorff’s .

The source space contained 4699 sources and one source per hemisphere was active indicated by the horizontal dashed lines in Fig. 1a and b. True positive counts above this threshold indicate suboptimally sparse source estimates, whereas counts close to zero indicate false negatives. We can see that the irMxNE approach provides the best support recovery. It allows to reconstructs single dipoles in both regions of interest, whereas MCE and MxNE find multiple correlated sources. Particularly for low values of , MCE and MxNE overestimate the size of the active set leading to a large number of false positives, whereas irMxNE generates significantly less false positives. The mean active set size confirms that irMxNE provides the sparsest result of all three methods. The mean RMSE is shown in Fig. 1e. While all methods profit from the debiasing procedure, the effect on irMxNE is less pronounced compared to the other methods indicating a reduced amplitude bias. The best result is obtained with irMxNE. Krippendorff’s indicates that the support reconstructed with irMxNE is more stable compared to MCE or MxNE. The source estimate is thus less dependent on the epochs used for generating the evoked response.

Iii-B Experimental MEG data

Iii-B1 Auditory evoked fields

We first compare the performance of the proposed BCD scheme for solving the weighted MxNE with the Fast Iterative Shrinkage Thresholding Algorithm (FISTA) [Beck09Fista], a proximal gradient method used in [gramfort-etal:2012]. Both methods were applied with and without active set strategy. All computations were performed on a computer with a 2.4 GHz Intel Core 2 Duo processor and 8 GB RAM. The computation times as a function of are presented in Fig. 2. The BCD scheme outperforms FISTA both with and without active set strategy. Combining the BCD scheme and the active set strategy reduces the computation time by a factor of 100 and allows to compute the MxNE on real MEG/EEG data in a few seconds. Since subsequent MxNE iterations are significantly faster due to the restriction of the source space, irMxNE also runs in a few seconds on real MEG/EEG source localization problems.

Fig. 2: Computation time as a function of for MxNE on real MEG data (free orientation) using BCD and FISTA with (solid) and without (dashed) active set strategy.

We applied MxNE and irMxNE (with and without debiasing) with different regularization parameters to 100 AEF data sets generated by averaging randomly selected subsets of epochs. The mean GOF around the N100m component (from 90 ms to 150 ms) and the mean active set size are presented in Fig. 3 as a function of . We can see that the debiasing procedure has a strong effect on MxNE, whereas the GOF of the irMxNE result is only slightly improved indicating less amplitude bias. Debiased MxNE and irMxNE yield similar GOFs with similar plateaus, but irMxNE provides a sparser, i.e., simpler model.

Fig. 3: Mean GOF, and active set size for MxNE and irMxNE without (solid) and with (dashed) debiasing for the AEF data.

The selection probability for all sources being, at least once, part of the active set obtained with MxNE or irMxNE is shown in Fig. 

4. MxNE selects multiple sources with high probability within each region of interest, which is a consequence of the correlated design. The irMxNE approach is more selective and provides sparser source estimates. Moreover, the number of false positives, i.e., sources outside of the regions of interest, is lower for irMxNE, particularly for low values of .

Fig. 4: Source selection probability for the AEF data set using MxNE (left) and irMxNE (right). The plot is restricted to sources that are active in at least one random subsample. The colored patches on the inflated brain indicate regions of interest based on anatomical labels (green, yellow, red). Source indices, which are in the regions of interest, are highlighted by corresponding color marks. The transversal temporal gyrus is indicated in green.

Exemplary source reconstructions for debiased MxNE and irMxNE are illustrated in Fig. 5. For comparison, we present a RAP-MUSIC estimate based on single- and two-dipole independent topographies [mosher-leahy:1999, mosher-leahy:1998]. The maximum dSPM score [Dale-etal:2000] per source is shown as an overlay on each cortical surface.

(a) MxNE with (b) MxNE with (c) RAP-MUSIC (2 independent topographies) (d) irMxNE with
Fig. 5: Source reconstruction results using AEFs evoked by left auditory stimulation. The estimated source locations for MxNE (a, b), RAP-MUSIC (c) and irMxNE (d), indicated by colored spheres, and the corresponding time courses are color-coded. The maximum of the dSPM estimate per source, which is thresholded for visualization purposes, is shown as an overlay on each cortical surface.

MxNE with shows activation in both primary auditory cortices with main peaks around 110 ms corresponding to the N100m component. The activation on the right hemisphere is however split into two highly correlated dipoles, which are partly located on the wrong side of the Sylvian fissure. Increasing does not fix the latter issue, since dipoles in the left primary auditory cortex are eliminated before actually erasing spurious activity on the right hemisphere. The loss of the active source in the left auditory cortex is also indicated by the drop of the GOF in Fig. 3. The size of the signal subspace for RAP-MUSIC was estimated to be 50 by the thresholding procedure. Being based on an empirical estimate of the data covariance, this procedure tends to overselect the rank of the signal subspace [mosher-leahy:1999] and the RAP-MUSIC estimate depends on the correlation threshold. With the setting proposed in [mosher-leahy:1998], only two independent topographies, a single- and a two-dipole topography, yield sufficient subspace correlations. The dipoles are reconstructed close to the primary auditory cortex on both hemispheres. The GOF of the three-dipole model is 86.7%. Using single- and two-dipole topographies provides better RAP-MUSIC estimates than using only single-dipole topographies. The irMxNE with , which converged after 10 iterations, reconstructs single dipoles in both primary auditory cortices. Intuitively, the green and blue sources, which are the strongest sources according to MxNE with

, are favored at the next iteration of the reweighted scheme pruning out the source on the wrong side of the Sylvian fissure present in the MxNE result. The estimated source locations roughly match the peaks of the dSPM estimate. The source estimate obtained with dSPM or similar linear inverse methods (sLORETA, MNE, etc.) is however spatially smeared. To reduce the smearing of the dSPM estimate, one could increase the threshold, yet it would make it time dependent and certainly too high to see weaker sources. Alternatively, post-processing is generally required, e.g. by defining regions of interest, to improve interpretability. Note also that, in contrast to dSPM, source amplitudes obtained with irMxNE are moments of electrical dipoles expressed in nAm, which is similar to dipole fitting procedures 

[scherg-etal:85]. The GOF of the two-dipole model obtained with irMxNE is 81.9% and thus only slightly lower than the three-dipole model obtained with RAP-MUSIC.

Iii-B2 Somatosensory evoked fields

We applied MxNE and irMxNE (with and without debiasing) with different regularization parameters to 100 averaged random subsets of epochs of the SEF data set. The mean GOF and the corresponding active set size for MxNE and irMxNE (with and without debiasing) as a function of the regularization parameter are shown in Fig. 6. We can see again that irMxNE yields significantly sparser source estimates, which however allow for a better GOF compared to MxNE. The GOFs of the MxNE and irMxNE results with and without debiasing illustrate also that the irMxNE source estimates are less biased in amplitude.

Fig. 6: Mean GOF, and active set size for MxNE and irMxNE without (solid) and with (dashed) debiasing for the SEF data.

Fig. 7 presents the selection probability for all sources, which are non-zero in at least one MxNE or irMxNE estimate. The irMxNE typically selects only one source per region of interest for different values of . The number of false positives is also significantly lower. These results confirm the findings obtained from the AEF data set in section III-B1. The stability analysis reveals also that the source in the ipsilateral secondary somatosensory cortex (iS2) is less stable compared to the contralateral sources, which might be caused by its relatively weak field pattern [Sorrentino-etal:2009].

Fig. 7: Selection probability for sources obtained with MxNE (left) and irMxNE (right) for the SEF data. The plot is restricted to sources that are active in at least one random subsample. The colored patches on the inflated brain indicate regions of interest based on anatomical labels (green, yellow, red). Source indices, which are in the regions of interest, are highlighted by corresponding color marks.

Fig. 8 presents source reconstruction results obtained with MxNE and irMxNE for selected regularization parameters.

(a) MxNE with (b) MxNE with (c) RAP-MUSIC (4 independent topographies) (d) irMxNE with
Fig. 8: Source reconstruction results using SEFs evoked by electrical stimulation of the left median nerve. The estimated source locations for MxNE (a, b), RAP-MUSIC (c) and irMxNE (d), indicated by colored spheres, and the corresponding time courses are color-coded. The maximum of the dSPM estimate per source, which is thresholded for visualization purposes, is shown as an overlay on each cortical surface.

As in section III-B1, we show a RAP-MUSIC estimate and added the maximum dSPM score per source as an overlay to all subfigures. MxNE with reconstructs dipoles in the contralateral primary somatosensory cortex (cS1), the contralateral and ipsilateral secondary somatosensory cortices (cS2 and iS2), and the posterial parietal cortex (cPPC). The source locations roughly coincide with the main peaks of the dSPM estimate. As for the AEF data set, the source activation per region is split into several correlated dipoles. An increase of the regularization parameter results in a loss of physiologically meaningful source activity such as activation in iS2, which is visible in the dSPM estimate. The relevance of this activation is also indicated by the drop of the GOF in Fig. 6. The signal subspace estimation for RAP-MUSIC yields a signal subspace size of 43. Hence, the RAP-MUSIC estimate depends on the choice of the correlation threshold. For our settings, four independent topographies, two single- and two two-dipole topographies, are above the subspace correlation threshold. Dipoles are reconstructed in all relevant areas. The activation in cS1 is split into several dipoles, probably due to the fixed orientation source model applied in RAP-MUSIC. The GOF of this six-dipole model is 82,6%. The RAP-MUSIC estimate benefits from using single and two-dipole topographies. The irMxNE approach with converged after 14 reweightings. The resulting source estimate contains four single dipoles representing activation in each of the four regions. The GOF of the four-dipole model obtained with irMxNE is 81.4% and thus higher than the GOF of the corresponding MxNE estimate and only slightly lower than the GOF of the six-dipole model obtained with RAP-MUSIC.

Iv Discussion and Conclusion

In this work, we presented irMxNE, an MEG/EEG inverse solver based on regularized regression with a non-convex block-separable penalty. The non-convex optimization problem is solved by iteratively solving a sequence of weighted MxNE problems, which allows for fast algorithms and global convergence control at each iteration. We proposed a new algorithm for solving the MxNE surrogate problems combining BCD and a forward active set strategy, which significantly decreases the computation time compared to the original MxNE algorithm [gramfort-etal:2012]. This new algorithm makes the proposed iterative reweighted optimization scheme applicable for practical MEG/EEG applications. The approach is also applicable to other block-separable non-convex penalties such as the logarithmic penalty proposed in [Candes] by adapting the definition of the weights in Eq. (5). The irMxNE method is designed for offline source reconstruction, which is still the main application of MEG/EEG source imaging in research and clinical routine. However, we are aware of a growing interest in real-time brain monitoring [dinh:2015]. New techniques such as parallel BCD schemes [li-osher:2009], clustering approaches [dinh-etal:2015], and safe rules [fercoq-etal:2015] can help to further reduce the computation time. As proposed in [Candes, Gasso], the first iteration of irMxNE is equivalent to computing the standard MxNE. Consequently, the irMxNE result is at least as sparse as the MxNE estimate. The iterative reweighting procedure can thus be considered as a post-processing for MxNE improving source recovery, stability, and amplitude bias. This was confirmed by empirical results based on simulations and two MEG data sets. We attribute this to the spatial correlation and the poor conditioning of the forward operator in MEG/EEG source analysis. An alternative approach to improve the conditioning of the inverse problem based on clustering the columns of the gain matrix is presented in [dinh-etal:2015], which however affects the spatial resolution. The source locations reconstructed by irMxNE roughly coincided with the main peaks of the dSPM estimate, which demonstrate that the proposed inverse solver can present a simple and easy-to-interpret spatio-temporal picture of the active sources. The models reconstructed with RAP-MUSIC provided a slightly higher goodness of fit, but contained more active sources. We found that RAP-MUSIC benefits from using single-dipole and two-dipole independent topographies. The use of higher-order source models, which are limited to a small number of correlated sources, however significantly increases the computational complexity, particularly for source spaces with high resolution. Approaches improving the computation time of RAP-MUSIC are presented in [dinh:2015]. In contrast, irMxNE makes no assumption on the number of correlated sources. Its computation time is not dramatically affected by the resolution of the source space. Model selection for sparse source imaging approaches, which amounts here to choosing the regularization parameter, is a critical aspect. Automatic approaches based on minimizing the prediction error such as cross-validation tend to overestimate the number of active sources and increase the false positive rate. Here, we selected the regularization parameter based on the GOF and the size of the active set. A similar procedure is used e.g. in sequential dipole fitting. The development of an automatic model selection procedure for the proposed inverse solver is future work. In particular, approaches maximizing model stability are an interesting alternative [lim-yu:2015, sun-etal:2013]. Model selection is however a general issue in MEG/EEG source reconstruction. In our comparison with RAP-MUSIC, we found e.g. that the size of the signal subspace and the correlation threshold have a strong influence on the final source estimate. Due to the limited number of samples, we can only obtain an empirical estimate of the data covariance, which involves the risk of overestimating the size of the signal subspace using the thresholding procedure. The correlation threshold is used to switch to higher order source models and to account for noise components in the signal subspace. Similar to MxNE, irMxNE assumes that the locations of active sources is constant over time. Hence, it should be applied to data, for which this model assumption is approximately true, e.g., by selecting intervals of interest or applying a moving window approach. To go beyond stationary sources, the reconstruction of non-stationary focal source activation can be improved by applying sparsity constraints in the time-frequency domain such as in the TF-MxNE [gramfort-etal:2013]. Preliminary results on the application of non-convex regularization for such models based on iterative reweighting procedures were presented in [strohmeier-etal:2015]. The irMxNE solver is available in the MNE-Python package [mne-python].


The authors would like to thank M. S. Hämäläinen for providing the experimental MEG data sets.