HemCNN: Deep Learning enables decoding of fNIRS cortical signals in hand grip motor tasks

by Pablo Ortega et al.
Imperial College London

We address the fNIRS left/right hand force decoding problem with a data-driven approach, using a convolutional neural network architecture, HemCNN. We test HemCNN's ability to decode, in a streaming fashion, which hand (left or right) executed a grasp from fNIRS data. HemCNN learned to detect which hand executed a grasp at a naturalistic hand-action speed of 1 Hz, outperforming standard methods. Since HemCNN does not require baseline correction and the convolution operation is invariant to time translations, our method can help unlock fNIRS for a variety of real-time tasks. Mobile brain imaging and mobile brain-machine interfacing can benefit from this to develop real-world neuroscience and practical human neural interfacing based on BOLD-like signals for the evaluation, assistance and rehabilitation of force generation, including the fusion of fNIRS with EEG signals.








I Introduction

Non-invasive neuroimaging techniques provide an accessible way to study the human cortex across a variety of tasks and open opportunities for decoding mental activity. In particular, for sensorimotor tasks that reflect ecologically valid real-world behaviour, these methods can help us evaluate, rehabilitate, replace or assist motor function after brain damage [ENRIQUEZGEPPERT20131].

We suggest that model-free and non-linear methods can be more suitable for capturing relevant hemodynamic response (HR) variability from functional near-infrared spectroscopy (fNIRS) during force generation. In particular, differences between force-generation tasks may explain why some tasks evoke spatially resolved HRs that reveal hemispherical lateralisation differences, a result often sought in motor control. For example, finger pinches in [Nambu2009a] or finger tapping in [trakoolwilaiwan2017convolutional] achieve lateralisation results, while other work using hand grips does not [shibuya2008quantification, Derosiere2014a, wriessnegger2017force]. These works have in common the use of linear techniques and feature engineering. For a given task, linear techniques first require computing an expected HR and then regressing out all the variability in the fNIRS measurements that does not fit it. Such linear techniques, like general linear models (GLM), have proven very useful in reducing these sources of noise [friston1994statistical, ye2009nirs]. Nonetheless, they require prior knowledge of the relationship between the task under study and the HR in order to compute accurate HR predictions that capture the task's variability. If the relationship of the HR with the task is unknown or inaccurately modelled, HR variability may end up being discarded as noise instead of being explained by the underlying neuronal process [faisal2008noise]. This limitation is particularly relevant in the study of motor control, where it has been observed that both the speed [kuboyama2005relationship] and the intensity [shibuya2008quantification, Nambu2009a, Derosiere2014a] of a motor execution modulate the HR, yet the analytical relationships are unknown.

To advance the neuroimaging capabilities of fNIRS in human force decoding, we set out to demonstrate how Deep Learning (DL) can improve the decoding and interpretation of the HR. Convolutional neural networks (CNN) were introduced for decoding brain signals in [walker2015deep] and have since been applied to fNIRS in [trakoolwilaiwan2017convolutional] to detect motor activity. However, that work neither studied force generation nor developed an architecture reflecting our understanding of cortical motor neuroscience. We present HemCNN, a DL approach based on CNNs, and use it to decode spatio-temporally specific cortical activations during force generation. We hypothesise that our HemCNN architecture is better than linear methods at processing the highly variable HRs evoked during fast hand grips.

II Methods and Materials

Fig. 1: (A) Grip force pose, (B) fNIRS recording arrangement, (C) averaged grip signals.

II-A Protocol and task

We use a unimanual hand-grip task and simultaneously record fNIRS signals from each participant during its execution. The task consists of periodic unimanual grips at an approximate rate of 1 Hz. The contraction target was set in the maximum voluntary contraction (MVC) range. Grip force was measured with a force transducer (Fig. 1A; PowerLab 4/25T, ADInstruments, Castle Hill, Australia), which was used to give visual feedback on the contraction. Trials lasted s and were followed by a randomised resting period, uniformly distributed between and s, so as to avoid phasic constructive interference of any systemic artefacts in the brain signals (Fig. 1C). Each trial was performed with either hand at different times, and participants performed trials with each hand. All participants were healthy and right-handed; handedness was confirmed with the Edinburgh inventory [oldfield1971assessment]. Imperial College Research Ethics Committee approved all procedures and all participants gave their written informed consent.

Brain fNIRS signals were recorded using an NIRScout system (NIRx Medizintechnik GmbH, Berlin). We used a total of 24 channels (10 sources and 8 detectors) sampled at Hz. The channels were laid out symmetrically around the C3 and C4 positions of the International 10-20 system, with an inter-optode distance of cm, covering the sensorimotor cortex. Continuous-wave near-infrared spectroscopy at two wavelengths ( nm and nm) was used to obtain optical absorption densities, which were transformed into oxy-hemoglobin (HbO) and deoxy-hemoglobin (HbR) concentrations using the modified Beer-Lambert law [cope1988methods]. HbO and HbR concentrations (either referred to generically as Hb) were low-pass filtered below Hz, linearly detrended over a s period and downsampled to Hz.
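As a rough illustration of this preprocessing chain, the sketch below low-pass filters, linearly detrends and downsamples a channels × samples array, and z-scores a trial over the whole channel set. All numeric parameters (sampling rate, cutoff, decimation factor) are hypothetical placeholders, not the study's values:

```python
import numpy as np
from scipy.signal import butter, filtfilt, decimate, detrend

def preprocess_fnirs(hb, fs_in=10.0, cutoff=0.2, decim=5):
    """Low-pass filter, linearly detrend, and downsample one Hb
    time-series (channels x samples). All numeric values here are
    hypothetical placeholders, not the paper's parameters."""
    b, a = butter(4, cutoff / (fs_in / 2), btype="low")    # 4th-order Butterworth
    filtered = filtfilt(b, a, hb, axis=-1)                 # zero-phase filtering
    detrended = detrend(filtered, axis=-1, type="linear")  # remove linear drift
    return decimate(detrended, decim, axis=-1, zero_phase=True)

def zscore_trial(trial):
    """Z-score a trial using the mean/std pooled over the whole
    channel set, as done per Hb type before feeding HemCNN."""
    return (trial - trial.mean()) / trial.std()
```

The zero-phase filtering and symmetric detrending keep the temporal alignment of the HR intact, which matters for the time-translation arguments made later.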

Finally, breathing was recorded as the chest diametrical expansion measured by a transducer (Hz sampling rate, PowerLab 4/25T, ADInstruments). This additional signal was used to control for possible hand-use induced systemic artefacts that could contaminate the fNIRS signals.

The data used for this work has been made publicly available in the HYGRiP dataset [ortega2020hygrip].

II-B Baseline classification methods

The following conventional methods are used to compare against and assess the performance of HemCNN. We extract (1) general linear model (GLM) features, (2) lateralisation indices (LI) and (3) concentration changes from the fNIRS signal. Each feature set is used to train an independent tree classifier following the same leave-one-subject-out strategy as HemCNN.

In particular, GLM with canonical HRs (gamma functions) is the current standard in fNIRS analysis for denoising [ye2009nirs]. After the GLM fit, each channel of each s trial segment is represented by a five-dimensional feature vector, each element of which corresponds to approximately one fifth of the segment's time-series. The five features therefore form a chronologically ordered abstract representation of the time-series, and the total dimension is the number of features times the number of HbO and HbR channels. We additionally used PCA to reduce this dimension in two hierarchical steps (GLM-hPCA): first across HbO/HbR pairs and, second, per hemisphere, which reduced the dimensionality further.
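A minimal sketch of GLM feature extraction of this kind, assuming a basis of time-shifted gamma-shaped regressors fitted by least squares (the basis choice and all its parameters are assumptions, not the paper's exact design matrix):

```python
import numpy as np
from scipy.stats import gamma

def gamma_hrf(t, peak=6.0, disp=1.0):
    """Canonical gamma-shaped HR (hypothetical parameterisation)."""
    return gamma.pdf(t, peak / disp, scale=disp)

def glm_features(y, fs=2.0, n_basis=5):
    """Regress one channel's trial segment onto n_basis time-shifted
    gamma regressors plus an intercept; the betas form the
    chronologically ordered feature vector. A sketch, not the paper's
    exact basis set."""
    t = np.arange(len(y)) / fs
    shifts = np.linspace(0.0, t[-1] * 0.6, n_basis)      # stagger the regressors
    X = np.stack([gamma_hrf(t - s) for s in shifts], axis=1)
    X = np.column_stack([X, np.ones(len(y))])            # intercept column
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[:n_basis]                                # drop the intercept
```

Because the regressors are ordered by their time shift, each beta summarises a successive portion of the segment, matching the "chronologically ordered" description above.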

LI are computed for each pair of symmetric channels, leading to a 24-dimensional feature vector per example. The sign of each LI indicates whether the right or the left hemisphere dominates.
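A common way to compute such an index is the normalised difference between symmetric channels; the sketch below assumes the standard (L − R)/(L + R) form and a per-channel activation value, neither of which is spelled out in the text:

```python
import numpy as np

def lateralisation_index(left, right, eps=1e-12):
    """LI per symmetric channel pair, assuming the common
    (L - R) / (L + R) form. Under this sign convention positive
    values indicate left-channel dominance; the paper's exact
    convention is not stated, so this is illustrative only."""
    L = np.asarray(left, dtype=float)
    R = np.asarray(right, dtype=float)
    return (L - R) / (L + R + eps)  # eps guards against division by zero
```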

II-C HemCNN design and training

CNNs optimise convolutional filters (CF) to extract features relevant for a machine-learning task, in our case classification. We designed the HemCNN architecture and its training to enhance hemispherical differences between the HRs corresponding to left- or right-hand generated force.

Specifically, the classifier corresponds to a mapping from the fNIRS input space to a two-dimensional output space. Each dimension of the output represents the logarithmic probability of an fNIRS example belonging to the left- or the right-hand generated force.

Constraints are introduced in the architecture in the form of CF parameters that only allow interaction in the Hb, time and channel dimensions of the fNIRS signals. In particular, each HemCNN output value represents a hand and corresponds to the ipsilateral hemisphere. Classification decisions are exclusively based on hemispherical differences. Furthermore, each hand activity can be easily traced back through the layers to each hemisphere.

Fig. 2 shows our HemCNN architecture design, with the fNIRS input, 4 CFs leading to 3 corresponding convolutional layers (CL), and one output layer with two units representing the two hands. Each CF convolves its input space, producing an output whose dimension is reduced depending on the filter size. We tailor the shapes of the filters and their strides to constrain the dimensions in which the CNN can find relationships: rows correspond to channels and columns to sampling time-steps. In a CF, the kernel is the set of weights the input is convolved with, and the stride is the number of steps the kernel shifts per row and column during the convolution. Table I lists the HemCNN kernel sizes and strides. In particular, our HemCNN implementation preserves the brain-hemispherical origin of information throughout all layers.
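The hemisphere-preserving idea can be illustrated with a minimal NumPy sketch in which the same temporal filters are applied to each hemisphere's channel block in isolation, so each output score is attributable to exactly one hemisphere. The kernel shapes, the ReLU non-linearity and the mean readout here are illustrative assumptions; the real filter geometry is given in Table I:

```python
import numpy as np

def conv1d_valid(x, k):
    """Valid-mode 1-D convolution of each row along the time axis."""
    return np.stack([np.convolve(row, k, mode="valid") for row in x])

def hemcnn_forward(x, kernels, n_left=12):
    """Sketch of the HemCNN constraint: identical temporal filters are
    applied to each hemisphere's channel block separately, so each
    output score traces back to one hemisphere only. Kernel shapes,
    ReLU and the mean readout are assumptions for illustration."""
    left, right = x[:n_left], x[n_left:]
    scores = []
    for hemi in (left, right):
        h = hemi
        for k in kernels:                            # stacked conv layers
            h = np.maximum(conv1d_valid(h, k), 0.0)  # ReLU non-linearity
        scores.append(h.mean())                      # one score per hemisphere
    z = np.array(scores)
    return z - np.log(np.exp(z).sum())               # log-softmax over hands
```

Because the two hemisphere blocks never mix before the readout, the classification decision can only rest on interhemispheric differences, which is the design property the text describes.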

We train the network with the cross-entropy loss and the Adam optimiser [kingma2014adam] with weight decay. Training batches contain left/right balanced shuffled examples. Training is iterated for a fixed number of epochs (i.e. passes over the augmented training set), with the learning rate starting at an initial value and decaying at a fixed rate per epoch.
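The balanced batching and per-epoch learning-rate decay can be sketched as follows; the batch size, initial rate and decay factor are placeholders, since the exact values are not given here:

```python
import numpy as np

def balanced_batches(labels, batch_size, rng):
    """Yield shuffled index batches with equal left (0) / right (1)
    counts. A sketch of the balanced-batch scheme; an even batch size
    is assumed."""
    left = rng.permutation(np.where(labels == 0)[0])
    right = rng.permutation(np.where(labels == 1)[0])
    half = batch_size // 2
    for i in range(0, min(len(left), len(right)) - half + 1, half):
        batch = np.concatenate([left[i:i + half], right[i:i + half]])
        yield rng.permutation(batch)  # shuffle within the batch

def lr_schedule(lr0, decay, epoch):
    """Multiplicative per-epoch learning-rate decay (values hypothetical)."""
    return lr0 * decay ** epoch
```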

During training, drop-out is used for regularisation [srivastava2014dropout]: zeroing masks are applied to the fNIRS inputs, affecting one randomly selected brain hemisphere with equal probability. This allows us to consistently obtain models that better reflect the contrast between the hemispherical activities for each hand.
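A minimal sketch of this hemisphere-zeroing mask, assuming the channel rows split evenly into left- and right-hemisphere blocks and a hypothetical masking probability:

```python
import numpy as np

def hemisphere_dropout(x, p=0.5, n_left=12, rng=None):
    """With probability p, zero out the channels of one hemisphere,
    chosen left or right with equal probability. The channel split
    (n_left) and p are illustrative assumptions."""
    if rng is None:
        rng = np.random.default_rng()
    out = x.copy()
    if rng.random() < p:
        if rng.random() < 0.5:
            out[:n_left] = 0.0   # mask left-hemisphere rows
        else:
            out[n_left:] = 0.0   # mask right-hemisphere rows
    return out
```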

The fNIRS time-series are z-scored per trial using the mean and standard deviation of the whole channel set per Hb type. In contrast to the conventional baseline approaches, the HRs are not baseline corrected. Finally, the training set is also fold-augmented by (1) randomly cropping s time windows from an initial s trial window starting at the go-cue, and (2) multiplying each trial by a random number drawn from a uniform distribution over a fixed range.
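The two augmentation steps can be sketched as one function; the crop length and the gain range are hypothetical placeholders for the elided values:

```python
import numpy as np

def augment_trial(trial, crop_len, rng, lo=0.9, hi=1.1):
    """One augmented example: a random temporal crop anywhere within
    the trial window, followed by a random uniform gain. crop_len and
    the (lo, hi) gain range are placeholders, not the paper's values."""
    start = rng.integers(0, trial.shape[-1] - crop_len + 1)  # random crop offset
    cropped = trial[..., start:start + crop_len]
    return cropped * rng.uniform(lo, hi)                     # amplitude jitter
```

Random cropping exposes the network to HRs at varying offsets relative to the go-cue, complementing the time-translation invariance of the convolutions.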

Fig. 2: HemCNN architecture. From left to right: channels in the input are arranged per row as consecutive HbO/HbR pairs. The upper half of the rows corresponds to the left hemisphere; the bottom half corresponds to the right hemisphere. Columns correspond to time steps. Each convolutional filter (CF) renders a deeper convolutional layer activation (CL) after convolution. Hand activities can be unequivocally traced back throughout all layers.
TABLE I: Kernel shape (rows × columns) and stride (rows × columns) per convolutional filter; total: 94.

We test HemCNN following a leave-one-subject-out (LOSO) approach. LOSO is a particularly demanding generalisation condition that requires the decoder to operate on subjects who rarely share the same anatomy and sensor placement as the training subjects [xiloyannis2015gaussian]. For each subject, balanced left/right-hand trials are used as the training set. For each left-out subject, part of the data from the remaining subjects is used for training and the rest for validation. Training and validation data are split randomly on every run, and several runs per left-out subject are executed, delivering different models per left-out subject. Each model has a different random initialisation, trains for a fixed number of epochs, and is trained on augmented examples. Training time was under min per subject on a GeForce RTX 2080. The model with the highest validation accuracy is selected, and the test trials of the left-out subject are used to produce all per-subject results.
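The LOSO splitting logic can be sketched as follows, with the validation fraction as an assumed placeholder:

```python
import numpy as np

def loso_splits(subject_ids, val_frac=0.2, rng=None):
    """Leave-one-subject-out splits: for each held-out subject, the
    remaining subjects' trials are shuffled and split into training
    and validation indices (val_frac is a hypothetical placeholder).
    Yields (train_idx, val_idx, test_idx) per held-out subject."""
    if rng is None:
        rng = np.random.default_rng()
    subject_ids = np.asarray(subject_ids)
    for test_subj in np.unique(subject_ids):
        rest = rng.permutation(np.where(subject_ids != test_subj)[0])
        n_val = int(len(rest) * val_frac)
        test_idx = np.where(subject_ids == test_subj)[0]
        yield rest[n_val:], rest[:n_val], test_idx
```

Note that the split is made at the trial level among the remaining subjects, while the held-out subject's trials are never seen during training or model selection.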

III Results

To evaluate the performance of HemCNN in resolving the HR in fNIRS, we used the data of the participants in HYGRiP performing a left/right hand-grip task; only the fNIRS signals are used in this study. HemCNN is trained to detect which hand (left or right) is being used from these HRs, as in a BCI decoding paradigm.

Fig. 3: Leave-one-subject-out test accuracy: median across subjects (left) and per subject (right), for the best validation model out of all trained models. Cold colours denote CNN architectures: HemCNN with HbO only (HemCNNo), HbR only (HemCNNr), or both inputs (HemCNN). Warm colours denote standard methods that extract features from HbO and HbR and classify them with a simple tree: general linear model (GLM) features, GLM with hierarchical PCA features (GLM-hPCA), concentration-change features and LI features.

We first measure the ability to distinguish between HRs evoked by left- and right-hand grips using the classification accuracy of the decoding methods. Figure 3 presents the hand test classification accuracy for HemCNN, GLM and GLM-hPCA, along with that of the traditional concentration-change and LI features. Except for HemCNN, all methods use a tree classifier on the respective features: GLM, GLM-hPCA, concentration changes or LI. The figure shows that the HemCNN methods outperform the GLM, GLM-hPCA and LI approaches (Kruskal-Wallis test, Tukey correction for pairwise comparisons). Using both HbO and HbR rather than only one of them does not lead to significant differences, although the absence of HbO in HemCNNr leads to a drop in accuracy (median accuracies for HemCNN, HemCNNo and HemCNNr). The non-HemCNN methods, except LI, show similar median performances; the addition of hierarchical PCA to GLM yields the highest performance among them (subject 1). LI performs at a random median accuracy level. The low performance of the non-HemCNN approaches shows that the classification rules they learn do not generalise well to unseen subjects. In contrast, HemCNN provides higher accuracies with stable generalisation across subjects, suggesting that our proposed method learns HR features that are physiologically shared across subjects. Moreover, the HemCNN design is particularly suited for interpretation: on the one hand, it allows the activity of each hand to be traced back through each layer to its hemispherical hemodynamic activity; on the other hand, each CF has an unambiguous signal-processing role inside the network. For example, CF1 convolves HbO/HbR pairs and CF4 convolves cross-channel activities. Note that a standard CNN with 3 fully connected layers and 4 convolutional filters per convolutional layer achieved a median accuracy that was not significantly different from that of HemCNN (Kruskal-Wallis, Tukey-corrected test). However, compared to HemCNN, such an architecture does not allow the activity of each output to be unequivocally traced back: these activities would be sparsely distributed across different filters and fully connected units, with no guarantee that the classification is based exclusively on interhemispheric differences. We therefore did not include the standard CNN in the full analysis.

Finally, when the breathing signal was used with an architecture similar to HemCNN to decode the hand used, the resulting accuracy was at chance level. The inability of the breathing signal to decode which hand was used shows that this systemic artefact does not carry hand-related information during the task and is unlikely to inform the fNIRS-based decoding of HemCNN.

IV Discussion

We presented a novel DL approach, HemCNN, for decoding and neural data analysis. Our left/right hand-grip task evokes highly variable and non-stationary hemodynamic responses [ortega2020hygrip]. HemCNN outperformed currently used methods, especially GLM, at decoding the hand used; in particular, it detected the hand used in our task based only on hemispherical differences.

Previous studies using a similar task either could not resolve these differences with linear models or feature engineering [Derosiere2014a], or required more than s of stimulation to detect them [shibuya2008quantification]. In contrast, we did not make any assumptions about the HR features present during force generation, and developed a CNN for fNIRS time-series decoding, HemCNN. In particular, HemCNN performs significantly better (Fig. 3) than GLM features [ye2009nirs] and other conventionally engineered features (lateralisation indices and hemoglobin concentration changes).

Our analysis demonstrates that differences in fNIRS signals exist across hands but require appropriate feature learning to be detected. Furthermore, current understanding tends towards both hemispheres acting together to produce complex unimanual and bimanual movements [haar2017effector, Ames2019]. We show here that lateralised temporal brain-activity patterns can be exploited for decoding fast, lateralised hand grips from fNIRS, using a light-weight data-driven method that simultaneously optimises the localisation of hemodynamic response transients, their time alignment and their channel relationships through the introduction of non-linearities.

Our results on fNIRS force decoding bring us closer to the range of ability of machine-learning-based force decoding from peripheral sensory signals [gavriel2014comparison, fara2014prediction].

V Conclusion

In this work, we introduced an interpretable CNN based method to temporally and spatially resolve highly variable cortical hemodynamic responses during a demanding motor task. In particular, HemCNN can be directly applied in real-time analysis since: the convolutional operator is invariant to time translations, the differences are found across time-samples and channels, and there is no strong preprocessing other than the z-scoring of the fNIRS example. This makes it appropriate not only for offline neuroimaging but also for BCI paradigms.


We thank Tong Zhao for support in data recording, and EPSRC for financial support through the HiPEDS CDT (EP/L016796/1) and an EPSRC capital equipment grant.