Know Your Mind: Adaptive Brain Signal Classification with Reinforced Attentive Convolutional Neural Networks

02/12/2018 ∙ by Xiang Zhang, et al. ∙ UNSW, Tsinghua University

Electroencephalography (EEG) signals reflect activities in certain brain areas. Effective classification of time-varying EEG signals is still challenging. First, EEG signal processing and feature engineering are time-consuming and rely heavily on expert knowledge. In addition, most existing studies focus on domain-specific classification algorithms, which may not be applicable to other domains. Moreover, the EEG signal usually has a low signal-to-noise ratio and can be easily corrupted. In this regard, we propose a generic EEG signal classification framework that accommodates a wide range of applications to address the aforementioned issues. The proposed framework develops a reinforced selective attention model to automatically choose the distinctive information among the raw EEG signals. A convolutional mapping operation is employed to dynamically transform the selected information to an over-complete feature space, wherein the implicit spatial dependency of the EEG sample distribution can be uncovered. We demonstrate the effectiveness of the proposed framework using three representative scenarios: intention recognition with motor imagery EEG, person identification, and neurological diagnosis. Three widely used public datasets and a local dataset are used for our evaluation. The experiments show that our framework outperforms the state-of-the-art baselines and achieves an accuracy of more than 97% on all the datasets, with low latency and good resilience in handling complex EEG signals across various domains. These results confirm the suitability of the proposed generic approach for a range of problems in the realm of Brain-Computer Interface applications.




1. Introduction

EEG (Electroencephalography) is an electro-physiological monitoring indicator used to analyze brain states and activities by measuring the voltage fluctuations of ionic current within the neurons of the brain. In practice, EEG signals can be collected by portable, off-the-shelf equipment in a non-invasive and non-stationary way (Adeli et al., 2007). EEG signal classification algorithms have been widely studied for a range of real-world applications (Zhang et al., 2018b). An accurate and robust EEG classification model is of practical value to many applications such as intention recognition, person identification, and neurological diagnosis. Intention recognition based Brain-Computer Interface (BCI) systems (Bai et al., 2017; Vallabhaneni et al., 2005) provide a novel bridge between the human mind and the outer world and have recently been used in assisted living (Zhang et al., 2018a), smart homes (Zhang et al., 2017), and the entertainment industry (Russoniello et al., 2009); EEG based person identification empowers security systems deployed in banks or customs (Schetinin et al., 2017); EEG based neurological diagnosis can be used to detect organic brain injury and abnormal synchronous neuronal activity such as epileptic seizures (Veeriah et al., 2015; Acar et al., 2007).

The classification of EEG signals faces several challenges. First, most existing EEG classification studies employ various data preprocessing and feature extraction methods (e.g., filtering, Discrete Wavelet Transformation, and feature selection) (Zhang et al., 2018b; Russoniello et al., 2009). Nevertheless, all of these methods are time-consuming and depend heavily on expertise. Meanwhile, hand-crafted features require extensive experiments over settings such as filtering bands and wavelet orders before they generalize well. Therefore, a method that works directly on raw EEG data is necessary.

Second, current work focuses on the classification of EEG segments instead of EEG samples. A segment contains a number of continuous EEG samples and is generally clipped by a sliding window (Fraschini et al., 2015). The features for classification are measured on each segment instead of each sample (Zhang et al., 2018b). However, segment based classification has two drawbacks compared to sample based classification: 1) segment based classification requires a larger training data scale and a longer data-collecting time. For example, if each segment has 10 samples without overlapping, then for the same training batch size, segment based classification requires 10 times the data size and data-collecting time of sample based classification. 2) In a segment with a number of samples, the sample diversity may be offset by other inversely changed samples, since the EEG signal varies rapidly (Section 2). As a result, segment based classification cannot capture immediate intention changes and thus achieves low precision in practical deployment. To this end, sample based classification is more attractive.

Figure 1. EEG topography with continuous samples. Panels: (a) T-2, (b) T-1, (c) T, (d) T+1, (e) T+2. The interval among samples is 0.00625 second.

Figure 2. EEG characteristics analysis on the frequency domain, the time domain, the correlation, and the difference percentage comparison. Panels: (a) Frequency Domain Comparison, (b) Time Domain STD, (c) Time Domain Range, (d) Correlation Coefficient STD, (e) Correlation Coefficient Range, (f) Difference Percentage. STD denotes standard deviation.

Besides, most current EEG classification methods are designed based on domain-specific knowledge and thus may become ineffective or even fail in different scenarios or with different data quality (Adeli et al., 2007). For example, an approach customized for neurological diagnosis may not work well on intention recognition. Therefore, a general EEG signal classification method is expected to be both efficient and robust across various domains.

Furthermore, compared to other sensor signals and images, the EEG signal is less informative. The EEG signal has a low signal-to-noise ratio and is easily corrupted by subjective and objective factors (Zhang et al., 2018a). This lack of information directly affects the classification accuracy. Hence, a more effective classifier is required to extract discriminative features from such limited information.

To address the aforementioned issues, we propose a novel framework for EEG signal classification. The proposed framework works directly on the raw EEG signal and exploits the intra-sample information of each single EEG sample. We design a reinforced selective attention model that combines the benefits of the attention mechanism and deep reinforcement learning to automatically grasp the distinctive information for different EEG application circumstances. A convolutional mapping method is employed to map the selected low-dimensional features to an over-complete feature space. The higher feature dimension makes it easier for the classifier to detect a distribution that accurately describes the EEG samples. The main contributions of this paper are highlighted as follows:

  • We propose a general EEG classification framework for diverse application scenarios including intention recognition, person identification, and neurological diagnosis.

  • We design the reinforced selective attention model, which combines deep reinforcement learning and the attention mechanism, to automatically extract robust and distinct features from various EEG scenarios. We design a non-linear reward function to encourage the model to select the best attention area, i.e., the one that leads to the highest classification accuracy. Besides, we customize the states and actions based on our EEG classification environment.

  • We develop a convolutional mapping method to explore the distinguishable spatial dependency among the selected EEG signals; this representation is then fed to the classifier.

  • We demonstrate the effectiveness of the proposed framework using four EEG datasets covering three representative and challenging BCI application scenarios. The experiment results illustrate that the proposed framework outperforms all the state-of-the-art baselines and consistently achieves an accuracy of more than 97% with low latency on all the datasets.

Note that all the necessary reusable code and datasets have been open-sourced for reproduction; please refer to this link (footnote 1).

2. Analysis of EEG Signals

To gain a better understanding of EEG data, we analyze the unique characteristics of the EEG signal. Intuitively, the EEG signal measures the voltage fluctuations of the microcurrent produced by brain neurons, which is easily affected by subjective and objective factors such as emotion, fatigue, and physical movement. The extreme artifacts caused by these factors lead to a low signal-to-noise ratio. Also, the EEG signal is less informative. Brain activity is very complex and varies rapidly, but the EEG signal can only capture a little information through the discrete sampling of biological current. The rapidly varying and complex nature of EEG signals is illustrated by Figure 1, which provides the EEG topography of 5 continuous samples. The sampling rate is 160 Hz, so the sampling interval is 1/160 = 0.00625 second. It can be clearly observed that the topography changes dramatically within such a tiny time interval.

To make this more concrete, we quantitatively analyze the unique characteristics of EEG signals by comparing them with two typical sensor signals, collected by a smartphone and a wearable sensor, in four aspects: the frequency domain, the time domain, the samples' correlations, and the Difference Percentage (DP). The analysis in the frequency domain and the time domain investigates how chaotic the signals are at a high level, which also indicates how strongly the data are corrupted by noise from a macroscopic perspective. The analysis of correlations and difference percentage investigates the EEG signal characteristics at the microscopic level, such as how the EEG data fluctuate in a very short interval and how an EEG sample correlates with its neighboring samples.

Frequency Domain.

Fast Fourier transform (FFT) based frequency domain analysis is a commonly used method for analyzing signal characteristics. In this section, we evaluate the signals on four aspects: the Shannon information entropy (Strait and Dewey, 1996), the frequency standard deviation (STD), the frequency range, and the signal bandwidth. The Shannon information entropy refers to the average amount of information produced by a stochastic source of signals. The higher the information entropy, the more chaotic the signals are and the more difficult they are to classify. The frequency STD and the frequency range are two metrics that assess the dispersion and fluctuation of the signals in the frequency domain. The bandwidth denotes the width of the active frequency band. In this part, we define the active frequency band as the frequencies with an amplitude larger than 0.002. A narrower bandwidth indicates a more centralized frequency distribution, and therefore a higher signal quality and a lower classification difficulty.
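The four frequency-domain metrics can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the entropy is taken over the normalized one-sided amplitude spectrum, that the STD and range are computed over spectral amplitudes, and that the bandwidth is the span between the lowest and highest frequency bins whose amplitude exceeds the 0.002 threshold. A naive DFT stands in for the FFT to keep the sketch free of dependencies; the function names are hypothetical.

```python
import cmath
import math


def amplitude_spectrum(signal):
    """One-sided amplitude spectrum from a naive DFT, scaled by the signal length."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        spectrum.append(abs(coeff) / n)
    return spectrum


def frequency_metrics(signal, sampling_rate, threshold=0.002):
    """Entropy, STD, range, and bandwidth of the amplitude spectrum (assumed definitions)."""
    amps = amplitude_spectrum(signal)
    total = sum(amps)
    # Treat the normalized spectrum as a probability distribution (assumption).
    probs = [a / total for a in amps if a > 0] if total > 0 else []
    entropy = -sum(p * math.log2(p) for p in probs)
    mean = total / len(amps)
    std = math.sqrt(sum((a - mean) ** 2 for a in amps) / len(amps))
    # Active band: bins whose amplitude exceeds the threshold from the text.
    active = [k for k, a in enumerate(amps) if a > threshold]
    bin_width = sampling_rate / len(signal)  # frequency resolution in Hz
    bandwidth = (active[-1] - active[0]) * bin_width if active else 0.0
    return {
        "entropy": entropy,
        "std": std,
        "range": max(amps) - min(amps),
        "bandwidth": bandwidth,
    }
```

Applied to one channel of each signal type, a higher entropy or a wider bandwidth under these definitions would correspond to the "more chaotic" reading described above.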

Time Domain. In the time domain, we evaluate the sensor signals on three levels of sample window: 5, 50, and 500 samples. Since the sampling rates of three signals all range in Hz, samples in the three levels of sample windows are collected in seconds, seconds, and seconds, respectively. The evaluation on three levels is expected to show how the EEG characteristics vary with the sampling period. For each level, the standard deviation and range are calculated.
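A short sketch of the windowed time-domain statistics is given below. How the per-window values are aggregated is not stated here, so averaging over consecutive non-overlapping windows is an assumption, and `windowed_std_and_range` is a hypothetical helper name.

```python
import statistics


def windowed_std_and_range(values, window):
    """Mean STD and mean range over consecutive non-overlapping windows.

    Averaging across windows is an assumption made for this sketch.
    Requires len(values) >= window.
    """
    stds, ranges = [], []
    for start in range(0, len(values) - window + 1, window):
        chunk = values[start:start + window]
        stds.append(statistics.pstdev(chunk))      # population standard deviation
        ranges.append(max(chunk) - min(chunk))     # peak-to-peak range
    return statistics.mean(stds), statistics.mean(ranges)
```

Calling the helper with `window` set to 5, 50, and 500 on the same channel gives the three levels discussed above.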

Correlation Coefficient. The correlation coefficient is the average cosine similarity between a specific sample and its neighboring samples. Similarly, the correlation is evaluated on three sample window levels: 5 samples, 50 samples, and 500 samples. The correlation coefficient measures the degree of an EEG sample's fluctuation at a microscopic level. A low correlation coefficient indicates that the EEG signal varies dramatically and rapidly all the time. The STD and range are calculated for each level as well.
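The neighbor correlation can be illustrated with a small sketch. It assumes that each sample is a vector of channel values and that the window is centered on the sample of interest; both the centering and the helper names are assumptions for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equally long vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def neighbor_correlation(samples, index, window):
    """Average cosine similarity between samples[index] and its neighbors in the window."""
    half = window // 2
    lo = max(0, index - half)
    hi = min(len(samples), index + half + 1)
    sims = [cosine_similarity(samples[index], samples[j])
            for j in range(lo, hi) if j != index]
    return sum(sims) / len(sims) if sims else 0.0
```

The STD and range of this value across many positions `index` would then summarize how unstable the correlation is at each window level.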

Difference Percentage. The difference percentage evaluates the discrepancy between EEG signals and other sensor signals. Four metrics are measured: the STD in the time domain (T-STD), the range in the time domain (T-Range), the STD in the correlation coefficient (C-STD), and the range in the correlation coefficient (C-Range). For each metric, the DP is calculated separately for the three sample window levels (5 samples, 50 samples, 500 samples), from the values of that metric on the EEG, smartphone, and wearable sensor signals.

Figure 3. Flowchart of the proposed approach. The input raw EEG single sample (K denotes the K-th element) is replicated and shuffled to provide more latent spatial combinations of feature dimensions. Then, an attention zone, which is a fragment of the shuffled sample, is selected according to the state $s_t$. The selected attention zone is input to the state transition and the reward model. In each step $t$, one action $a_t$ is selected by the state transition to update $s_t$ based on the agent's feedback. The reward model evaluates the quality of the attention zone by the reward score $r_t$. The dueling DQN is employed to discover the best attention zone, which is fed into the convolutional mapping procedure to extract the spatial dependency representation. The represented features are used for classification. FCL denotes Fully Connected Layer. The reward model is the combination of the convolutional mapping and the classifier.

Results. The EEG characteristics analysis results are presented in Figure 2. Figure 2(a) provides the frequency domain comparison on Shannon information entropy, STD, range, and bandwidth. In all four aspects, the values for EEG signals are higher than those for the smartphone and wearable sensor signals. These observations show that, compared to the signals of other typical sensors, EEG signals are more chaotic in the frequency domain and distributed over a wider frequency band. Figures 2(b) and 2(c) present the STD and range values in the time domain. The histograms show that the EEG signals are more disordered in the time domain. From Figures 2(d) and 2(e), we observe that the EEG signals have a higher correlation coefficient STD and range on all the sample window levels. This demonstrates that an EEG sample has more unstable correlations with its neighbors, and that the instability is very high even within the nearest 5 samples. More specifically, the EEG signal changes rapidly at every single sampling point. Finally, Figure 2(f) shows that, for all the metrics, the window with fewer samples has a higher difference percentage. In the time domain, the DP in the 5-sample window (250%) is dramatically higher than in the other windows. It can be inferred that in long sample windows (e.g., 50 or 500), which include a stack of samples, the sample diversity is offset by other inversely changed samples; in contrast, the sample fluctuation is much sharper and more intense in the short sample window because it contains only a limited number of samples. These phenomena reveal that a single EEG sample may correspond to a different intention and may need to be classified into a different category from its neighboring samples. For example, intuitively, a subject can switch from one thought to another within 0.01 second. This requires an algorithm that captures and recognizes the subject's intention instantaneously, based on the EEG signal collected in a very short time.

In summary, we draw several conclusions from the above analysis: 1) the EEG signals are more chaotic and more difficult to be classified than other sensor signals; 2) the EEG signals vary more rapidly than other sensor signals and therefore require a quicker classification algorithm to support BCI applications.

3. Proposed Method

Based on the above analysis, we propose reinforced attentive convolutional neural networks to directly classify the raw EEG signal accurately and efficiently. The overall workflow is shown in Figure 3. It contains several components: 1) the replicate and shuffle processing; 2) the reinforced selective attention model; and 3) the convolutional mapping. In the following, we will first discuss the motivations of the proposed method and then introduce the aforementioned components in detail.

Addressing the drawbacks mentioned in Section 1, the proposed approach: 1) automatically focuses on the valuable dimensions (channels) of different EEG datasets by the reinforced selective attention model; 2) performs classification based on a single EEG sample; 3) develops a convolutional mapping method to exploit the latent spatial dependency among EEG signal dimensions.

First, in order to provide as much information as possible, we design an approach to exploit the spatial relationships among EEG signals. The signals belonging to different brain activities are supposed to have different spatial dependency relationships. We replicate and shuffle the input EEG signals dimension-wise (Section 3.1). With this method, all the possible dimension arrangements are equally likely to appear.

Second, inspired by the fact that the optimal spatial relationship only depends on a subset of feature dimensions, we introduce an attention zone to focus on a fragment of feature dimensions. Here, the attention zone is optimized by deep reinforcement learning, which has been shown to be stable and to perform well in policy learning (Section 3.2).

Third, we develop a deep CNN structure to learn the over-complete sparse feature representation as well as the distinctive spatial dependency (Section 3.3).

3.1. Replicate and Shuffle

Suppose the input raw EEG data are denoted by $X = \{x_i, i = 1, 2, \dots, I\}$, where $x_i$ denotes a single EEG sample and $I$ denotes the number of samples. In each sample, the feature $x_i \in \mathbb{R}^{K}$ contains $K$ elements corresponding to $K$ EEG channels. $x_{ik}$ denotes the $k$-th dimension value in the $i$-th sample.

In real-world collection scenarios, the EEG data are generally concatenated following the distribution of the biomedical EEG channels. However, the biomedical dimension order may not present the best spatial dependency. Exhaustively enumerating all the possible dimension arrangements is too computationally expensive. For example, a 64-channel EEG sample has $64! \approx 1.27 \times 10^{89}$ arrangements, which is an astronomical figure.

To provide more potential dimension combinations, we propose a method called Replicate and Shuffle (RS). RS is a two-step mapping method which maps $x_i$ to a higher dimensional vector $x'_i$ of length $K'$ with more complete element combinations.

In the first step (Replicate), $x_i$ is replicated $h = (K' - K' \% K)/K + 1$ times, where $\%$ denotes the remainder operation. We then get a new vector with length $h \cdot K$, which is not less than $K'$. In the second step (Shuffle), we randomly shuffle the replicated vector from the first step and intercept the first $K'$ elements to generate $x'_i$. Theoretically, compared to $x_i$, $x'_i$ contains more diverse dimension combinations.
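The two RS steps can be sketched as below. This is a hedged illustration rather than the released implementation: the names `rs_indices` and `replicate_and_shuffle` are hypothetical, the replication count is written as floor(target length / channel count) + 1 using the remainder operation mentioned above, and the sketch assumes that one shuffled index order is drawn once and reused for every sample so that each output position always refers to the same channel.

```python
import random


def rs_indices(num_channels, target_length, seed=0):
    """Draw one replicated-and-shuffled channel order of length target_length."""
    # Replicate: enough copies so that the replicated length is not less than target_length.
    repeats = (target_length - target_length % num_channels) // num_channels + 1
    indices = list(range(num_channels)) * repeats
    # Shuffle: a seeded generator keeps the order reproducible (assumption).
    random.Random(seed).shuffle(indices)
    # Intercept the first target_length positions.
    return indices[:target_length]


def replicate_and_shuffle(sample, indices):
    """Apply a fixed RS index order to one EEG sample (a sequence of channel values)."""
    return [sample[i] for i in indices]
```

For instance, `replicate_and_shuffle(sample, rs_indices(len(sample), target_length))` maps a sample to the longer vector; the target length itself is a free parameter of the sketch.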

3.2. Reinforced Selective Attention

In the next process, we aim to detect the optimal dimension combination, which includes the most distinctive spatial dependency among EEG signals. Since $K'$, the length of the shuffled sample $x'_i$, is too large and makes the search computationally expensive, we introduce the attention mechanism (Cavanagh et al., 1992) to balance the length and the information content. We attempt to emphasize the informative fragment in $x'_i$ and denote the fragment by $\bar{x}_i$, which is called the attention zone. Let $\bar{K}$ denote the length of the attention zone. We employ deep reinforcement learning to discover the best attention zone (Mnih et al., 2015).

As shown in Figure 3, the detection of the best attention zone includes two key components: the environment (including the state transition and the reward model) and the agent. Three elements (the state $s_t$, the action $a_t$, and the reward $r_t$) are exchanged in the interaction between the environment and the agent. All three elements are customized for the specific situation in this paper. Next, we introduce the design of the crucial components of our deep reinforcement learning structure:

  • The state $s_t$ describes the position of the attention zone, where $t$ denotes the time stamp. In training, the initial state is set to a predefined attention zone position. Since the attention zone is a shifting fragment on the 1-D shuffled sample, we design two parameters to define the state: the start index and the end index of the attention zone (for a given shuffled sample, these two indices are sufficient to determine the attention zone).

  • The action $a_t$ describes which action the agent can choose to act on the environment. At time stamp $t$, the state transition chooses one action to implement following the agent's policy $\pi$. In our case, we define 4 categories of actions (Figure 4) for the attention zone: left shifting, right shifting, extend, and condense. For each action, the attention zone moves a random distance that is bounded by an upper boundary. For the left shifting and right shifting actions, the attention zone shifts left-ward or right-ward by this step; for the extend and condense actions, both the start index and the end index move by this step. Finally, if the start index or the end index goes beyond the boundary, a clip operation is conducted. For example, if the start index becomes lower than the lower boundary, we clip the start index to the lower boundary.

  • The reward $r_t$ is calculated by the reward model, which will be detailed later. The reward model receives the current state and returns an evaluation as the reward.

Figure 4. Four actions in the state transition. The dashed line indicates the attention zone before the action while the solid line with shadow indicates the position of the attention zone after the action.
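The four actions and the clip operation can be sketched as a small state-transition function. Several details are assumptions made for illustration: the state is a `(start, end)` pair with an exclusive end index, the lower boundary is 0, the upper boundary is the vector length, and an action that would leave the zone shorter than `min_length` is ignored. The names `ACTIONS`, `sample_step`, and `transition` are hypothetical.

```python
import random

ACTIONS = ("left", "right", "extend", "condense")


def sample_step(max_step, rng):
    """Random moving distance between 1 and the upper boundary max_step."""
    return rng.randint(1, max_step)


def transition(state, action, step, vector_length, min_length=10):
    """Apply one action to the attention zone and clip it to [0, vector_length]."""
    start, end = state
    if action == "left":
        start, end = start - step, end - step
    elif action == "right":
        start, end = start + step, end + step
    elif action == "extend":
        start, end = start - step, end + step
    elif action == "condense":
        start, end = start + step, end - step
    else:
        raise ValueError(f"unknown action: {action}")
    # Clip operation: keep both indices inside the shuffled vector.
    start = max(0, start)
    end = min(vector_length, end)
    # Assumption: reject moves that would make the zone too short.
    if end - start < min_length:
        return state
    return (start, end)
```

A call such as `transition((20, 40), "extend", sample_step(5, random.Random(0)), 100)` widens the zone on both sides by the sampled step.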

Reward Model. Next, we introduce in detail the design of the reward model. The purpose of the reward model is to evaluate how the current state impacts the classification performance. Intuitively, a state which leads to better classification performance should have a higher reward. In this paper, we set the reward model as a combination of the convolutional mapping and the classification (Section 3.3). In practical optimization, the higher the accuracy is, the more difficult it is to increase it further. For example, improving the accuracy at a higher level (e.g., from 90% to 100%) is much harder than at a lower level (e.g., from 50% to 60%). To encourage accuracy improvement at the higher level, we design a non-linear reward function of the classification accuracy $acc$ and the attention zone length.

The function contains two parts. The first part is a normalized exponential function with the accuracy $acc$ as the exponent; this part encourages the reinforcement learning algorithm to search for a better state $s_t$, which leads to a higher $acc$. The motivation for the exponential function is that the reward growth rate increases as the accuracy increases (for example, for the same accuracy increment of 10%, the increase from 90% to 100% earns a higher reward increment than the increase from 50% to 60%). The second part is a penalty factor on the attention zone length, weighted by a penalty coefficient, to keep the attention zone shorter.
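One functional form that matches this two-part description is sketched below. The exact formula is not reproduced here, so the normalization by e - 1, the penalty proportional to the zone length relative to the vector length, and the default value of `beta` are all assumptions chosen for illustration.

```python
import math


def reward(accuracy, zone_length, vector_length, beta=0.1):
    """Non-linear reward: exponential in accuracy minus a length penalty.

    accuracy is expected in [0, 1]; beta is a hypothetical penalty coefficient.
    """
    accuracy_term = math.exp(accuracy) / (math.e - 1)   # assumed normalization
    length_penalty = beta * zone_length / vector_length  # assumed penalty form
    return accuracy_term - length_penalty
```

Because the exponential is convex, the same 10% accuracy gain is worth more near 100% than near 50%, and for equal accuracy a shorter attention zone receives the higher reward.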

In summary, the aim of the deep reinforcement learning is to learn the optimal attention zone, which leads to the maximum reward. The total number of iterations of the selective mechanism is the product of the number of episodes and the number of steps (Wang et al., 2016). The ε-greedy method (Tokic, 2010) is employed in the state transition. For better convergence and quicker training, ε is gradually increased as the iterations proceed.
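A minimal sketch of this action selection follows. Since ε grows during training, the sketch interprets ε as the probability of taking the greedy action, which is an assumption; the linear increment and the cap `epsilon_max` are likewise hypothetical stand-ins for the schedule used in the paper.

```python
import random


def choose_action(q_values, epsilon, rng):
    """Pick the greedy action with probability epsilon, otherwise a random one."""
    if rng.random() < epsilon:
        return max(range(len(q_values)), key=lambda i: q_values[i])
    return rng.randrange(len(q_values))


def next_epsilon(epsilon, increment, epsilon_max=1.0):
    """Increase epsilon by a fixed increment, capped at epsilon_max (assumed schedule)."""
    return min(epsilon_max, epsilon + increment)
```

Early in training a small ε makes most choices exploratory; as ε approaches its cap, the agent mostly follows the action with the largest Q value.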

Policy. The Dueling DQN (Deep Q Network (Wang et al., 2016)) is employed as the optimization policy, which is able to learn the state-value function efficiently. The Dueling DQN learns the state value $V(s_t)$ and the advantage function $A(s_t, a_t)$ and combines them into the Q value $Q(s_t, a_t)$. The primary reason we employ a dueling DQN to uncover the best attention zone is that it updates all four Q values at every step, while other policies only update one Q value at each step.
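The combination of the two streams can be illustrated with the mean-subtracted aggregation commonly used with dueling networks. Whether the authors use exactly this form is an assumption, and `dueling_q_values` is a hypothetical helper.

```python
def dueling_q_values(state_value, advantages):
    """Combine a scalar state value with per-action advantages into Q values.

    Uses Q(s, a) = V(s) + A(s, a) - mean(A(s, .)), so every Q value
    depends on the shared state value.
    """
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]
```

With four actions, a single update to the state value shifts all four Q values at once, which is the property highlighted above.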

3.3. Convolutional Mapping

For each attention zone, we further exploit the potential spatial dependency of EEG signals. Since we focus on the single sample, the EEG sample only contains a numerical vector with very limited information and is easily corrupted by noise. To remedy this drawback, we attempt to map the single EEG sample from the original space to an over-complete sparse space by a CNN structure.

In order to extract as many potential spatial dependencies as possible, we employ a convolutional layer (Krizhevsky et al., 2012) with a number of filters to scan the learned attention zone. The ReLU non-linear activation function is applied to the convolutional outputs. The function of a pooling layer is to reduce the redundant information in the convolutional outputs in order to decrease the computational cost. In our case, we try to keep as much information as possible. Therefore, no pooling layer is employed in our method.

Overall, the convolutional mapping structure contains 5 layers (as shown in Figure 3): the input layer, which receives the learned attention zone; the convolutional layer; two fully connected layers; and the output layer. The one-hot ground truth is compared to the output layer to calculate the training loss. In the convolutional mapping optimization, norm-based regularization is adopted to prevent overfitting. The sigmoid activation function is used on the fully connected layers. The cross-entropy loss function is optimized by the AdamOptimizer algorithm. The output of the last fully connected layer is extracted as the represented features, which are fed into a lightweight nearest neighbor classifier.
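The convolutional step (zero padding, ReLU, no pooling) can be sketched in plain Python to show why the resulting representation is over-complete. The kernels below are hypothetical placeholders for learned filters, a stride of 1 is assumed, and the fully connected layers and the nearest neighbor classifier are omitted.

```python
def conv1d_same(zone, kernel):
    """1-D cross-correlation with zero padding so the output keeps the input length."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(zone) + [0.0] * (len(kernel) - 1 - pad)
    return [sum(w * padded[i + j] for j, w in enumerate(kernel))
            for i in range(len(zone))]


def relu(values):
    """Element-wise rectified linear activation."""
    return [max(0.0, v) for v in values]


def convolutional_mapping(zone, kernels):
    """Concatenate the ReLU feature maps of all filters; no pooling is applied."""
    features = []
    for kernel in kernels:
        features.extend(relu(conv1d_same(zone, kernel)))
    return features
```

With `n` filters and no pooling, an attention zone of length `L` is mapped to `n * L` features, so the feature dimension grows with the number of filters.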

Label | eegmmidb   | emotiv       | EEG-S     | TUH
0     | eye closed | eye closed   | Subject 1 | Normal
1     | left hand  | up arrow     | Subject 2 | Seizure
2     | right hand | down arrow   | Subject 3 | –
3     | both hands | left arrow   | Subject 4 | –
4     | both feet  | right arrow  | Subject 5 | –
5     | –          | central sign | Subject 6 | –
6     | –          | –            | Subject 7 | –
7     | –          | –            | Subject 8 | –
Table 1. Label notation on four datasets

4. Experiments

In this section, we evaluate the proposed approach on four datasets belonging to diverse application scenarios, with focuses on accuracy, latency, and resilience.

Scenarios | Datasets | Metrics   |        |        |        |        |        |        |        |        | Ours
MIR       | eegmmidb | Accuracy  | 0.5596 | 0.6996 | 0.5814 | 0.3043 | 0.5614 | 0.648  | 0.6786 | 0.99   | 0.9932
MIR       | eegmmidb | Precision | 0.5538 | 0.7311 | 0.6056 | 0.2897 | 0.5617 | 0.6952 | 0.8873 | 0.9904 | 0.9933
MIR       | eegmmidb | Recall    | 0.5596 | 0.6996 | 0.5814 | 0.3043 | 0.5614 | 0.6446 | 0.6127 | 0.9904 | 0.9932
MIR       | eegmmidb | F1-score  | 0.5396 | 0.6738 | 0.5813 | 0.2037 | 0.5526 | 0.6619 | 0.7128 | 0.9903 | 0.9932
MIR       | emotiv   | Accuracy  | 0.2569 | 0.8041 | 0.8539 | 0.2506 | 0.2595 | 0.2609 | 0.2521 | 0.725  | 0.9708
MIR       | emotiv   | Precision | 0.2737 | 0.8071 | 0.8563 | 0.2039 | 0.2761 | 0.2447 | 0.271  | 0.724  | 0.9708
MIR       | emotiv   | Recall    | 0.2569 | 0.8041 | 0.8539 | 0.2506 | 0.2595 | 0.2348 | 0.2696 | 0.7237 | 0.9708
MIR       | emotiv   | F1-score  | 0.2577 | 0.8048 | 0.8544 | 0.1557 | 0.2618 | 0.2354 | 0.2701 | 0.7238 | 0.9708
PI        | EEG-S    | Accuracy  | 0.6604 | 0.9619 | 0.9278 | 0.35   | 0.6681 | 0.9571 | 0.9821 | 0.998  | 0.9984
PI        | EEG-S    | Precision | 0.6551 | 0.9625 | 0.9336 | 0.3036 | 0.6779 | 0.9706 | 0.9858 | 0.998  | 0.9984
PI        | EEG-S    | Recall    | 0.6604 | 0.962  | 0.9279 | 0.35   | 0.6681 | 0.9705 | 0.9857 | 0.998  | 0.9984
PI        | EEG-S    | F1-score  | 0.6512 | 0.9621 | 0.9282 | 0.2877 | 0.668  | 0.9705 | 0.9857 | 0.998  | 0.9984
ND        | TUH      | Accuracy  | 0.7692 | 0.92   | 0.9192 | 0.5292 | 0.7675 | 0.6625 | 0.6625 | 0.9592 | 0.9975
ND        | TUH      | Precision | 0.7695 | 0.9206 | 0.923  | 0.7525 | 0.7675 | 0.6538 | 0.6985 | 0.9593 | 0.9975
ND        | TUH      | Recall    | 0.7692 | 0.92   | 0.9192 | 0.5292 | 0.7675 | 0.6417 | 0.6583 | 0.9592 | 0.9975
ND        | TUH      | F1-score  | 0.7692 | 0.9199 | 0.9188 | 0.3742 | 0.7675 | 0.6449 | 0.6685 | 0.9592 | 0.9975
Table 2. Comparison with baselines. The value columns list the Non-Deep Learning Baselines first, followed by the Deep Learning Baselines; the last column is our approach.
No. | Method                        | Acc
1   | (or Rashid and Ahmad, 2016)   | 0.92
2   | (Zhang et al., 2018a)         | 0.9545
3   | (Zhang et al., 2018b)         | 0.9831
4   | (Alomari et al., 2014a)       | 0.8724
5   | (Sita and Nair, 2013)         | 0.7497
6   | (Alomari et al., 2014b)       | 0.845
7   | (Shenoy et al., 2015)         | 0.8206
8   | (Szczuko, 2017)               | 0.8
9   | (Stefano Filho et al., 2017)  | 0.9
10  | (Pinheiro et al., 2016)       | 0.8505
11  | (Kim et al., 2016)            | 0.805
12  | Ours                          | 0.9932
13  | (Ma et al., 2015)             | 0.88
14  | (Yang and Deravi, 2014)       | 0.99
15  | (Rodrigues et al., 2016)      | 0.87
16  | (Fraschini et al., 2015)      | 0.956
17  | (Thomas and Vinod, 2016)      | 0.9831
18  | Ours                          | 0.9984
19  | (Ziyabari et al., 2017)       | 0.923
20  | (Harati et al., 2015)         | 0.9505
21  | (Zhang et al., 2018)          | 0.9951
22  | (Goodwin and Harabagiu, 2017) | 0.914
23  | (Golmohammadi et al., 2017b)  | 0.95
24  | Ours                          | 0.9975
Table 3. Comparison with the state-of-the-art approaches

4.1. Application Scenarios and Datasets

4.1.1. Application Scenarios

In this section, we evaluate our approach on various datasets in three applications of EEG-based Brain-Computer Interfaces.

Movement Intention Recognition (MIR). The EEG signal measures human brain activities; intuitively, different human intentions lead to diverse EEG patterns (Zhang et al., 2018a). Intention recognition plays a significant role in practical scenarios such as smart home, assisted living (Zhang et al., 2017), brain typing (Zhang et al., 2018a), and entertainment. For elderly people and people with disabilities who have impaired motor abilities, intent recognition can help them interact with external smart devices such as wheelchairs or service robots through real-time BCI systems. Besides, people who have lost the ability to speak may have a chance to express their thoughts with the help of accurate intention recognition technologies (e.g., brain typing). For healthy people, intent recognition can be used in video game playing and other daily living applications.

Person Identification (PI). EEG based biometric identification (Schetinin et al., 2017) is an emerging approach to person identification. EEG-based person identification is highly attack-resilient. It has the unique advantage of avoiding or alleviating the threat of being deceived, which other identification techniques often face. This technique can be deployed in identification and authentication scenarios such as bank security systems and customs security checks.

Neurological Diagnosis (ND). EEG signals collected in an unhealthy state differ significantly from those collected in the normal state with respect to the frequency and pattern of neuronal firing (Adeli et al., 2007). Therefore, the EEG signal has been used for neurological diagnosis for decades. For example, the epileptic seizure is a common brain disorder that affects about 1% of the population, and its ictal state can be detected by EEG analysis of the patient.

4.1.2. Datasets

To evaluate how the proposed approach works in aforementioned application scenarios, we choose four EEG datasets with various collection equipment, sampling rate, and data source. For intention recognition task555In this paper, we use motor imagery EEG signal for intention recognition., we use a public dataset eegmmidb and a local collected dataset emotiv; for person identification task, we choose EEG-S dataset while the TUH dataset is utilized for neurological diagnosis.

eegmmidb. EEG motor movement/imagery database (eegmmidb)666 is collected by the BCI200 EEG system which records the brain signals using 64 channels with a sampling rate of 160Hz. EEG signal is recorded when the subject is imaging certain actions (without any physical action). In our dataset, the 560,000 samples belonging to 5 different labels and 20 subjects are selected with each subject having 28,000 samples. Each sample is a vector of 64 elements, each of which corresponds to one channel of the EEG data.

emotiv. This dataset is a local dataset collected by our group using EMOTIV Epoc+ EEG headset with 14 channels. The sampling rate is set as 128 Hz. During the experiment, the subject wearing the headset and imaging actions without physical movement. The experiment is carried on by 7 subjects (4 males and 5 females) aged from 24 to 30. This dataset contains 241,920 samples, belonging to 6 categories, with 34,560 samples for each subject. The signals in eegmmidb and emotiv are manually labeled according to the distinct action. The intention recognition is person dependent.

EEG-S. This is a subset of eegmmidb. Its recording device has 64 channels and a sampling rate of 160 Hz. The data are gathered while the subject keeps their eyes closed and stays relaxed. 8 subjects are involved, and each subject contributes 7,000 samples. The label in this dataset is the subject's ID.

TUH. TUH (Golmohammadi et al., 2017a) is a neurological seizure dataset of clinical EEG recordings. The EEG recording is associated with 22 channels from a 10/20 configuration. The sampling rate is set as 250 Hz. We select 12,000 samples from each of 5 subjects (2 males and 3 females). Half of the samples are labeled as epileptic seizure state. The remaining samples are labeled as normal state.
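The dataset figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below is only a bookkeeping aid: the dictionary layout and function names are illustrative, and the recording duration assumes that each sample is one consecutive time point at the stated sampling rate.

```python
# Dataset figures as quoted in the text; the dictionary layout is illustrative.
DATASETS = {
    "eegmmidb": {"channels": 64, "rate_hz": 160, "subjects": 20, "per_subject": 28_000},
    "emotiv":   {"channels": 14, "rate_hz": 128, "subjects": 7,  "per_subject": 34_560},
    "EEG-S":    {"channels": 64, "rate_hz": 160, "subjects": 8,  "per_subject": 7_000},
    "TUH":      {"channels": 22, "rate_hz": 250, "subjects": 5,  "per_subject": 12_000},
}


def total_samples(name):
    """Total number of samples: subjects times samples per subject."""
    d = DATASETS[name]
    return d["subjects"] * d["per_subject"]


def seconds_per_subject(name):
    """Recording length per subject, assuming one sample per time point."""
    d = DATASETS[name]
    return d["per_subject"] / d["rate_hz"]


for name in DATASETS:
    print(name, total_samples(name), "samples,",
          seconds_per_subject(name), "s per subject")
```

The totals for eegmmidb and emotiv reproduce the 560,000 and 241,920 samples stated in the dataset descriptions.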

4.1.3. Parameter Settings

Each sample is one vector recorded at a single time point. The default settings of our approach are as follows. In the selective attention learning, the Dueling DQN has 4 layers with the following node numbers: 2 (input layer), 32 (FCL), 4 (advantage, A) + 1 (value, V), and 4 (output). The remaining settings are: decay parameter […]; […]; learning rate […]; memory size […]; length penalty coefficient […]; and the minimum length of the attention zone is set to 10. In the convolutional mapping, the number of nodes in the input layer equals the number of attention zone dimensions. In the convolutional layer, the stride has shape […], the filter size is […], the depth is 10, and the non-linear function is ReLU. Zero-padding is used, and no pooling layer is adopted. The following fully connected layer has 100 nodes. The learning rate is […], while the […]-norm coefficient equals 0.001. The transformation is trained for 2000 iterations. In addition, the key parameters of the baselines are as follows: Linear SVM ([…]), Random Forest (RF, […]), KNN ([…]). In LSTM (Long Short-Term Memory) and GRU (Gated Recurrent Unit), […]; other settings are the same as in (Zhang et al., 2017). The CNN has the same structure and hyper-parameter settings as the convolutional mapping component of the proposed approach.
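To make the layer sizes of the Dueling DQN concrete, the following is a minimal pure-Python sketch of a forward pass with the stated widths (2 inputs, 32 hidden nodes, a 4-way stream plus a 1-way stream, 4 outputs). Several things here are assumptions rather than details from the paper: the 4 + 1 split is read as advantage and value streams, the streams are combined with the mean-subtracted aggregation of Wang et al. (2016), the hidden layer uses ReLU, and the weights and the 2-element state are random placeholders.

```python
import random


def dense(x, weights, bias):
    """Fully connected layer: one output per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def init_layer(n_in, n_out, rng):
    """Random placeholder weights (not trained) and zero biases."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def dueling_q_values(state, params):
    """Q(s, a) = V(s) + A(s, a) - mean over a' of A(s, a')."""
    hidden = relu(dense(state, *params["fc"]))        # 2 -> 32 (assumed ReLU)
    value = dense(hidden, *params["value"])[0]        # 32 -> 1
    advantage = dense(hidden, *params["advantage"])   # 32 -> 4
    mean_adv = sum(advantage) / len(advantage)
    return [value + a - mean_adv for a in advantage]  # 4 Q-values


rng = random.Random(0)
params = {
    "fc": init_layer(2, 32, rng),
    "value": init_layer(32, 1, rng),
    "advantage": init_layer(32, 4, rng),
}
q = dueling_q_values([0.2, 0.6], params)  # hypothetical 2-element state
best_action = max(range(len(q)), key=q.__getitem__)
print(q, best_action)
```

Because the mean advantage is subtracted, the average of the four Q-values equals the value estimate, which is the property that makes the two streams identifiable.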

Figure 5. Confusion matrices (CM) and ROC curves with AUC scores for each dataset. Panels (a) to (d): CM of eegmmidb, emotiv, EEG-S, and TUH. Panels (e) to (h): ROC of eegmmidb, emotiv, EEG-S, and TUH. Note that the x-axis of each ROC curve is the log of the False Positive Rate.

4.2. Results

4.2.1. Approach Comparison

To measure the accuracy of the proposed method, we compare it with a set of baseline methods including 5 non-deep-learning baselines and 3 deep-learning-based baselines. Furthermore, we choose a number of competitive state-of-the-art algorithms for each task separately.

Table 2 presents the comparison of classification metrics between our approach and the baselines (including Non-DL and DL baselines), where DL represents deep learning, AdaB denotes Adaptive Boosting, and LDA denotes Linear Discriminant Analysis. The results in Table 2 show that our approach achieves the highest accuracy on all the datasets. The confusion matrix and the ROC curves (including the AUC scores) of each dataset are reported in Figure 5. In addition, to further evaluate the performance of the proposed approach, we compare our framework with a few state-of-the-art methods that use the same datasets as we do. The comparison results are provided in Table 3.

We observe that our proposed framework consistently outperforms a set of widely used baseline methods and strong competitors on four diverse datasets. These datasets are collected using different EEG hardware, ranging from high-precision medical equipment to off-the-shelf EEG headsets, with different numbers of EEG channels. To be more specific, in the MIR scenario, the local dataset yields slightly lower accuracy than the public dataset, but it still achieves an accuracy of 97.08%. The minor performance difference arises because the off-the-shelf commercial headset has fewer channels and lower measuring precision than the medical equipment. As for seizure diagnosis in ND, setting the normal state as impostor and the seizure state as genuine, our approach achieves a False Acceptance Rate (FAR) of 0.0033 and a False Rejection Rate (FRR) of 0.0017. This outperforms the existing methods by a large margin (Acar et al., 2007; Goodwin and Harabagiu, 2017; Golmohammadi et al., 2017b; Harati et al., 2015).
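The FAR and FRR quoted for seizure diagnosis follow the usual biometric-style definitions once the seizure state is treated as genuine and the normal state as impostor. The sketch below shows that computation on a toy label list; the function name and the toy data are illustrative and are not taken from the experiments.

```python
def far_frr(y_true, y_pred, genuine=1):
    """Return (FAR, FRR) for binary labels.

    FAR: fraction of impostor samples predicted as genuine.
    FRR: fraction of genuine samples predicted as impostor.
    """
    genuine_total = impostor_total = 0
    false_accepts = false_rejects = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == genuine:
            genuine_total += 1
            if pred != genuine:
                false_rejects += 1
        else:
            impostor_total += 1
            if pred == genuine:
                false_accepts += 1
    if genuine_total == 0 or impostor_total == 0:
        raise ValueError("need at least one genuine and one impostor sample")
    return false_accepts / impostor_total, false_rejects / genuine_total


# Toy example: 1 = seizure (genuine), 0 = normal (impostor).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
far, frr = far_frr(y_true, y_pred)
print(far, frr)
```

In the toy data one of six normal samples is accepted as seizure and one of four seizure samples is rejected, so the two rates are 1/6 and 1/4.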

Figure 6. Latency comparison
Figure 7. Different # of channels
Figure 8. Varying # of channel signals

4.2.2. Latency Analysis

Besides high accuracy of EEG signal classification, low latency is another critical requirement for the success of real-world BCI applications.

In this section, we compare the latency of the proposed framework with several typical state-of-the-art algorithms; the results are presented in Figure 6. Our approach has competitive latency compared with the other methods, and its overall latency is less than 1 second. The deep learning based techniques in this work do not noticeably add latency. One of the main reasons may be that the reinforced selective attention has filtered out unnecessary information. To be more specific, the classification latency of the proposed framework is about 0.7 to 0.8 seconds, which mainly results from the convolutional mapping and the classification procedure. The convolutional mapping takes only 0.05 seconds at test time, although it takes more than ten minutes to train. The latency caused by the classifier is around 0.7 seconds.
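A per-stage latency breakdown of this kind can be obtained with a simple wall-clock timer around each test-time stage. The sketch below is one way to do it, not the measurement procedure used in the paper; the two stage functions are hypothetical stand-ins for the convolutional mapping and the classifier.

```python
import time


def measure_latency(stages, sample):
    """Run `sample` through each (name, fn) stage in order, timing every stage."""
    timings = {}
    data = sample
    for name, fn in stages:
        start = time.perf_counter()
        data = fn(data)
        timings[name] = time.perf_counter() - start
    return data, timings


# Hypothetical stand-ins for the real test-time components.
def mapping(x):
    return [v * 2.0 for v in x]


def classifier(features):
    return max(range(len(features)), key=features.__getitem__)


label, timings = measure_latency(
    [("convolutional_mapping", mapping), ("classifier", classifier)],
    [0.1, 0.9, 0.3],
)
total_latency = sum(timings.values())
print(label, timings, total_latency)
```

Summing the per-stage timings gives the end-to-end figure, which is how the roughly 0.05 s mapping time and 0.7 s classifier time above combine into a total of 0.7 to 0.8 seconds.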

4.2.3. Resilience Evaluation

In this section, we evaluate the resilience of the proposed method in coping with different numbers of EEG signal channels and with incomplete EEG signals.

In practice, the number of available EEG channels varies for two reasons. First, different EEG devices, whether off-the-shelf or not, have different numbers of channels. Intuitively, the quality of the signal and the information it contains are directly associated with the number of channels. At the same time, devices with more channels are usually more expensive and less portable. For example, the BCI2000 system supports 64-channel EEG signal collection in high-precision clinical applications, while EMOTIV Epoc+ and Insight, with 14 and 5 channels respectively, are designed for contextualized research. We compare three datasets collected from 3 types of EEG collection devices. Figure 7 shows that the proposed method consistently achieves an accuracy of at least 97% on three datasets with 14, 22, and 64 channels, which indicates resilient performance.

Second, an incomplete EEG signal degrades the performance of BCI applications. This can happen when some electrodes become loose because of poor maintenance of the EEG device. To investigate the robustness to incomplete EEG signals with missing channels, we also conduct experiments by randomly selecting a proportion of the signal channels on all four datasets. For example, when selecting 20% of the channels on the emotiv dataset, the number of selected channels is 3 (14 × 0.2 = 2.8, rounded up). The experimental results are shown in Figure 8 (in the chart, values such as 0.4 denote accuracy while values such as 20% denote the channel percentage used for training). The radar chart demonstrates that eegmmidb and EEG-S, both with 64 channels, achieve competitive accuracy even with only 20% of the signal channels. In contrast, emotiv (14 channels) and TUH (22 channels) depend heavily on the number of channels, because only 3 and 5 channels remain, respectively, at a channel percentage of 20%. In our experience, at least 8 EEG channels are required to achieve high accuracy within the proposed framework.
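The missing-channel experiment can be sketched as a random choice of a fixed percentage of channel indices followed by slicing each sample. The code below is an illustration under two assumptions that the text does not state explicitly: the channel count is rounded up (which matches the 3 and 5 channels quoted for emotiv and TUH at 20%), and the same subset is used for every sample. The function names are hypothetical.

```python
import random


def select_channels(n_channels, percent, seed=None):
    """Randomly pick ceil(n_channels * percent / 100) channel indices.

    `percent` is an integer in 1..100; rounding up is an assumption.
    """
    if not 0 < percent <= 100:
        raise ValueError("percent must be in 1..100")
    k = -(-n_channels * percent // 100)  # integer ceiling division
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_channels), k))


def keep_channels(sample, channel_idx):
    """Reduce one sample (a per-channel vector) to the selected channels."""
    return [sample[i] for i in channel_idx]


for name, n in [("emotiv", 14), ("TUH", 22), ("eegmmidb", 64)]:
    idx = select_channels(n, 20, seed=0)
    print(name, len(idx), "of", n, "channels kept")

sample = list(range(14))                   # toy 14-channel sample
subset = select_channels(14, 20, seed=0)
print(keep_channels(sample, subset))
```

With 20% of the channels, the 64-channel datasets still retain 13 channels, above the 8-channel threshold mentioned above, whereas emotiv and TUH fall below it.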

5. Conclusion

This paper proposes a generic and effective framework for raw EEG signal classification to support the development of BCI applications. The proposed approach works directly on raw EEG data without any preprocessing or feature engineering. Moreover, it automatically selects the distinguishable feature dimensions for different EEG data, which greatly improves its generality and deployment potential. Because a sufficient number of EEG signal dimensions is required to achieve high performance, our tuning experience suggests that the EEG signal should contain at least 8 dimensions in order to extract distinctive spatial features (inter-dimension dependency). The replicate and shuffle process cannot always provide the best spatial dependency. Therefore, when the classification accuracy is not satisfactory, repeating the replicate and shuffle procedure may help to improve performance. The experimental results demonstrate that the proposed approach achieves competitive performance on public datasets and a local dataset. Our approach is applicable to a wide range of application scenarios such as intention recognition, person identification, and neurological diagnosis.


  • Acar et al. (2007) Evrim Acar, Canan Aykut Bingol, Haluk Bingol, Rasmus Bro, and Bulent Yener. 2007. Seizure recognition on epilepsy feature tensor. In Engineering in Medicine and Biology Society, 2007. EMBS 2007. 29th Annual International Conference of the IEEE. IEEE, 4273–4276.
  • Adeli et al. (2007) Hojjat Adeli, Samanwoy Ghosh-Dastidar, and Nahid Dadmehr. 2007. A wavelet-chaos methodology for analysis of EEGs and EEG subbands to detect seizure and epilepsy. IEEE Transactions on Biomedical Engineering 54, 2 (2007), 205–211.
  • Alomari et al. (2014a) Mohammad H Alomari, Ayman AbuBaker, Aiman Turani, Ali M Baniyounes, and Adnan Manasreh. 2014a. EEG mouse: A machine learning-based brain computer interface. International Journal of Advanced Computer Science and Applications (IJACSA) 5, 4 (2014), 193–198.
  • Alomari et al. (2014b) Mohammad H Alomari, Ali M Baniyounes, and Emad A Awada. 2014b. EEG-Based Classification of Imagined Fists Movements using Machine Learning and Wavelet Transform Analysis. International Journal of Advancements in Electronics and Electrical Engineering 3, 3 (2014), 83–87.
  • Bai et al. (2017) Zilong Bai, Peter Walker, Anna Tschiffely, Fei Wang, and Ian Davidson. 2017. Unsupervised Network Discovery for Brain Imaging Data. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 55–64.
  • Cavanagh et al. (1992) Patrick Cavanagh and others. 1992. Attention-based motion perception. Science 257, 5076 (1992), 1563–1565.
  • Fraschini et al. (2015) Matteo Fraschini, Arjan Hillebrand, Matteo Demuru, Luca Didaci, and Gian Luca Marcialis. 2015. An EEG-based biometric system using eigenvector centrality in resting state brain networks. IEEE Signal Processing Letters 22, 6 (2015), 666–670.
  • Golmohammadi et al. (2017a) M Golmohammadi, V Shah, S Lopez, S Ziyabari, S Yang, J Camaratta, I Obeid, and J Picone. 2017a. The TUH EEG Seizure Corpus. In Proceedings of the American Clinical Neurophysiology Society Annual Meeting. 1.
  • Golmohammadi et al. (2017b) Meysam Golmohammadi, Amir Hossein Harati Nejad Torbati, Silvia Lopez de Diego, Iyad Obeid, and Joseph Picone. 2017b. Automatic Analysis of EEGs Using Big Data and Hybrid Deep Learning Architectures. arXiv preprint arXiv:1712.09771 (2017).
  • Goodwin and Harabagiu (2017) Travis R Goodwin and Sanda M Harabagiu. 2017. Deep Learning from EEG Reports for Inferring Underspecified Information. AMIA Summits on Translational Science Proceedings 2017 (2017), 112–121.
  • Harati et al. (2015) Amir Harati, Meysam Golmohammadi, Silvia Lopez, Iyad Obeid, and Joseph Picone. 2015. Improved EEG event classification using differential energy. In Signal Processing in Medicine and Biology Symposium (SPMB). IEEE, 1–4.
  • Kim et al. (2016) Youngjoo Kim, Jiwoo Ryu, Ko Keun Kim, Clive C Took, Danilo P Mandic, and Cheolsoo Park. 2016. Motor imagery classification using mu and beta rhythms of eeg with strong uncorrelating transform based complex common spatial patterns. Computational intelligence and neuroscience 2016 (2016), 1–14.
  • Krizhevsky et al. (2012) Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. 2012. Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems. 1097–1105.
  • Ma et al. (2015) Lan Ma, James W Minett, Thierry Blu, and William SY Wang. 2015. Resting state EEG-based biometrics for individual identification using convolutional neural networks. In Engineering in Medicine and Biology Society (EMBC), 2015 37th Annual International Conference of the IEEE. IEEE, 2848–2851.
  • Mnih et al. (2015) Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A Rusu, Joel Veness, Marc G Bellemare, Alex Graves, Martin Riedmiller, Andreas K Fidjeland, Georg Ostrovski, and others. 2015. Human-level control through deep reinforcement learning. Nature 518, 7540 (2015), 529.
  • or Rashid and Ahmad (2016) Md Mamun or Rashid and Mohiuddin Ahmad. 2016. Classification of motor imagery hands movement using levenberg-marquardt algorithm based on statistical features of EEG signal. In Electrical Engineering and Information Communication Technology (ICEEICT), 2016 3rd International Conference on. IEEE, 1–6.
  • Pinheiro et al. (2016) Oberdan R Pinheiro, Lynn RG Alves, MFM Romero, and Josemar R de Souza. 2016. Wheelchair simulator game for training people with severe disabilities. In Technology and Innovation in Sports, Health and Wellbeing (TISHW), International Conference on. IEEE.
  • Rodrigues et al. (2016) Douglas Rodrigues, Gabriel FA Silva, João P Papa, Aparecido N Marana, and Xin-She Yang. 2016. EEG-based person identification through binary flower pollination algorithm. Expert Systems with Applications 62 (2016), 81–90.
  • Russoniello et al. (2009) Carmen V Russoniello, Kevin O’Brien, and Jennifer M Parks. 2009. The effectiveness of casual video games in improving mood and decreasing stress. Journal of CyberTherapy & Rehabilitation 2, 1 (2009), 53–66.
  • Schetinin et al. (2017) Vitaly Schetinin, Livija Jakaite, Ndifreke Nyah, Dusica Novakovic, and Wojtek Krzanowski. 2017. Feature Extraction with GMDH-Type Neural Networks for EEG-Based Person Identification. International journal of neural systems (IJNS) (2017), 1750064.
  • Shenoy et al. (2015) H Vikram Shenoy, AP Vinod, and Cuntai Guan. 2015. Shrinkage estimator based regularization for EEG motor imagery classification. In Information, Communications and Signal Processing (ICICS), 2015 10th International Conference on. IEEE.
  • Sita and Nair (2013) J Sita and GJ Nair. 2013. Feature extraction and classification of EEG signals for mapping motor area of the brain. In Control Communication and Computing (ICCC), 2013 International Conference on. IEEE, 463–468.
  • Stefano Filho et al. (2017) Carlos A Stefano Filho, Romis Attux, and Gabriela Castellano. 2017. EEG sensorimotor rhythms’ variation and functional connectivity measures during motor imagery: linear relations and classification approaches. PeerJ 5 (2017), e3983.
  • Strait and Dewey (1996) Bonnie J Strait and T Gregory Dewey. 1996. The Shannon information entropy of protein sequences. Biophysical journal 71, 1 (1996), 148–155.
  • Szczuko (2017) Piotr Szczuko. 2017. Real and imaginary motion classification based on rough set analysis of EEG signals for multimedia applications. Multimedia Tools and Applications 76, 24 (2017), 25697–25711.
  • Thomas and Vinod (2016) Kavitha P Thomas and A Prasad Vinod. 2016. Biometric identification of persons using sample entropy features of EEG during rest state. In Systems, Man, and Cybernetics (SMC), 2016 IEEE International Conference on. IEEE, 003487–003492.
  • Tokic (2010) Michel Tokic. 2010. Adaptive ε-greedy exploration in reinforcement learning based on value differences. In Annual Conference on Artificial Intelligence. Springer, 203–210.
  • Vallabhaneni et al. (2005) Anirudh Vallabhaneni, Tao Wang, and Bin He. 2005. Brain computer interface. In Neural engineering. Springer, 85–121.
  • Veeriah et al. (2015) Vivek Veeriah, Rohit Durvasula, and Guo-Jun Qi. 2015. Deep learning architecture with dynamically programmed layers for brain connectome prediction. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 1205–1214.
  • Wang et al. (2016) Ziyu Wang, Tom Schaul, Matteo Hessel, Hado Van Hasselt, Marc Lanctot, and Nando De Freitas. 2016. Dueling network architectures for deep reinforcement learning. ICML’16 Proceedings of the 33rd International Conference on International Conference on Machine Learning 48 (2016), 1995–2003.
  • Yang and Deravi (2014) Su Yang and Farzin Deravi. 2014. Novel HHT-based features for biometric identification using EEG signals. In Pattern Recognition (ICPR), 2014 22nd International Conference on. IEEE, 1922–1927.
  • Zhang et al. (2018b) Dalin Zhang, Lina Yao, Xiang Zhang, Sen Wang, Weitong Chen, and Robert Boots. 2018b. EEG-based Intention Recognition from Spatio-Temporal Representations via Cascade and Parallel Convolutional Recurrent Neural Networks. Association for the Advancement of Artificial Intelligence (AAAI) (2018).
  • Zhang et al. (2017) Xiang Zhang, Lina Yao, Chaoran Huang, Quan Z Sheng, and Xianzhi Wang. 2017. Intent Recognition in Smart Living Through Deep Recurrent Neural Networks. In International Conference on Neural Information Processing. Springer, 748–758.
  • Zhang et al. (2018a) Xiang Zhang, Lina Yao, Quan Z Sheng, Salil S Kanhere, Tao Gu, and Dalin Zhang. 2018a. Converting Your Thoughts to Texts: Enabling Brain Typing via Deep Feature Learning of EEG Signals. IEEE International Conference on Pervasive Computing and Communications (PerCom) (2018).
  • Zhang et al. (2018) Yinda Zhang, Shuhan Yang, Yang Liu, Yexian Zhang, Bingfeng Han, and Fengfeng Zhou. 2018. Integration of 24 Feature Types to Accurately Detect and Predict Seizures Using Scalp EEG Signals. Sensors (Basel, Switzerland) 18, 5 (2018).
  • Ziyabari et al. (2017) Saeedeh Ziyabari, Vinit Shah, Meysam Golmohammadi, Iyad Obeid, and Joseph Picone. 2017. Objective evaluation metrics for automatic classification of EEG events. arXiv preprint arXiv:1712.10107 (2017).