EEG Classification based on Image Configuration in Social Anxiety Disorder

12/07/2018 ∙ by Lubna Shibly Mokatren, et al.

The problem of detecting the presence of Social Anxiety Disorder (SAD) using Electroencephalography (EEG) for classification has seen limited study and is addressed with a new approach that seeks to exploit the knowledge of EEG sensor spatial configuration. Two classification models, one which ignores the configuration (model 1) and one that exploits it with different interpolation methods (model 2), are studied. Performance of these two models is examined for analyzing 34 EEG data channels each consisting of five frequency bands and further decomposed with a filter bank. The data are collected from 64 subjects consisting of healthy controls and patients with SAD. Validity of our hypothesis that model 2 will significantly outperform model 1 is borne out in the results, with accuracy 6--7% higher for model 2 for each machine learning algorithm we investigated. Convolutional Neural Networks (CNN) were found to provide much better performance than SVM and kNNs.




I Introduction

Social Anxiety Disorder (SAD), the world’s third largest mental health care problem, affects of the population [Richards(2017)]. It is characterized by extreme fear and avoidance of social situations and by the fear of negative evaluation from others [Leichsenring and Leweke(2017)]. The diagnostic process for SAD was first characterized in 1980 by the Diagnostic and Statistical Manual of Mental Disorders (DSM-III). The criteria have since evolved, and the most recent description appears in the fifth edition (DSM-5) [Hofmann and Otto(2017)]. In the field of psychiatry, considerable attention must be paid to the reliability and quality of the DSM-5 diagnostic process for SAD in order to give an accurate assessment of the disorder [Kraemer et al.(2012)Kraemer, Kupfer, Clarke, Narrow, and Regier].

Electroencephalography (EEG) is a useful tool for diagnosing mental disorders. EEG provides measurements of brain activity acquired using electrodes placed over the scalp. While other brain imaging techniques, such as positron emission tomography (PET) and functional magnetic resonance imaging (fMRI), are also used in diagnosis, EEG has the important advantages that it captures the temporal activity of the brain and is affordable compared with other methods [Sanei and Chambers(2013)]. The EEG waveform is usually divided into five main frequency bands [Abo-Zahhad et al.(2015)Abo-Zahhad, Ahmed, and Seha]: Delta (δ: up to 4 Hz) waves are the slowest and are generated during drowsiness; Theta (θ: 4-8 Hz) waves are observed during quiet focus or sleep; Alpha (α: 8-15 Hz) waves are observed during relaxation with closed eyes; Beta (β: 15-32 Hz) waves are observed during normal consciousness and active thinking; and Gamma (γ: above 32 Hz) waves are associated with strong electrical signals caused by visual stimulation or by information processing, learning, and perception. All five frequency bands of the EEG signals are used in the analysis for this study.

The use of EEG in SAD diagnosis has seen only limited study. Visual detection of differences in the EEG signals between the SAD patient and control groups is impractical, since the EEG activity appears to be similar in both. Therefore, automated detection using techniques such as machine learning is usually employed, which could lead to more precise diagnoses and can be a first step toward better connectivity analysis, pattern recognition, and understanding of treatment responses in SAD [Moscovitch et al.(2011)Moscovitch, Santesso, Miskovic, McCabe, Antony, and Schmidt].

In the past, many classification algorithms were devised for EEG data [Lotte et al.(2007)Lotte, Congedo, Lécuyer, Lamarche, and Arnaldi], such as linear discriminant analysis, SVM, neural networks, nonlinear Bayesian classifiers, kNN, hidden Markov models, combinations of classifiers, and others. However, none considered the spatial locations and configuration of the EEG channel sensors as a means of possibly achieving better accuracy in analysis or classification tasks. This was a key driving factor in our research. Two classification models, one which ignores the configuration (model 1) and one that exploits it with different interpolation methods (model 2), are considered in this study. We hypothesized that model 2 would significantly outperform model 1. The validity of our hypothesis is borne out in the results, with average accuracy 6-7% higher for model 2 for each machine learning algorithm we investigated. Convolutional Neural Networks (CNN) were found to provide much better performance than SVM and kNN.

II Method

II-A EEG recording

The EEG dataset used in this paper was acquired in the Department of Psychiatry at the University of Illinois at Chicago (UIC). The multi-channel EEG is acquired using an Electro-Cap with electrodes positioned at 34 different locations (channels). The data are gathered from a total of 64 subjects divided into control and SAD patient groups. For this study, the brain activity being analyzed is at resting state, without the introduction of stimuli or task instructions. The duration of the EEG recordings varies from 2 to 7 minutes. The signals are sampled at a sampling frequency of 1024 Hz.

II-B Data Preprocessing

EEG data are complex, as multiple processes take place simultaneously, causing various artifacts and noise to appear, such as eye movements and EMG artifacts. Hence, the data need to be processed and cleaned for better interpretation during analysis. The EEG data preprocessing stages are implemented using the EEGLAB software. The frequencies of interest are in the range of 1-50 Hz, covering the five different frequency bands.

II-C Data Analysis

Nonstationary phenomena are present in EEG data due to the constant switching among the metastable states of neuronal assemblies during brain functioning, which causes signal changes in the form of spikes and momentary events. In our data analysis in the time and frequency domains, each channel signal is divided into windows in which the data are assumed to be stationary. After testing multiple window sizes based on the sampling frequency and the statistical properties, a window size of 5120 samples (5 s at 1024 Hz) was found to yield the best results for detection.
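
The windowing step can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `split_into_windows` is hypothetical, and dropping the trailing samples that do not fill a whole window is an assumption, since the text does not say how the remainder of a recording is handled.

```python
from typing import List, Sequence

SAMPLING_RATE_HZ = 1024   # sampling frequency of the recordings
WINDOW_SAMPLES = 5120     # window length reported in the text (5 s at 1024 Hz)


def split_into_windows(signal: Sequence[float],
                       window: int = WINDOW_SAMPLES) -> List[Sequence[float]]:
    """Cut one channel into consecutive, non-overlapping windows.

    Assumption: samples left over after the last full window are discarded.
    """
    n_windows = len(signal) // window
    return [signal[k * window:(k + 1) * window] for k in range(n_windows)]


# Hypothetical 2-minute recording of a single channel (all zeros).
recording = [0.0] * (120 * SAMPLING_RATE_HZ)
windows = split_into_windows(recording)
print(len(windows), "windows of", len(windows[0]) / SAMPLING_RATE_HZ, "seconds")
```

With recordings of 2 to 7 minutes, each channel would yield on the order of tens of such windows per subject.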

The analysis of the signal content of each of the five main EEG frequency bands can be utilized to estimate subjects’ cognition and emotional states. A dyadic wavelet packet transformation [Fliege(1994)] is used for the decomposition into subbands corresponding to the five frequency bands: [0-4] Hz for the δ band, [4-8] Hz for θ, [8-16] Hz for α, [16-32] Hz for β, and [32-52] Hz for γ.
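
One way to read these band edges is as groups of equal-width wavelet packet nodes. The sketch below assumes, without confirmation from the text, that the 1024 Hz signal is decomposed without resampling to depth 7, which gives nodes 4 Hz wide, and that the nodes are indexed in increasing frequency order. The depth, the ordering, and the helper name `nodes_for_band` are all assumptions made for illustration.

```python
SAMPLING_RATE_HZ = 1024
DEPTH = 7  # assumed depth: (1024 / 2) / 2**7 = 4 Hz per node

# Band edges as listed in the text, in Hz.
BANDS_HZ = {
    "delta": (0, 4),
    "theta": (4, 8),
    "alpha": (8, 16),
    "beta": (16, 32),
    "gamma": (32, 52),
}


def nodes_for_band(low_hz, high_hz, fs=SAMPLING_RATE_HZ, depth=DEPTH):
    """Indices of the frequency-ordered nodes that cover [low_hz, high_hz)."""
    node_width = (fs / 2) / (2 ** depth)
    return list(range(int(low_hz / node_width), int(high_hz / node_width)))


band_nodes = {name: nodes_for_band(lo, hi) for name, (lo, hi) in BANDS_HZ.items()}
for name, nodes in band_nodes.items():
    print(name, nodes)
```

Under these assumptions every band edge, including the upper edge of 52 Hz, falls on a node boundary, so each band is an exact union of nodes.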

The energy of the content of each windowed segment is computed for the five frequency bands separately. The analysis is based on the energy content of these signals represented in two different ways: a concatenation of the channels of the five frequency bands, and an image-like 2D representation of the EEG channel locations. The latter method is discussed in Section II-D. It should be noted that the outputs of the filter bank are ordered from the highest frequency subband to the lowest.
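
The per-window energy computation for the first representation can be sketched as below. It assumes that energy means the sum of squared samples of a subband signal, which is the usual definition but is not spelled out in the text. The helper names and the toy data are hypothetical.

```python
from typing import List, Sequence

N_CHANNELS, N_BANDS = 34, 5


def energy(samples: Sequence[float]) -> float:
    """Signal energy, assumed here to be the sum of squared samples."""
    return sum(x * x for x in samples)


def energy_matrix(subbands: Sequence[Sequence[Sequence[float]]]) -> List[List[float]]:
    """subbands[c][b] holds the samples of band b of channel c in one window.

    Returns one row per channel and one column per frequency band.
    """
    return [[energy(band) for band in channel] for channel in subbands]


# Toy window: band b of every channel is a constant signal of value b + 1.
toy_window = [[[float(b + 1)] * 4 for b in range(N_BANDS)] for _ in range(N_CHANNELS)]
matrix = energy_matrix(toy_window)
print(len(matrix), "x", len(matrix[0]), "first row:", matrix[0])
```

Each such matrix corresponds to one window and therefore to one sample for the classifiers.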

II-D Image representation of the EEG data

The data are acquired using 34 electrodes placed over different areas of the scalp, Frontal (F), Central (C), Temporal (T), Parietal (P) and Occipital (O), as shown in Fig. 1. It is hypothesized that the location of the channels can provide improved detection accuracy in the analysis of the data. To examine this hypothesis, two main data models are examined. First, the 34 channels of the five frequency bands are concatenated by creating a 34 × 5 energy matrix over each window, without accounting for the location of the channel electrodes. Second, a 3-D array is created, where the first two dimensions represent an image whose pixels correspond to the positions of the channels over the scalp, while the third dimension represents the five frequency bands. In the latter method, the pixels that do not correspond exactly to any of the 34 channels are filled in using different interpolation techniques.

The layout of the channel locations is given in Fig. 1. To construct an image-like representation of the electrode layout, an image was created, and the missing pixel energy values were filled in using the following interpolation method [Eckstein(1989)]: to interpolate the value at a given point, only the samples that lie within a fixed distance of that point are used, and the interpolated value is their weighted average, with each weight determined by the distance between the point and the corresponding sample. This method is called Inverse Distance Weighting (IDW). In areas where no energy values exist within that distance, referred to as “Border Points” (BP), the nearest value is simply repeated. The distance threshold is chosen empirically from a set of candidate distances.
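
The interpolation step can be sketched as follows for a single frequency band. This is a minimal sketch, not the authors' implementation. The function name `idw_fill`, the pixel coordinates, and the weight exponent `power` (weights of the form 1/d^power) are assumptions, because the text states only that a distance-limited weighted average is used and that the nearest value is repeated at border points.

```python
import math


def idw_fill(height, width, known, radius, power=1.0):
    """Fill a height x width grid from sparse samples with IDW.

    known  : dict mapping (row, col) of an electrode pixel to its energy value
    radius : only samples closer than this distance contribute
    power  : assumed exponent of the inverse-distance weights
    """
    grid = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            if (r, c) in known:              # electrode pixel: keep measured value
                grid[r][c] = known[(r, c)]
                continue
            dists = [(math.hypot(r - kr, c - kc), v) for (kr, kc), v in known.items()]
            near = [(d, v) for d, v in dists if d < radius]
            if near:                         # weighted average of nearby samples
                weights = [1.0 / d ** power for d, _ in near]
                total = sum(w * v for w, (_, v) in zip(weights, near))
                grid[r][c] = total / sum(weights)
            else:                            # border point: repeat nearest value
                grid[r][c] = min(dists, key=lambda dv: dv[0])[1]
    return grid


# Toy 1 x 5 strip with two hypothetical electrodes.
strip = idw_fill(1, 5, {(0, 0): 1.0, (0, 2): 3.0}, radius=1.5)
print(strip[0])
```

Applying the same routine to each of the five band energies and stacking the results would give the 3-D array of the second model.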

Fig. 1: Layout of 34 electrodes on scalp

It should be noted that other interpolation methods were also examined, such as nearest neighbor interpolation, bilinear interpolation, cubic spline interpolation [Lehmann et al.(1999)Lehmann, Gonner, and Spitzer], and IDW with zeros inserted at the border points (BP). However, they all resulted in inferior performance compared with the method described above. Table IV in Section IV summarizes the average performance of these methods. The image constructed in the second model is tested on the main deep network structure (CNN) described in Section III-B.

Fig. 2: Convolutional neural network structure

III Experiments

After the data are collected and preprocessed, two different models are used, as previously discussed. In the first model, the energy for the five frequency bands is calculated separately in each window of 5120 samples. Hence, for each window an energy matrix of dimensions 34 × 5 is constructed, since there are thirty-four channels (electrodes) and five frequency bands. Each row in the matrix represents one channel. The second model is based on the image representation of the EEG data, as discussed in Section II-D. For each window a 3D energy array is built, where the third dimension represents the five frequency bands. For both models, each matrix or array is considered as a single sample for the training or testing dataset.

III-A Acquisition of training and testing EEG data

To train the network, a stratified 8-fold cross-validation is applied: the data are shuffled randomly, and each fold preserves the proportion of subjects from the two classes. Seven folds (56 subjects) are used for training and validation, and the remaining fold (8 subjects) is used for testing. The classifier is trained 8 separate times, with a different fold used for testing in each trial. Early stopping is applied by monitoring the validation loss to avoid overfitting. In both models, for every window of 5120 samples, an energy matrix is constructed and considered as a single sample. The samples for each of the training, validation and testing sets are gathered by sliding a moving window of 5120 samples with no overlap between windows. This choice was made empirically, as it yields better results than the overlapping windows with smaller shifts that were also tried. In summary, the data set of each subject is divided into multiple samples, and the samples collected for the training, validation and testing sets never overlap. Samples are labeled 0 (negative) if collected from the control group and 1 (positive) if collected from SAD patients. The classification accuracy is the percentage of subjects classified correctly.
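
The subject-level split can be sketched as below. The subject identifiers, the random seed, and the function name are hypothetical, and the 32/32 class balance is taken from the column totals of the confusion matrices in Tables I and II. Splitting by subject rather than by window is what keeps windows of one subject from appearing in both the training and testing sets.

```python
import random


def stratified_subject_folds(labels, n_folds=8, seed=0):
    """Split subjects into folds that preserve the class proportions.

    labels : dict mapping a subject id to 0 (control) or 1 (SAD)
    """
    rng = random.Random(seed)
    folds = [[] for _ in range(n_folds)]
    for cls in sorted(set(labels.values())):
        subjects = sorted(s for s, y in labels.items() if y == cls)
        rng.shuffle(subjects)
        for k, subject in enumerate(subjects):   # deal subjects round-robin
            folds[k % n_folds].append(subject)
    return folds


# Hypothetical ids; 32 controls and 32 patients.
labels = {f"S{i:02d}": (0 if i < 32 else 1) for i in range(64)}
folds = stratified_subject_folds(labels)

for test_fold in folds:
    # The other seven folds would be further divided into training and validation.
    train_val = [s for fold in folds if fold is not test_fold for s in fold]
    assert len(train_val) == 56 and not set(train_val) & set(test_fold)
```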

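For every trial, each testing subject is evaluated by comparing the fraction of its samples that are classified as positive against a threshold: subject i is labeled positive (SAD) when the ratio P_i / N_i is above the threshold and negative (control) otherwise, where P_i is the number of samples classified as positive for subject i and N_i is the total number of samples for subject i. Since some of the EEG data contain unwanted signals, the threshold was lowered from 0.5 to 0.45.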
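
The subject-level decision can be sketched as follows. The function names are hypothetical, and treating a ratio exactly equal to the threshold as positive is an assumption, since the exact form of the comparison is not recoverable from the text.

```python
def classify_subject(sample_predictions, threshold=0.45):
    """Label a subject from the per-window predictions (0 or 1).

    Assumption: a ratio equal to the threshold counts as positive.
    """
    n_positive = sum(1 for p in sample_predictions if p == 1)
    ratio = n_positive / len(sample_predictions)
    return 1 if ratio >= threshold else 0


def subject_accuracy(predictions_by_subject, true_labels, threshold=0.45):
    """Fraction of subjects whose aggregated label matches the true label."""
    correct = sum(
        classify_subject(preds, threshold) == true_labels[subject]
        for subject, preds in predictions_by_subject.items()
    )
    return correct / len(predictions_by_subject)


# Toy example with two hypothetical test subjects.
predictions = {"patient": [1] * 47 + [0] * 53, "control": [1] * 4 + [0] * 6}
truth = {"patient": 1, "control": 0}
print(subject_accuracy(predictions, truth))
```

In the toy example, the patient with 47% positive windows is labeled positive at a threshold of 0.45 but would be labeled negative at 0.5, which illustrates the effect of lowering the threshold.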

III-B CNN Network Structure

Deep learning has significantly improved the performance on many problems compared with other machine learning algorithms [Deng et al.(2014)Deng, Yu, et al.]. Developing deep network-based solutions has been found to be effective in many applications. CNNs have turned out to be the most powerful deep learning architectures for image-related problems [Krizhevsky et al.(2012)Krizhevsky, Sutskever, and Hinton]. The input to the network consists of arrays of subband energy values, each computed over a window of 5120 samples. A sequential model is built for this classification problem, as shown in Fig. 2. The first layer is a 2D convolution with 64 output filters and ReLU activation over the outputs, followed by another 2D convolution layer with the same parameters. This is followed by a max-pooling layer with a pool size of 2 and by dropout with rate 0.25. The output is then flattened and passed to a fully connected layer with an output dimension of 128 and ReLU activation. Another dropout with rate 0.2 is applied, followed by a final fully connected layer with softmax activation and an output dimension of 2. Dropout is a regularization method that is used to reduce overfitting. Other networks with one or more convolutional layers were also tested; however, all produced inferior performance compared with the proposed network. The parameters of this network were chosen over several different trials, and the parameters that yielded the best performance were selected. It should be noted that batch normalization was applied to the input and to the CNN layers to reduce the internal covariate shift [Ioffe and Szegedy(2015)].
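
To make the layer sequence concrete, the sketch below traces tensor shapes and trainable parameter counts through the network. The image size (16 × 16) and the kernel size (3 × 3) are placeholders chosen only for illustration, since neither value is recoverable from the text. Unpadded convolutions with stride 1 are also assumed, and batch normalization parameters are left out of the counts.

```python
# Hypothetical values: 16 x 16 pixels and a 3 x 3 kernel are placeholders.
INPUT_SHAPE = (16, 16, 5)   # height, width, five frequency bands
KERNEL = 3


def conv2d(shape, filters, kernel):
    """Output shape and parameter count of an unpadded, stride-1 2D convolution."""
    h, w, channels = shape
    params = kernel * kernel * channels * filters + filters   # weights + biases
    return (h - kernel + 1, w - kernel + 1, filters), params


def dense(n_inputs, n_units):
    """Output shape and parameter count of a fully connected layer."""
    return (n_units,), n_inputs * n_units + n_units


def trace(shape=INPUT_SHAPE, kernel=KERNEL):
    layers = []
    for name in ("conv2d_1 + ReLU", "conv2d_2 + ReLU"):
        shape, params = conv2d(shape, 64, kernel)
        layers.append((name, shape, params))
    shape = (shape[0] // 2, shape[1] // 2, shape[2])   # max pooling, pool size 2
    layers.append(("max_pool + dropout(0.25)", shape, 0))
    n_flat = shape[0] * shape[1] * shape[2]
    layers.append(("flatten", (n_flat,), 0))
    shape, params = dense(n_flat, 128)
    layers.append(("dense_128 + ReLU + dropout(0.2)", shape, params))
    shape, params = dense(128, 2)
    layers.append(("dense_2 + softmax", shape, params))
    return layers


for name, shape, params in trace():
    print(f"{name:34s} {str(shape):14s} {params}")
```

With these placeholder sizes, most of the parameters sit in the first fully connected layer, which is typical for this kind of small convolutional stack.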

IV Results

In this study, the inputs constructed by the two main models are both tested by feeding them to different machine learning algorithms and deep neural networks. Model 1 is built by concatenating the channels of the five frequency bands with no consideration for the spatial configuration of the EEG electrodes. Model 2 takes the electrode configuration over the scalp into account. The two models are investigated for their classification performance using SVM, kNN, and the CNN proposed in Section III-B.

For the proposed CNN, the confusion matrices for Model 1 and Model 2 are given in Table I and Table II, respectively. The confusion matrix summarizes the prediction results: the first entry of the first column is the number of actual positives (i.e., SAD patients) predicted as patients, the second entry of the first column is the number of actual patients predicted as negative (i.e., healthy subjects), and so on. The accuracy (ability to correctly classify a subject), sensitivity (ability to correctly identify patients when SAD is present), and specificity (ability to correctly identify controls within the healthy group) are all higher for the proposed approach (Model 2). According to the confusion matrices, the overall classification accuracy is 87.5% for Model 2 (56 of 64 subjects) and 79.7% for Model 1 (51 of 64 subjects). Thus, Model 2, which takes advantage of the location of the channels, is found to provide superior performance.
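
The three measures follow directly from the counts in Tables I and II. The short sketch below recomputes them; the function name `metrics` is hypothetical, and the counts are read with columns as the actual condition and rows as the prediction, as described above.

```python
def metrics(tp, fn, fp, tn):
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # patients correctly identified
        "specificity": tn / (tn + fp),   # controls correctly identified
    }


model_1 = metrics(tp=26, fn=6, fp=7, tn=25)   # counts from Table I
model_2 = metrics(tp=29, fn=3, fp=5, tn=27)   # counts from Table II

for name, result in (("Model 1", model_1), ("Model 2", model_2)):
    print(name, {key: round(100 * value, 2) for key, value in result.items()})
```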

In this classification task, the EEG input data are also fed to an SVM with a radial basis function (RBF) kernel, using the LIBSVM tool. The kernel and its hyper-parameters were chosen by evaluating the SVM with different kernels and hyper-parameter settings on the validation set. The data are also applied to k-NN with k=3 and to other deep network structures, but the proposed network structure stood out as superior in terms of overall classification accuracy, as seen in Table III. It should be noted that, in all cases, Model 2 gave significantly higher accuracy than Model 1, establishing the importance of the 2D representation of the spatial configuration of the EEG sensors.
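
For reference, the k-NN baseline with k=3 amounts to a majority vote among the three closest training samples. The sketch below is a plain illustration with toy two-dimensional features. The Euclidean distance and the function name `knn_predict` are assumptions, as the text does not state which distance measure or implementation was used.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training samples closest to `query`.

    Assumption: Euclidean distance between flattened feature vectors.
    """
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Toy features standing in for flattened energy matrices.
train_x = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
train_y = [0, 0, 0, 1, 1, 1]
print(knn_predict(train_x, train_y, [0.2, 0.1]), knn_predict(train_x, train_y, [5.5, 5.5]))
```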

Model 1                    Actual Condition
                       Positive    Negative    Accuracy = 79.69%
Predicted Positive        26           7       Sensitivity = 81.25%
Predicted Negative         6          25       Specificity = 78.13%
TABLE I: Confusion Matrix - Model 1

Model 2                    Actual Condition
                       Positive    Negative    Accuracy = 87.50%
Predicted Positive        29           5       Sensitivity = 90.63%
Predicted Negative         3          27       Specificity = 84.38%
TABLE II: Confusion Matrix - Model 2
Classification Accuracy
Model Model 1 Model 2
Proposed CNN
TABLE III: Average classification performance

To represent the different channels as an image over the scalp in the second model, a few interpolation methods are tested, as mentioned in Section II-D. The average performance of these methods is presented in Table IV, which shows that Inverse Distance Weighting interpolation yielded the best results among the methods tried.

Interpolation Comparison
Interpolation method Average recognition perf.
Nearest Neighbor
IDW with 0 BP
Cubic spline
IDW with NN BP
TABLE IV: Average performance of different interpolation methods

V Conclusion

In this study, it is found experimentally that SAD patients can be identified based on their EEG signals. Furthermore, the spatial configuration of the EEG electrodes is used for the first time in a classification task. When taking advantage of the positions of the channels, the performance of the CNN network shows improvement in accuracy compared with the same CNN in which configuration is ignored.


  • [Richards(2017)] Thomas A. Richards. The social anxiety association, 2017. URL
  • [Leichsenring and Leweke(2017)] Falk Leichsenring and Frank Leweke. Social anxiety disorder. New England Journal of Medicine, 376(23):2255–2264, 2017.
  • [Hofmann and Otto(2017)] Stefan G Hofmann and Michael W Otto. Cognitive Behavioral Therapy for Social Anxiety Disorder: Evidence-Based and Disorder Specific Treatment Techniques. Routledge, 2017.
  • [Kraemer et al.(2012)Kraemer, Kupfer, Clarke, Narrow, and Regier] Helena Chmura Kraemer, David J Kupfer, Diana E Clarke, William E Narrow, and Darrel A Regier. DSM-5: how reliable is reliable enough? American Journal of Psychiatry, 169(1):13–15, 2012.
  • [Sanei and Chambers(2013)] Saeid Sanei and Jonathon A Chambers. EEG signal processing. John Wiley & Sons, 2013.
  • [Abo-Zahhad et al.(2015)Abo-Zahhad, Ahmed, and Seha] M Abo-Zahhad, Sabah Ahmed, and Sherif Nagib Seha. A new EEG acquisition protocol for biometric identification using eye blinking signals. 07:48–54, 05 2015.
  • [Moscovitch et al.(2011)Moscovitch, Santesso, Miskovic, McCabe, Antony, and Schmidt] David A Moscovitch, Diane L Santesso, Vladimir Miskovic, Randi E McCabe, Martin M Antony, and Louis A Schmidt. Frontal EEG asymmetry and symptom response to cognitive behavioral therapy in patients with social anxiety disorder. Biological Psychology, 87(3):379–385, 2011.
  • [Lotte et al.(2007)Lotte, Congedo, Lécuyer, Lamarche, and Arnaldi] Fabien Lotte, Marco Congedo, Anatole Lécuyer, Fabrice Lamarche, and Bruno Arnaldi. A review of classification algorithms for EEG-based brain–computer interfaces. Journal of Neural Engineering, 4(2):R1, 2007.
  • [Fliege(1994)] Norbert J Fliege. Multirate digital signal processing, volume 994. John Wiley New York, 1994.
  • [Eckstein(1989)] Barbara Ann Eckstein. Evaluation of spline and weighted average interpolation algorithms. Computers & Geosciences, 15(1):79–94, 1989.
  • [Lehmann et al.(1999)Lehmann, Gonner, and Spitzer] Thomas Martin Lehmann, Claudia Gonner, and Klaus Spitzer. Survey: Interpolation methods in medical image processing. IEEE Transactions on Medical Imaging, 18(11):1049–1075, 1999.
  • [Deng et al.(2014)Deng, Yu, et al.] Li Deng, Dong Yu, et al. Deep learning: methods and applications. Foundations and Trends® in Signal Processing, 7(3–4):197–387, 2014.
  • [Krizhevsky et al.(2012)Krizhevsky, Sutskever, and Hinton] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems, pages 1097–1105, 2012.
  • [Ioffe and Szegedy(2015)] Sergey Ioffe and Christian Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. arXiv preprint arXiv:1502.03167, 2015.