EEG-Based Driver Drowsiness Estimation Using Convolutional Neural Networks

08/08/2018 ∙ by Yuqi Cui, et al. ∙ Huazhong University of Science & Technology

Deep learning, including convolutional neural networks (CNNs), has started finding applications in brain-computer interfaces (BCIs). However, so far most such approaches focused on BCI classification problems. This paper extends EEGNet, a 3-layer CNN model for BCI classification, to BCI regression, and also utilizes a novel spectral meta-learner for regression (SMLR) approach to aggregate multiple EEGNets for improved performance. Our model uses the power spectral density (PSD) of EEG signals as the input. Compared with raw EEG inputs, the PSD inputs can reduce the computational cost significantly, yet achieve much better regression performance. Experiments on driver drowsiness estimation from EEG signals demonstrate the outstanding performance of our approach.




1 Introduction

Drowsy driving is one of the most important causes of traffic accidents, second only to alcohol, speeding, and inattention [28]. As a result, it is very important to monitor the driver’s drowsiness level and take actions accordingly. There have been many different approaches [1, 6, 29, 22] for doing so, which can be roughly categorized into two groups:

  1. Contactless detection approaches, which do not require the driver to physically wear any sensors. Their main advantage is convenience of use. Contactless detection approaches can be further classified into two categories:

    1. Computer vision based detection approaches, which can be applied to either the driver or the vehicle.

      When applied to the driver, a typical practice is to place some cameras behind the windshield, which capture the driver’s head in real time. From the video we can compute the eye blink frequency [12, 21], the percentage of eye closure (PERCLOS) [31, 11], the eye movement [15, 16], the head pose [12, 27], etc., which are indicators of drowsiness. The main drawback of these approaches is that they can be easily affected by the lighting condition.

      When applied to the vehicle, usually some cameras are used to capture the relative position of the vehicle in the lane. From lane departure events we can estimate the driver drowsiness [15, 6, 29]. The main drawback of this approach is that it can also be easily affected by lighting and weather, and it may not work when the lane markers are unclear or missing.

    2. Driver-vehicle interaction based detection approaches, which use various sensors to measure the driving patterns, e.g., speeding, tailgating, abrupt braking, inappropriate steering wheel adjustments, etc [23, 29], to infer if the driver is drowsy.

  2. Contact sensor based detection approaches, which require the driver to physically wear some sensors to measure his/her physiological signals, e.g., electroencephalogram (EEG) [37, 26, 34, 35], electrocardiography [20, 26], electromyography [2, 19], respiration [30, 32], galvanic skin response [15, 5], etc. Theoretically, physiological signals are more accurate and reliable drowsiness indicators, as they originate directly from the human body. Their main disadvantages include: 1) the driver’s body movements may introduce artifacts and noise to the physiological signals, and hence reduce the detection accuracy; and, 2) the driver may find such body sensors uncomfortable to wear.

This paper focuses on the contact sensor based detection approaches. More specifically, we consider EEG-based driver drowsiness detection. The main reason is that EEG signals, which directly measure the brain state, have the potential to predict the drowsiness before it reaches a dangerous level. Hence, compared with other approaches, there is ample time to alert the driver to avoid accidents.

There has been research on using deep learning [17, 18] for driver drowsiness classification. This paper considers regression instead of classification. It makes the following three contributions:

  1. It extends EEGNet [24], a convolutional neural network (CNN) originally designed for classification problems in brain-computer interface (BCI), to regression problems.

  2. It uses spectral meta-learner for regression (SMLR) [36], an unsupervised ensemble regression approach, to aggregate multiple EEGNet regression models for improved performance.

  3. Instead of using raw EEG signals as the input to EEGNet, it uses their power spectral density (PSD) at certain frequencies as the input, which significantly reduces the computational cost, and also improves the regression performance.

The remainder of this paper is organized as follows: Section 2 introduces our proposed EEGNet-PSD-SMLR approach. Section 3 presents the details of a drowsy driving experiment in a virtual reality (VR) environment, and the performance comparison of EEGNet-PSD-SMLR with several other approaches. Finally, Section 4 draws conclusions and points out a future research direction.

2 The EEGNet-PSD-SMLR Model

This section introduces our proposed EEGNet-PSD-SMLR model for driver drowsiness estimation.

2.1 EEGNet for Regression

The CNN regression model used in this paper is modified from the EEGNet classification model [24], which has demonstrated outstanding performance in four different BCI applications, i.e., P300 visual-evoked potential, error-related negativity, movement-related cortical potential, and the sensory motor rhythm.

Denote an EEG epoch as $X \in \mathbb{R}^{C \times T}$, where $C$ is the number of channels and $T$ is the number of time samples (or features) per channel. The EEGNet classification and regression architectures are given in Table 1, where $N$ is the number of classes in classification. Observe that the two architectures are identical for the first three layers; the only difference occurs at the fourth layer. The EEGNet classification architecture uses softmax regression for classification, whereas the EEGNet regression architecture uses a dense layer followed by an activation layer for regression. We have tested different activation functions (ReLU, sigmoid, tanh, and linear), and found linear activation gave the best results. So, linear activation was adopted in this paper.

| Layer | Input Size | Operation | Output Size | Number of Parameters |
|---|---|---|---|---|
| 1 | | 16 × Conv1D(C,1) | | |
| | | BatchNorm | | 32 |
| 2 | | 4 × Conv2D(2,32) | | |
| | | BatchNorm | | 8 |
| 3 | | 4 × Conv2D(8,4) | | |
| 4 (Class.) | | Softmax Regression | | |
| 4 (Regr.) | | Dense | | |
| Total | | | | |

Table 1: EEGNet architectures for classification and regression.

2.2 SMLR for EEGNet Regression Model Aggregation

It is well known that neural network models can be easily trapped at local minima. Since the EEGNet regression model is compact and can be trained quickly, we can use ensemble learning to increase its robustness. More specifically, we train 10 different EEGNet regression models by bootstrapping, and then use SMLR [36] to aggregate them.

Consider a regression problem with a continuous value input space $\mathcal{X}$ and a continuous value output space $\mathcal{Y}$. Assume there are $n$ unlabeled samples, $\{x_j\}_{j=1}^n$, with unknown true outputs $\{y_j\}_{j=1}^n$, and $m$ base regression models, $\{f_i\}_{i=1}^m$. The $i$th regression model’s prediction for $x_j$ is $f_i(x_j)$. The goal of SMLR is to accurately estimate $y_j$ by optimally combining $\{f_i(x_j)\}_{i=1}^m$. As shown in Algorithm 1, SMLR consists of two steps: 1) estimate the accuracy of each base regression model; 2) select and combine the strong base regression models.

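Input: $n$ unlabeled samples, $\{x_j\}_{j=1}^n$;
        $m$ base regression models, $\{f_i\}_{i=1}^m$.
Output: The estimated outputs, $\{\hat{y}_j\}_{j=1}^n$.
Apply each $f_i$ to $\{x_j\}_{j=1}^n$ to obtain the estimates $\{f_i(x_j)\}_{j=1}^n$ and assemble them into a vector $\mathbf{f}_i = [f_i(x_1), \dots, f_i(x_n)]^T$;
Compute the covariance matrix $Q \in \mathbb{R}^{m \times m}$ of $\{\mathbf{f}_i\}_{i=1}^m$;
Compute the first leading eigenvector, $\mathbf{v}$, of $Q$;
Perform $k$-means clustering ($k = 2$) on the absolute values of the elements of $\mathbf{v}$;
Identify $S$, the subset of the strong regression models, as those belonging to the cluster with the maximum centroid;
Return $\hat{y}_j$, $j = 1, \dots, n$, obtained by combining $\{f_i(x_j)\}_{i \in S}$.
Algorithm 1 The SMLR algorithm [36].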
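
A minimal NumPy sketch of Algorithm 1 is given below. It is an illustration under stated assumptions, not the implementation of [36]: the two-cluster k-means is a hand-written one-dimensional routine, the final combination is assumed to be a plain average over the selected models, and the names `smlr` and `two_means_1d` are hypothetical.

```python
import numpy as np


def two_means_1d(values, n_iter=100):
    """k-means with k=2 on scalars; returns a mask of the high-centroid cluster."""
    lo, hi = values.min(), values.max()
    if lo == hi:  # all values identical: treat every model as strong
        return np.ones(values.shape, dtype=bool)
    for _ in range(n_iter):
        high = np.abs(values - hi) <= np.abs(values - lo)
        new_lo, new_hi = values[~high].mean(), values[high].mean()
        if new_lo == lo and new_hi == hi:
            break
        lo, hi = new_lo, new_hi
    return high


def smlr(predictions):
    """predictions: array of shape (m models, n samples), with m >= 2.

    Returns the aggregated estimates and the mask of selected models.
    """
    preds = np.asarray(predictions, dtype=float)
    cov = np.cov(preds)                 # m x m covariance, one row per model
    _, eigvecs = np.linalg.eigh(cov)    # eigenvalues come back in ascending order
    v = eigvecs[:, -1]                  # leading eigenvector
    strong = two_means_1d(np.abs(v))    # cluster with the larger centroid
    # Assumption: combine the strong models by simple averaging.
    return preds[strong].mean(axis=0), strong
```

Because only the predictions on unlabeled samples are needed, the same function could in principle be applied to any set of base regressors, for example the 10 bootstrapped EEGNet regression models mentioned above.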

3 Experiment and Results

3.1 Dataset

The experiment setup used in this paper was identical to that in [34, 36]. Sixteen healthy subjects with normal or corrected-to-normal vision were recruited to participate in a sustained-attention driving experiment [8, 7], which consisted of a real vehicle mounted on a motion platform with six degrees of freedom immersed in a 360-degree VR scene. Each subject performed the experiment for about 60-90 minutes in the afternoon when the circadian rhythm of sleepiness reached its peak. To induce drowsiness during driving, the VR scene simulated monotonous driving at 100 km/h on a straight and empty highway. During the experiment, random lane-departure events were introduced every 5-10 seconds, and participants were instructed to steer the vehicle to compensate for them immediately. Their response time was recorded and later converted to a drowsiness index (see the next subsection), as research has shown that it has strong correlation with fatigue [21]. Participants’ scalp EEG signals were recorded using a 500 Hz 32-channel Neuroscan system (30-channel EEGs plus 2-channel earlobes).

3.2 Preprocessing

The 16 subjects had different lengths of experiment, because the disturbances were presented randomly every 5-10 seconds. Data from one subject was not recorded correctly, so we used only 15 subjects. To ensure a fair comparison, we used the first 3,600 seconds of data for each subject.

We defined a function [34, 36] to map the response time $\tau$ to a drowsiness index $y \in [0, 1]$:

$$y = \max\left\{0, \; \frac{1 - e^{-(\tau - \tau_0)}}{1 + e^{-(\tau - \tau_0)}}\right\}$$

$\tau_0 = 1$ was used in this paper, as in [34, 36]. The drowsiness indices were then smoothed using a 90-second square moving-average window to reduce variations. This does not reduce the sensitivity of the drowsiness index because previous research showed that the cycle lengths of drowsiness fluctuations are longer than four minutes [25].
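
The mapping and smoothing steps can be sketched as follows. The code assumes the mapping $y = \max\{0, (1 - e^{-(\tau - \tau_0)}) / (1 + e^{-(\tau - \tau_0)})\}$ with $\tau_0 = 1$, as we understand it from [34, 36], and it assumes a trailing (causal) 90-second window over irregularly timed lane-departure events. The window alignment is not stated in the text, and the function names are hypothetical.

```python
import numpy as np


def drowsiness_index(tau, tau0=1.0):
    """Map response time tau (seconds) to a drowsiness index in [0, 1)."""
    z = np.exp(-(np.asarray(tau, dtype=float) - tau0))
    return np.maximum(0.0, (1.0 - z) / (1.0 + z))


def smooth_trailing(times, values, window=90.0):
    """Square moving average over the events in the last `window` seconds.

    Assumption: the window ends at the current event (causal smoothing).
    """
    times = np.asarray(times, dtype=float)
    values = np.asarray(values, dtype=float)
    out = np.empty_like(values)
    for i, t in enumerate(times):
        in_window = (times > t - window) & (times <= t)
        out[i] = values[in_window].mean()
    return out
```

Under this mapping, response times at or below $\tau_0$ give an index of zero, and longer response times push the index toward one.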

We used EEGLAB [10] for EEG signal preprocessing. A 1-50 Hz band-pass filter was applied to remove high-frequency muscle artifacts, line-noise contamination and direct current drift. Next the EEG data were downsampled from 500 Hz to 250 Hz and re-referenced to averaged earlobes.

We tried to predict the drowsiness index for each subject every 3 seconds. All 30 EEG channels were used in feature extraction. We epoched 30-second EEG signals right before each sample point, computed the power spectral density (PSD) in the theta and alpha bands (4-12 Hz) for each channel using Welch’s method [33], and converted them into dBs. Each channel had 67 such PSD points at different frequencies. Some channels may have dBs significantly larger than others, which degraded the regression performance. So we removed channels which had at least one dB larger than 20, and normalized the dBs of all remaining channels to mean zero and standard deviation one. Assume the number of remaining channels is $C$ (usually $C$ is about 30). Then, the input matrix to our EEGNet regression model has dimensionality $C \times 67$.
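
The feature extraction described above can be sketched with `scipy.signal.welch`. This is an illustration only: the segment length (`nperseg=2048`), the default Hann window and overlap, and the per-channel normalization across frequency points are assumptions not stated in the text, so the number of PSD points per channel need not equal 67. The function name `psd_features` is hypothetical.

```python
import numpy as np
from scipy.signal import welch


def psd_features(epoch, fs=250.0, band=(4.0, 12.0), nperseg=2048, db_limit=20.0):
    """epoch: (channels, samples) EEG array, e.g. 30 x 7500 for 30 s at 250 Hz.

    Returns the normalized dB features and a mask of the channels kept.
    """
    freqs, pxx = welch(epoch, fs=fs, nperseg=nperseg, axis=-1)
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    db = 10.0 * np.log10(pxx[:, in_band])
    # Drop every channel that has at least one PSD point above the dB limit.
    keep = (db <= db_limit).all(axis=1)
    db = db[keep]
    # Assumption: each remaining channel is normalized to zero mean, unit std.
    z = (db - db.mean(axis=1, keepdims=True)) / db.std(axis=1, keepdims=True)
    return z, keep
```

The returned matrix has one row per remaining channel, which matches the channels-by-frequencies layout of the input to the EEGNet regression model.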

3.3 Algorithms

We used data from 14 subjects to build a regression model for the 15th subject, simulating the scenario that we already collected data from 14 subjects and need to use their data to help estimate the drowsiness level for a new driver. We repeated this process 15 times so that each subject had a chance to be the “new” driver.
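
This leave-one-subject-out protocol can be sketched as a small generator. The function name and the integer subject identifiers are hypothetical; model training and evaluation would be plugged into the loop body.

```python
def leave_one_subject_out(subject_ids):
    """Yield (source_subjects, target_subject) pairs, one per subject."""
    for target in subject_ids:
        sources = [s for s in subject_ids if s != target]
        yield sources, target


# Usage: 15 subjects give 15 splits, each with 14 source subjects.
for sources, target in leave_one_subject_out(list(range(1, 16))):
    pass  # train on `sources`, evaluate on `target`
```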

We compared the performance of the following five algorithms:

  1. Ridge regression based on principal component features (RR), which is the baseline. This method was first used in [34]. It combined data from all existing 14 subjects and extracted average PSDs in the theta band as features. Similar to the case in Section 3.2, some channels may have extremely large average PSDs, which were removed (using a 20 dB threshold) for better regression performance. We then normalized the dBs of each remaining channel to mean zero and standard deviation one, and extracted a few (usually around 10) leading principal components, which accounted for 95% of the variance. The projections of the dBs onto these principal components were then used as our features. Finally, we built a ridge regression model for the 15th subject.

  2. RR based on principal component features and SMLR (RR-SMLR). This is the method proposed in [36]. We built 14 RR models, each one using only one source subject’s data as the training dataset. Feature extraction was the same as in RR. After obtaining 14 models trained on different datasets, we used SMLR to aggregate them for the target subject.

  3. EEGNet regression model using band-passed EEG inputs (EEGNet), which used the EEGNet regression architecture described in Section 2.1. EEG signals, after 1-50 Hz band-pass filtering, were used as input. So, the input dimensionality was $30 \times 7500$ (the second dimensionality was 7500 because we used 30-second EEG signals for estimation, and the sampling rate was 250 Hz).

  4. EEGNet regression model using the PSD features (EEGNet-PSD). The EEGNet regression architecture was identical to the one in EEGNet, but the PSD features described in Section 3.2 were used as its input.

  5. EEGNet-PSD with SMLR (EEGNet-PSD-SMLR), which was the above EEGNet-PSD model combined with SMLR ensemble learning, as described in Section 2.2.
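
The RR baseline in item 1 (principal components covering 95% of the variance, followed by ridge regression) can be sketched in NumPy as below. The ridge penalty `alpha` is not given in the text and is a placeholder, and the function name `fit_pca_ridge` is hypothetical.

```python
import numpy as np


def fit_pca_ridge(X, y, var_kept=0.95, alpha=1.0):
    """X: (samples, features) normalized dB features; y: drowsiness indices.

    Returns a prediction function and the number of components kept.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    _, s, vt = np.linalg.svd(Xc, full_matrices=False)
    ratio = np.cumsum(s ** 2) / np.sum(s ** 2)
    k = int(np.searchsorted(ratio, var_kept) + 1)  # fewest components reaching var_kept
    comps = vt[:k]
    Z = Xc @ comps.T                               # projections onto the components
    y_mean = y.mean()
    # Closed-form ridge solution on the projected features.
    w = np.linalg.solve(Z.T @ Z + alpha * np.eye(k), Z.T @ (y - y_mean))

    def predict(X_new):
        return ((X_new - mean) @ comps.T) @ w + y_mean

    return predict, k
```

For RR-SMLR, one such model would be fitted per source subject and the resulting predictions on the target subject aggregated by SMLR.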

Each algorithm was repeated 10 times so that statistically meaningful results could be obtained. The performance measures were the root mean square error (RMSE) and the correlation coefficient (CC), as in [34, 36].
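
Both performance measures are straightforward to compute. The sketch below assumes that CC denotes the Pearson correlation coefficient between the true and the estimated drowsiness indices; the function names are hypothetical.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean square error between true and estimated values."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


def cc(y_true, y_pred):
    """Pearson correlation coefficient (assumed meaning of CC)."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])
```

A smaller RMSE and a larger CC both indicate better estimation.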

3.4 Results and Discussions

The experimental results are shown in Fig. 1 and Table 2. Observe that:

  1. EEGNet, which used band-passed EEG signals as the input, had the worst RMSE and CC for most subjects and also on average. This is because the input feature had very large dimensionality (7500 time samples per channel, i.e., $T = 7500$ in Table 1), so there were about 8820 parameters in this model. In contrast, the number of training samples was comparatively small, which may not be enough to fully optimize these parameters.

  2. EEGNet-PSD, which had about 67 PSD points in each channel, achieved better RMSE and CC than both RR and EEGNet for most subjects. This demonstrates that the PSD features are better than the band-passed EEG temporal features. Because of the much smaller dimensionality, training time of EEGNet-PSD was also reduced significantly compared with EEGNet.

  3. EEGNet-PSD-SMLR, which is an ensemble of multiple EEGNet-PSD models aggregated by SMLR, achieved performance comparable to RR-SMLR, which was our best approach on this driving dataset. On average its RMSE was smaller than that of EEGNet-PSD, and its CC was larger than that of EEGNet-PSD. This suggests that SMLR can indeed improve the learning performance.

Figure 1: (a) RMSEs and (b) CCs of the five approaches on the 15 subjects. The last group in each subfigure shows the average performance across the 15 subjects.
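| Measure | RR | RR-SMLR | EEGNet | EEGNet-PSD | EEGNet-PSD-SMLR |
|---|---|---|---|---|---|
| RMSE | 0.2587 | 0.2371 | 0.3208 | 0.2394 | |
| CC | 0.5994 | | 0.3499 | 0.6215 | 0.6379 |

Table 2: Average performances of the five algorithms on the 15 subjects.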

We also performed a two-way Analysis of Variance (ANOVA) for the five algorithms to check if the RMSE and CC differences among them were statistically significant, by setting the subjects as a random effect. The results are shown in Table 3, which shows that there were statistically significant differences (at 5% level) for both RMSEs and CCs.

Table 3: $p$-values of two-way ANOVA tests for the five algorithms.

Then, non-parametric multiple comparison tests based on Dunn’s procedure [13, 14] were used to determine if the difference between any pair of algorithms was statistically significant, with a $p$-value correction using the False Discovery Rate method [4]. The $p$-values are shown in Table 4, where the statistically significant ones are marked in bold. Observe that the RMSE differences and the CC differences between EEGNet-PSD-SMLR and RR/EEGNet were statistically significant, but the differences between EEGNet-PSD-SMLR and EEGNet-PSD/RR-SMLR were not.
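
The correction step can be illustrated with the Benjamini-Hochberg procedure of [4]. The sketch below covers only the adjustment of a given set of raw $p$-values; Dunn’s rank-sum test itself is not reproduced here, and the function name is hypothetical.

```python
import numpy as np


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working from the largest p-value downwards.
    adjusted_sorted = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(m)
    adjusted[order] = np.minimum(adjusted_sorted, 1.0)
    return adjusted
```

A pairwise difference would then be declared significant when its adjusted $p$-value falls below the chosen level, e.g. 0.05.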

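| Measure | Algorithm | RR | RR-SMLR | EEGNet | EEGNet-PSD |
|---|---|---|---|---|---|
| RMSE | RR-SMLR | **.0040** | | | |
| | EEGNet | **.0000** | **.0000** | | |
| | EEGNet-PSD | **.0087** | .3757 | **.0000** | |
| | EEGNet-PSD-SMLR | **.0007** | .3239 | **.0000** | .2416 |
| CC | RR-SMLR | **.0015** | | | |
| | EEGNet | **.0000** | **.0000** | | |
| | EEGNet-PSD | .0731 | .0767 | **.0000** | |
| | EEGNet-PSD-SMLR | **.0055** | .3226 | **.0000** | .1550 |

Table 4: $p$-values of non-parametric multiple comparisons for the five algorithms.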

4 Conclusions

This paper focused on the much under-studied regression problems in BCI, particularly, driver drowsiness estimation from EEGs. It has extended EEGNet, a 3-layer CNN model for BCI classification, to BCI regression, and also utilized SMLR to aggregate multiple EEGNets for improved performance. Another novelty of our model is that it uses the PSD of EEG signals as the input, instead of raw EEG signals. In this way it can reduce the computational cost significantly, yet achieve much better regression performance. Experiments showed that EEGNet-PSD-SMLR achieved performance comparable to that of our best regression model proposed recently.

Recently, Riemannian geometry features have demonstrated outstanding performance in several BCI classification applications [9, 3]. Our latest research [38] has also shown that Riemannian geometry features can outperform the traditional powerband features in an EEG-based BCI regression problem. Our future research will investigate Riemannian geometry features in the EEGNet and SMLR framework.


  • [1] Abbood, H., Al-Nuaimy, W., Al-Ataby, A., Salem, S.A., AlZubi, H.S.: Prediction of driver fatigue: Approaches and open challenges. In: Proc. 14th UK Workshop on Computational Intelligence. pp. 1–6. Bradford, UK (September 2014)
  • [2] Akin, M., Kurt, M.B., Sezgin, N., Bayram, M.: Estimating vigilance level by using EEG and EMG signals. Neural Computing and Applications 17(3), 227–236 (2008)
  • [3] Barachant, A., Bonnet, S., Congedo, M., Jutten, C.: Classification of covariance matrices using a Riemannian-based kernel for BCI applications. Neurocomputing 112, 172–178 (2013)
  • [4] Benjamini, Y., Hochberg, Y.: Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society, Series B (Methodological) 57, 289–300 (1995)
  • [5] Boon-Leng, L., Dae-Seok, L., Boon-Giin, L.: Mobile-based wearable-type of driver fatigue detection by GSR and EMG. In: Proc. IEEE Region 10 Conf. pp. 1–4 (November 2015)
  • [6] Chacon-Murguia, M.I., Prieto-Resendiz, C.: Detecting driver drowsiness: A survey of system designs and technology. IEEE Consumer Electronics Magazine 4(4), 107–119 (2015)
  • [7] Chuang, C.H., Ko, L.W., Jung, T.P., Lin, C.T.: Kinesthesia in a sustained-attention driving task. Neuroimage 91, 187–202 (2014)
  • [8] Chuang, S.W., Ko, L.W., Lin, Y.P., Huang, R.S., Jung, T.P., Lin, C.T.: Co-modulatory spectral changes in independent brain processes are correlated with task performance. Neuroimage 62, 1469–1477 (2012)
  • [9] Congedo, M., Barachant, A., Andreev, A.: A new generation of brain-computer interface based on Riemannian geometry. arXiv: 1310.8115 (2013)
  • [10] Delorme, A., Makeig, S.: EEGLAB: an open source toolbox for analysis of single-trial EEG dynamics including independent component analysis. Journal of Neuroscience Methods 134, 9–21 (2004)
  • [11] Dinges, D., Grace, R.: PERCLOS: A valid psychophysiological measure of alertness as assessed by psychomotor vigilance. Tech. Rep. FHWA-MCRT-98-006, US Department of Transportation, Federal Highway Administration (1998)
  • [12] Dinges, D.F., Mallis, M.M.: Evaluation of techniques for ocular measurement as an index of fatigue and as the basis for alertness management. Tech. Rep. DOT HS 808 762, National Highway Traffic Safety Administration (1998)
  • [13] Dunn, O.: Multiple comparisons among means. Journal of the American Statistical Association 56, 62–64 (1961)
  • [14] Dunn, O.: Multiple comparisons using rank sums. Technometrics 6, 214–252 (1964)
  • [15] Edwards, J.D., Sirois, W., Dawson, T., Aguirre, A., Davis, B., Trutschel, U.: Evaluation of fatigue management technologies using weighted feature matrix method. In: Proc. 4th Int’l Driving Symposium on Human Factors in Driver Assessment, Training and Vehicle Design. pp. 146–152. Stevenson, WA (July 2007)
  • [16] Eriksson, M., Papanikolopoulos, N.P.: Eye-tracking for detection of driver fatigue. In: Proc. Conf. on Intelligent Transportation Systems. pp. 314–319 (November 1997)
  • [17] Hajinoroozi, M., Mao, Z., Huang, Y.: Prediction of driver’s drowsy and alert states from EEG signals with deep learning. In: Proc. 6th IEEE Int’l Workshop on Computational Advances in Multi-Sensor Adaptive Processing. pp. 493–496. Cancun, Mexico (December 2015)
  • [18] Hajinoroozi, M., Mao, Z., Jung, T.P., Lin, C.T., Huang, Y.: EEG-based prediction of driver’s cognitive performance by deep convolutional neural network. Signal Processing: Image Communication 47, 549–555 (2016)
  • [19] Hu, S., Zheng, G.: Driver drowsiness detection with eyelid related parameters by support vector machine. Expert Systems with Applications 36(4), 7651–7658 (2009)
  • [20] Jahn, G., Oehme, A., Krems, J.F., Gelau, C.: Peripheral detection as a workload measure in driving: effects of traffic complexity and route guidance system use in a driving study. Transportation Research Part F: Traffic Psychology and Behaviour 8(3), 255–275 (2005)
  • [21] Ji, Q., Zhu, Z., Lan, P.: Real-time nonintrusive monitoring and prediction of driver fatigue. IEEE Trans. on Vehicular Technology 53(4), 1052–1068 (2004)
  • [22] Kang, H.B.: Various approaches for driver and driving behavior monitoring: A review. In: Proc. IEEE Int’l Conf. on Computer Vision Workshops. pp. 616–623 (December 2013)
  • [23] Krajewski, J., Sommer, D., Trutschel, U., Edwards, D., Golz, M.: Steering wheel behavior based estimation of fatigue. In: Proc. 5th Int’l Driving Symposium on Human Factors in Driver Assessment, Training and Vehicle Design. pp. 118–124. Big Sky, Montana (June 2009)
  • [24] Lawhern, V.J., Solon, A.J., Waytowich, N.R., Gordon, S.M., Hung, C.P., Lance, B.J.: EEGNet: A compact convolutional network for EEG-based brain-computer interfaces. CoRR abs/1611.08024 (2016)
  • [25] Makeig, S., Inlow, M.: Lapses in alertness: Coherence of fluctuations in performance and EEG spectrum. Electroencephalography and Clinical Neurophysiology 86, 23–35 (1993)
  • [26] Michail, E., Kokonozi, A., Chouvarda, I., Maglaveras, N.: EEG and HRV markers of sleepiness and loss of control during car driving. In: Proc. 30th Annual Int’l Conf. of the IEEE Engineering in Medicine and Biology Society. pp. 2566–2569. Vancouver, BC, Canada (August 2008)
  • [27] Murphy-Chutorian, E., Trivedi, M.M.: Head pose estimation and augmented reality tracking: An integrated system and evaluation for monitoring driver awareness. IEEE Trans. on Intelligent Transportation Systems 11(2), 300–311 (2010)
  • [28] Sagberg, F., Jackson, P., Kruger, H.P., Muzer, A., Williams, A.: Fatigue, sleepiness and reduced alertness as risk factors in driving. Tech. Rep. TOI Report 739/2004, Institute of Transport Economics, Oslo (2004)
  • [29] Sahayadhas, A., Sundaraj, K., Murugappan, M.: Detecting driver drowsiness based on sensors: A review. Sensors 12(12), 16937–16953 (2012)
  • [30] Sharma, M.K., Bundele, M.M.: Design & analysis of K-means algorithm for cognitive fatigue detection in vehicular driver using respiration signal. In: Proc. IEEE Int’l Conf. on Electrical, Computer and Communication Technologies. pp. 1–6. Tamil Nadu, India (March 2015)
  • [31] Sommer, D., Golz, M.: Evaluation of PERCLOS based current fatigue monitoring technologies. In: Proc. Annual Int’l Conf. of the IEEE Engineering in Medicine and Biology. pp. 4456–4459. Buenos Aires, Argentina (August 2010)
  • [32] Tayibnapis, I.R., Koo, D.Y., Choi, M.K., Kwon, S.: A novel driver fatigue monitoring using optical imaging of face on safe driving system. In: Proc. Int’l Conf. on Control, Electronics, Renewable Energy and Communications. pp. 115–120 (September 2016)
  • [33] Welch, P.: The use of fast Fourier transform for the estimation of power spectra: A method based on time averaging over short, modified periodograms. IEEE Trans. on Audio Electroacoustics 15, 70–73 (1967)
  • [34] Wu, D., Chuang, C.H., Lin, C.T.: Online driver’s drowsiness estimation using domain adaptation with model fusion. In: Proc. Int’l Conf. on Affective Computing and Intelligent Interaction. pp. 904–910. Xi’an, China (September 2015)
  • [35] Wu, D., Lawhern, V.J., Gordon, S., Lance, B.J., Lin, C.T.: Offline EEG-based driver drowsiness estimation using enhanced batch-mode active learning (EBMAL) for regression. In: Proc. IEEE Int’l Conf. on Systems, Man and Cybernetics. pp. 730–736. Budapest, Hungary (October 2016)
  • [36] Wu, D., Lawhern, V.J., Gordon, S., Lance, B.J., Lin, C.T.: Spectral meta-learner for regression (SMLR) model aggregation: Towards calibrationless brain-computer interface (BCI). In: Proc. IEEE Int’l Conf. on Systems, Man and Cybernetics. pp. 743–749. Budapest, Hungary (October 2016)
  • [37] Wu, D., Lawhern, V.J., Gordon, S., Lance, B.J., Lin, C.T.: Driver drowsiness estimation from EEG signals using online weighted adaptation regularization for regression (OwARR). IEEE Trans. on Fuzzy Systems 25(6), 1522–1535 (2017)
  • [38] Wu, D., Lawhern, V.J., Lance, B.J., Gordon, S., Jung, T.P., Lin, C.T.: EEG-based user reaction time estimation using Riemannian geometry features. IEEE Trans. on Neural Systems and Rehabilitation Engineering 25(11), 2157–2168 (2017)