Human operators play an essential part in Human-Computer Interaction (HCI). Operators act through standard input devices, a mouse and a keyboard, and read outputs from their monitors to complete interactive tasks. Because the use of input devices is a primary means of improving the quality of interaction between human and computer, understanding human states is the top priority. Human cognitive and affective states influence the performance of operators (Carroll, 1997). If operators are assigned several tasks in succession, the consecutive tasks may deplete their cognitive capacity (Debie et al., 2019), which adversely affects task completion. Researchers have therefore measured cognitive workload with behavioral, physiological, and neurophysiological methods.
Memory in the cognitive process arguably explains the ability to retrieve previously acquired information over time. Cowan (2008) described working memory as a kind of short-term memory. However, human memory has a limited capacity, which leads people to adopt strategies for maintaining information. Because research has shown that human internal states and reactions correlate (Morsella and Bargh, 2011; Rheem et al., 2018), numerous human reactions have been investigated as measures of a person’s workload. Building on the top-down account, in which humans selectively attend to task-relevant stimuli and suppress task-irrelevant ones (Gazzaley and Nobre, 2012), explicit variations in expressive behaviors, such as motor behaviors, have been elicited and matched with cognitive load.
Human hand gestures have been used to gauge a person’s perceptual response to stimuli that imitate genuine human-computer interaction. Mouse dynamics during task completion have been the main focus (Grimes and Valacich, 2015; Rheem et al., 2018; Freihaut and Göritz, 2021; Witte et al., 2021), and these studies agree that mouse usage during task completion indicates the intensity of the cognitive burden. The higher the cognitive load, the less active the mouse usage, as measured by pixel variation, speed, and click pressure (Rheem et al., 2018; Grimes and Valacich, 2015; Witte et al., 2021). However, previous studies required participants to actively move a mouse to complete the given laboratory tasks. Task-irrelevant motor behaviors (Lau and Passingham, 2007) have not been investigated in depth. Task-related motor behaviors can be observed during task completion, but it is ambiguous whether the hand behaviors were derived directly from the tasks or from the internal process.
Studies on unintended behavioral reactions and cognitive load are scarce. When a person’s internal state changes in response to given stimuli, facial expressions or motor behaviors occur (Morsella and Bargh, 2011; Zimmermann et al., 2003). These changes happen unconsciously, yet they can influence cognition or motor outcomes. These behaviors occur spontaneously as self-generated motor behaviors. For example, touching a hot surface makes us pull our hands away immediately. The process of human perception can also be observed as motor behavior beyond conscious awareness. Visual stimulation elicits unconscious actions while humans process complex visual information (Maruya et al., 2007). Finger tapping while listening to music appears as part of processing auditory stimulation (Maes et al., 2014). Given these findings, the question arises as to whether redundant hand gestures can represent a person’s cognitive load.
In this paper, we study the unconscious mouse movements of multiple operators performing the dual n-back test. We investigate the relationship between mouse usage and cognitive load, using physical movement data and self-questionnaire data collected when participants did not need to move their hands. We also analyze the data obtained from the game, presuming that humans make more redundant mouse movements as cognitive load gradually increases. We expect that task-irrelevant mouse behaviors enable us to predict different mental demands.
2. Related Work
Mouse dynamics in HCI have been extensively explored as an indicator of human internal states. Mouse dynamics data vary and include pixel variation, speed, and response time. Rheem et al. (2018) conducted a user study to assess cognitive load in which participants clicked distant circles of different sizes. They found that reaction time and movement velocity in the highest cognitive load condition deviated from those in the two lower levels. Similarly, Grimes and Valacich (2015) had participants compare numbers and click the correct button on the screen, raising the level by requiring them to memorize a few numbers. They found longer distances and slower movements under higher cognitive load. Given these previous studies, we selected mouse dynamics to examine the relationship between human cognition and unconscious actions through a game that can induce cognitive load.
Human workload estimation has been studied extensively for decades. Cognitive load is ambiguous in that its exact amount cannot be measured directly, which led researchers to derive equations for the load (Sampei et al., 2016) using the NASA Task Load Index (NASA-TLX) (Hart and Staveland, 1988). The situations in which humans take part vary and include aircraft (Borghini et al., 2014; Peruzzini et al., 2019), automobiles (Benedetto et al., 2011; Fridman et al., 2018), and human-robot interaction (Rabby et al., 2019). Across these situations, previous studies presented no dominant method. Researchers have adopted single or multiple modalities to predict mental load in various ways. Physiological and neurophysiological sensors are becoming popular, while some researchers have pointed out that a non-intrusive system can measure mental load without disturbing human reactions (Peruzzini et al., 2019; Cech and Soukupova, 2016). Behavioral measures, such as keystrokes or mouse tracking (Evans and Fendley, 2017; Freihaut et al., 2021), have also been selected to predict mental load. The difference between behavioral measures and sensor-based ones is whether the measured activity directly performs the given task. Sensor responses arise as physical reactions caused by the cognitive process, whereas behavioral measures capture the course of action that humans carry out. Load estimates also vary with personal behavior patterns. From the perspective of behavioral measures, workload estimation is mainly related to reactions that are directly associated with information processing and task completion. The n-back task has been widely adopted to induce cognitive load and evaluate working memory capabilities (Jaeggi et al., 2010; Lawlor-Savage and Goghari, 2016; He et al., 2019). The dual n-back presents two kinds of stimuli, visual and auditory, and requires matching the most recent cue with the cue presented n steps earlier.
Unconscious actions, from a psychological perspective, are behavioral responses resulting from cognitive and motor processes that do not involve conscious perception (Morsella and Bargh, 2011). Behaviors that people do not even notice in themselves can also represent the cognitive process. The unconscious or voluntary actions can be indicators of stimulus processing. For example, a person’s motor behaviors reveal processing states driven by visual stimulation beyond their awareness (Maruya et al., 2007). Not only can visual stimuli elicit human motor control, but auditory stimuli also provoke physical activity. Small movements that appear when people listen to music are also caused by cognitive processes that occur intrinsically. Researchers analyze these actions in terms of social interaction as well as expression (Maes et al., 2014), concluding that body movements reflect cognitive processes that can be monitored beyond consciousness.
The background on cognitive workload estimation and unconscious action suggests that task-irrelevant behaviors may predict the degree of mental workload. To measure the load, unconscious actions and eye-related actions during task completion were selected. We chose mouse movements, considering HCI situations, to see whether unaware behaviors that can be monitored through body movements are also cues for estimating cognitive burden. Although mouse activities have drawn attention as an unobtrusive way to investigate affective states and stress (Freihaut et al., 2021), it is uncertain whether unconscious mouse behaviors have an effect similar to that of task-related behaviors.
When experiencing cognitive load, mouse behaviors that are not relevant to task completion, that is, unconscious activity during the dual n-back games, will change across the three different levels in the following measures: frequency of movements (H1a), moving duration (H1b), and movement position changes in pixels (H1c).
We utilized the workload test set of the affective dataset (Jo et al., 2020), which collected human behavioral, physiological, and self-rating responses. The dataset includes 30 participants (11 females and 19 males; ages 18 to 37; mean: 25.1; std: 4.497). As Fig. 1 depicts, the ROSBag dataset provides a facial video, recorded mouse data, and physiological sensor data captured while participants played the dual n-back games. All streams of data can be retrieved by replaying pre-defined topics. For example, as shown in Fig. 1, mouse position changes during the experiment can be retrieved from the ROS topic named /mouse_tracking/position. The ROS topic consists of structured coordinate data with timestamps. The dataset also provides the NASA-TLX self-questionnaire responses obtained after the dual n-back game rounds. Consequently, the dataset allowed us to obtain both the externally observable behaviors of participants and their self-reported cognitive load.
However, for the analysis of mouse behaviors, the data of 27 individuals were used because the data of three participants were only partially recorded across all topics. Partially retrievable data may compromise data integrity; hence, the three participants’ data were excluded. Among the 27 participants, 16 did not wear eyeglasses, and their data also included usable facial and figural videos. Glare on eyeglasses interfered with the analysis of eye behaviors because OpenPose (Cao et al., 2019) could not successfully detect eye features in the facial video streams. Therefore, the data of these 16 participants were utilized in the further validation steps.
The game presents a cue for 3000 milliseconds, requiring players to remember the location or number that they saw or heard n turns back. The dual 1-back game represents the low level, the dual 2-back the medium level, and the dual 3-back the high level. Fig. 2 depicts the entire process of the user experiment with the dual n-back task. Participants first saw a fixation cross for ten seconds before the actual workload inducement started. After the cross, the game continued until participants completed one 60-second round. When the 60 seconds of the game were over, participants filled out the NASA-TLX to rate mental demand, physical demand, temporal demand, performance, effort, and frustration levels.
Unlike the traditional way of answering the two kinds of matches by clicking a button on the Graphical User Interface (GUI) or pressing a key (Jaeggi et al., 2010; Lawlor-Savage and Goghari, 2016), as shown in Fig. 3 (a), this study was designed to demonstrate the link between cognitive load and unconscious behavior by requiring only a mouse click to enter an answer. Therefore, moving the mouse was not necessary to play this n-back game, other than to stop the experiment (a stop button is located at the bottom). This means that mouse movement is not relevant to game completion. As shown in Fig. 3 (b), the GUI included no buttons for indicating position and audio matches. The game only lets participants click the left and right mouse buttons to indicate position or audio matches. For example, participants click the right mouse button when an auditory cue matches the cue presented n steps before. In the GUI, the strings ‘Position Match’ and ‘Audio Match’ give feedback on participants’ answers; they are not buttons for indicating answers. When a participant responds with the left mouse button and the current position does not match the position cue n steps before, the ‘Position Match’ text turns red, giving negative feedback to the participant. A demo video of the dual n-back game used for this study is available at https://youtu.be/ZsUDPl5Yr88.
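The matching rule of the dual n-back game can be illustrated with a minimal sketch. The function names and the encoding of cues as integers are our assumptions for illustration and are not taken from the game implementation; the sketch only expresses that a trial is a match when the current cue equals the cue presented n steps earlier, with position matches answered by the left button and audio matches by the right button.

```python
from typing import List, Sequence, Tuple


def nback_matches(cues: Sequence[int], n: int) -> List[bool]:
    """For each trial, report whether the cue equals the cue n trials earlier.

    The first n trials can never be matches because no earlier cue exists.
    """
    return [i >= n and cues[i] == cues[i - n] for i in range(len(cues))]


def dual_nback_matches(
    positions: Sequence[int], sounds: Sequence[int], n: int
) -> List[Tuple[bool, bool]]:
    """Pair the expected answers per trial: (left click, right click).

    Left click  = position match, right click = audio match.
    """
    return list(zip(nback_matches(positions, n), nback_matches(sounds, n)))


if __name__ == "__main__":
    # Hypothetical cue sequences for a dual 2-back round.
    positions = [1, 2, 1, 2, 3]
    sounds = [7, 7, 7, 4, 7]
    for trial, (left, right) in enumerate(dual_nback_matches(positions, sounds, 2)):
        print(trial, "position match" if left else "-", "audio match" if right else "-")
```

Because the expected answer depends only on the cue history, no cursor movement is needed to respond, which is what makes the recorded mouse movement task-irrelevant.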
3.1. Mouse Data Preprocessing
The affective dataset provides the ROS topic (/mouse_tracking/position) for retrieving mouse behaviors with time information whenever participants moved their mouse during the user study. We used the experimental data recorded while participants played the dual n-back game, which lasted about 60 seconds. As shown at the top of Fig. 4, when the ROS environment replayed the experiment, the user’s mouse position and movement data were extracted together with ROS time information from the saved discrete topics, each of which holds x and y positions with a timestamp indicating when the participant moved the cursor to that position. From the ROS time and mouse position information, we extracted three measures: movement frequency, movement duration, and pixel position change. As shown at the bottom of Fig. 4, a movement is counted toward the movement frequency whenever two positions are extracted from the ROSbag file. Movement duration was extracted from the ROS time recorded with the position data. The pixel position change was calculated as the distance between two mouse cursor locations.
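As a rough illustration of this preprocessing, the following sketch computes the three measures from a list of timestamped cursor positions. It assumes that the positions have already been read from the ROSbag file into plain tuples, that a movement corresponds to a pair of consecutive positions, that the distance is Euclidean, and that duration and pixel change are averaged per movement. The names `MouseSample` and `mouse_features` are hypothetical, and the aggregation is our assumption rather than the exact procedure used in the study.

```python
import math
from typing import Dict, List, NamedTuple


class MouseSample(NamedTuple):
    t_ms: float  # timestamp of the recorded position, in milliseconds
    x: float     # cursor x position, in pixels
    y: float     # cursor y position, in pixels


def mouse_features(samples: List[MouseSample]) -> Dict[str, float]:
    """Compute movement frequency, mean duration, and mean pixel change."""
    # Each pair of consecutive positions is treated as one movement.
    moves = list(zip(samples, samples[1:]))
    if not moves:
        return {"frequency": 0, "mean_duration_ms": 0.0, "mean_pixel_change": 0.0}

    durations = [b.t_ms - a.t_ms for a, b in moves]
    distances = [math.hypot(b.x - a.x, b.y - a.y) for a, b in moves]
    return {
        "frequency": len(moves),
        "mean_duration_ms": sum(durations) / len(moves),
        "mean_pixel_change": sum(distances) / len(moves),
    }


if __name__ == "__main__":
    # Hypothetical samples: two movements, the second without displacement.
    demo = [MouseSample(0, 0, 0), MouseSample(10, 3, 4), MouseSample(30, 3, 4)]
    print(mouse_features(demo))
```

A participant who never moves the mouse during a round yields no position pairs, so all three measures are zero for that round.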
Of the 30 sets, 27 participants were chosen, excluding the three damaged sets of data. First, the extracted mouse trajectories were drawn as images to show how the participants’ unconscious movements are distributed. The trajectories show to what extent individuals moved the cursor during the experiments. All mouse trajectories are presented in the supplementary video. One participant did not show any movement during the dual 3-back game, while three participants kept the mouse fixed during the 1-back game.
Fig. 5 shows the mouse trajectories of two participants in the dataset. Participant P21 showed decreasing movement throughout the experiment (see the top three images). In contrast, participant P27 showed increasing movement as the level increased (see the bottom three images). A notable point in the trajectory drawings is that unconscious movements differ between individuals: mouse-related behaviors may change in opposite directions as the level increases.
We used a one-way repeated measures ANOVA (rmANOVA) to test the variance among levels statistically. The 81 sets of mouse tracking data were tested with the IBM SPSS Statistics software. The test aims to determine whether the three individual levels are statistically different at the 95% confidence level. First, we analyze the data descriptively with means and standard deviations. Each mouse measure is then tested with the ANOVA to find which measure is significant. If a behavior pattern is confirmed, the three levels are compared pairwise on the valid measure.
Table 1 shows the means and standard deviations of the measures. Mouse movement frequency was similar among levels (Low: 66.48, Medium: 66.48, High: 67.40 counts). The pixel changes were highest at the high level. However, the standard deviation shows that the variance among participants was also high: some participants barely moved the mouse, while others moved the cursor widely. Movement duration was longer in the high-level game. The rmANOVA compared the three levels for each mouse measure. Table 2 shows that movement duration and pixel changes did not pass the test. Movement frequency significantly differentiated the three game levels (H1a accepted: F = 3.673, p = .032 < .05). The ANOVA also compared the frequency data between pairs of levels, as presented in Table 3. The low and high workload levels differed significantly, while movement frequency at the medium level could not be distinguished from that at the lower or higher workload level. This result means that participants unconsciously reacted differently at the low and high levels, as shown in their mouse movements. The result could also signify that participants moved the mouse more frequently as the cognitive level increased.
|Measure||Level||Mean||Standard deviation|
|Movement frequency (count)||Low||66.48||11.164|
|Movement duration (milliseconds)||Low||3.27||3.478|
|Movement position change (pixels)||Low||14.70||28.496|
Table 1. Descriptive analysis results (mean and standard deviation) of mouse movement frequency, movement duration, and position change.
|Measure||F||Significance|
|Movement frequency (count)||3.673||.032 (p < .05)|
|Movement duration (milliseconds)||1.699||.193 (p > .05)|
|Movement position change (pixels)||1.711||.191 (p > .05)|
Table 2. rmANOVA test results of mouse movement frequency, movement duration, and pixel position change. Movement frequency showed statistical significance.
|Level comparison||F(1, 26)||Significance|
|Low & Medium||.000||1.000 (p > .05)|
|Low & High||5.286||.030 (p < .05)|
|Medium & High||4.016||.056 (p > .05)|
Table 3. Mouse movement frequency compared between levels. The low and high levels showed a statistically significant difference.
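For readers who want to reproduce the form of this test outside SPSS, the sketch below computes the F statistic of a one-way repeated measures ANOVA from a participants-by-levels table. It is a plain textbook computation without sphericity correction or p-value lookup, so it is not a substitute for the SPSS analysis reported here; the function name `rm_anova_f` and the toy data are our assumptions. With 27 participants and two levels, the degrees of freedom are (1, 26), which matches the header of the level-comparison table.

```python
from typing import List, Tuple


def rm_anova_f(data: List[List[float]]) -> Tuple[float, int, int]:
    """One-way repeated measures ANOVA.

    data[i][j] is the measure of participant i at level j.
    Returns (F, df_levels, df_error).
    """
    n = len(data)     # number of participants
    k = len(data[0])  # number of levels
    grand = sum(sum(row) for row in data) / (n * k)

    level_means = [sum(row[j] for row in data) / n for j in range(k)]
    subject_means = [sum(row) / k for row in data]

    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_levels = n * sum((m - grand) ** 2 for m in level_means)
    ss_subjects = k * sum((m - grand) ** 2 for m in subject_means)
    # Variability left after removing level and participant effects.
    ss_error = ss_total - ss_levels - ss_subjects

    df_levels = k - 1
    df_error = (k - 1) * (n - 1)
    f_value = (ss_levels / df_levels) / (ss_error / df_error)
    return f_value, df_levels, df_error


if __name__ == "__main__":
    # Toy data: 3 participants x 3 levels (not taken from the dataset).
    toy = [[1.0, 2.0, 4.0], [2.0, 3.0, 4.0], [3.0, 4.0, 7.0]]
    print(rm_anova_f(toy))
```

Removing the between-participant variability from the error term is what distinguishes this test from an ordinary one-way ANOVA, and it matters here because individual movement habits differ widely.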
5. Data Validation
We validate the hypothesis that position changes in mouse usage are related to the levels of cognitive load using subjective responses and behavioral measures. The NASA-TLX self-ratings and eye-related behavior were both obtained from the affective dataset (Jo et al., 2020). The dataset provides a video stream through the ROS topic /camera/color/image_raw, recorded at the same time as the mouse tracking data. The self-rating responses were provided along with the ROSBag files.
We examined the dual n-back scores and the NASA-TLX results to determine the difference between the three levels of cognitive load. Fig. 6 shows the game scores and NASA-TLX ratings of the 27 participants. The n-back game scores decreased as the game level rose, as the blue bars in the figure show. The average ratings of mental demand, temporal demand, performance, effort, and frustration increased as the game became more challenging. Table 4 presents the comparison results for the NASA-TLX ratings and game scores. The physical demand ratings did not differ significantly between levels, which means individuals did not perceive physical demands. The main difference between the mouse data and the NASA-TLX is that the self-ratings distinguished the low and medium workload levels.
|Measure||F||Significance|
|Game score||35.200||.000 (p < .05)|
|Mental demand||35.490||.000 (p < .05)|
|Physical demand||2.850||.067 (p > .05)|
|Temporal demand||14.137||.000 (p < .05)|
|Performance||28.939||.000 (p < .05)|
|Effort||18.838||.000 (p < .05)|
|Frustration||31.824||.000 (p < .05)|
Table 4. NASA-TLX ratings and game scores of the three levels of the dual n-back game, analyzed with the rmANOVA test. All ratings other than physical demand showed statistical significance.
5.2. Eye blinking Duration
Eye blinking is one of the unconscious behaviors, as mentioned in Section 2. We analyzed eye blinking frequency and blinking duration because they are effective indicators for measuring cognitive load (Benedetto et al., 2011). The dataset captured participants’ eye-blinking behavior with a front-facing camera in the experimental environment. To detect blinks, we utilized an open-source library, OpenPose (Cao et al., 2019), that uses deep-learning estimation to capture seven landmarks on each eye region. Eye blinking was detected using the landmarks in the eye area (Cech and Soukupova, 2016). Each landmark used in the eye aspect ratio (EAR) calculation is depicted in Fig. 7. An eye blinking event was recorded when the EAR dropped below the given threshold of 0.2. The eye closing time of participants was analyzed in milliseconds. Only 16 sets of data were suitable for analysis because of glare on eyeglasses, as mentioned in Section 3.
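The EAR-based detection can be sketched as follows. The sketch uses the six-point EAR formula from the cited blink detection method (Cech and Soukupova, 2016) and assumes that a blink is a run of consecutive video frames whose EAR lies below the 0.2 threshold, since the EAR approaches zero as the eye closes. How the OpenPose landmarks map onto the six points, the frame period, and the function names are assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def eye_aspect_ratio(p: Sequence[Point]) -> float:
    """EAR from six eye landmarks.

    p[0] and p[3] are the eye corners, p[1] and p[2] lie on the upper lid,
    and p[5] and p[4] are the corresponding points on the lower lid.
    """
    vertical = math.dist(p[1], p[5]) + math.dist(p[2], p[4])
    horizontal = math.dist(p[0], p[3])
    return vertical / (2.0 * horizontal)


def blink_durations_ms(
    ears: Sequence[float], frame_ms: float, threshold: float = 0.2
) -> List[float]:
    """Return the duration of every run of frames with EAR below threshold."""
    durations: List[float] = []
    run = 0  # number of consecutive below-threshold frames seen so far
    for ear in ears:
        if ear < threshold:
            run += 1
        elif run:
            durations.append(run * frame_ms)
            run = 0
    if run:  # a blink still in progress at the end of the recording
        durations.append(run * frame_ms)
    return durations


if __name__ == "__main__":
    # Hypothetical per-frame EAR values at a frame period of 33 ms.
    ears = [0.30, 0.10, 0.10, 0.30, 0.15, 0.30]
    blinks = blink_durations_ms(ears, frame_ms=33.0)
    print(len(blinks), blinks)
```

The number of runs gives the blink frequency and the run lengths give the blink durations, which are the two eye-related measures compared across game levels.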
|Measure||F||Significance|
|Blink frequency||0.038||(p > .05)|
|Blink duration||3.50||.043 (p < .05)|
Table 5. rmANOVA results for eye blinking behavior. Eye blinking duration showed statistical significance.
|Level comparison||F(1, 15)||Significance|
|Low & Medium||1.559||.231 (p > .05)|
|Low & High||6.699||.021 (p < .05)|
|Medium & High||2.015||.176 (p > .05)|
Table 6. Eye blinking duration compared between levels. The low and high levels showed a statistically significant difference.
A one-way rmANOVA was applied to eye blinking frequency and duration to examine whether the three levels are distinct. Of the two eye-related measures, eye blinking duration could also detect different levels of mental load (F = 3.50, p = .043 < .05), as shown in Table 5. Following the same procedure used for the valid mouse measure, we compared the duration between levels. Table 6 shows that the low and high levels differ in participants’ eye blinking duration. The eye blinking duration results agree with the mouse behavior analysis, especially with mouse movement frequency.
We determined that unconscious mouse actions are correlated with human cognitive workload. The statistical analysis confirmed that the frequency of mouse movement, a measure of redundant behavior, distinguishes the different workload levels. Moving duration and changes in cursor location did not. We validated the mouse-related data analysis with the NASA-TLX self-questionnaire answers on workload and with eye blinking behavior. Eye blinking was analyzed with a smaller portion of the dataset because of image distortion and glare on participants’ eyeglasses. Interestingly, however, differentiating human cognitive workload with two kinds of unconscious behaviors was valid. Two measures, mouse movement frequency and eye blinking duration, were significant in distinguishing low and high mental load levels. The self-ratings support that the dual n-back game successfully induced different cognitive load levels.
Three questions regarding the dataset need to be considered. First, the dual n-back game allows users to click mouse buttons, but the number of clicks varies depending on the game level. There are 20 trials in 60 seconds in the dual 1-back game and 18 possible events in the dual 3-back game. The mouse data analysis in Section 4 showed almost equivalent movement frequency among the levels, which suggests that the number of trials does not correlate with the cursor changes. Second, because the mouse-related topic is recorded whenever the cursor moves, the ROS system can also save slight movements. Click events may generate slight movements when participants respond, but the mouse trajectory data were recorded irregularly rather than regularly or linearly. Moreover, recordings with little change occurred when the test subject moved the mouse only once or twice during the experiment. Therefore, the movement that may occur when a mouse button is pressed has not been considered. Finally, the experimental GUI had a stop button that could also induce task-irrelevant mouse trajectories. However, the trajectories at all levels showed no trace around the stop button area, so the stop button did not influence the experiments.
This study assumed that unconscious mouse behavior is derived from cognitive states. However, affective states may also lead participants to make redundant mouse movements. The cognitive workload task in the affective dataset provides the NASA-TLX questionnaire but does not include the SAM rating (Bradley and Lang, 1994), which captures participants’ affective states. Therefore, further verification is needed that the mental load causes the unconscious behavior.
7. Conclusion and Future Work
We concluded that unconscious mouse actions and human cognitive workload are correlated. This study hypothesized that unconscious mouse movements that are not related to the given tasks differentiate levels of human cognitive load. Analyzing mouse movement frequency confirmed that our hypothesis is valid; in other words, human cognitive load can be predicted by observing how many times individuals unconsciously change the location of the mouse. However, the relationship between unconscious mouse usage and the participants’ affective state was not confirmed. As future work, we should investigate whether affective states or cognitive states cause the redundant mouse actions.
Acknowledgements. This material is based upon work supported by the National Science Foundation under Grant No. IIS-1846221. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
- Benedetto et al. (2011). Driver workload and eye blink duration. Transportation Research Part F: Traffic Psychology and Behaviour 14 (3), pp. 199–208.
- Borghini et al. (2014). Measuring neurophysiological signals in aircraft pilots and car drivers for the assessment of mental workload, fatigue and drowsiness. Neuroscience & Biobehavioral Reviews 44, pp. 58–75.
- Bradley and Lang (1994). Measuring emotion: the self-assessment manikin and the semantic differential. Journal of Behavior Therapy and Experimental Psychiatry 25 (1), pp. 49–59.
- Dual n-back training demo mode.
- Cao et al. (2019). OpenPose: realtime multi-person 2D pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence 43 (01), pp. 172–186.
- Carroll (1997). Human-computer interaction: psychology as a science of design. Annual Review of Psychology 48 (1), pp. 61–83.
- Cech and Soukupova (2016). Real-time eye blink detection using facial landmarks. Cent. Mach. Perception, Dep. Cybern. Fac. Electr. Eng. Czech Tech. Univ. Prague, pp. 1–8.
- Cowan (2008). What are the differences between long-term, short-term, and working memory? Progress in Brain Research 169, pp. 323–338.
- Debie et al. (2019). Multimodal fusion for objective assessment of cognitive workload: a review. IEEE Transactions on Cybernetics 51 (3), pp. 1542–1555.
- Evans and Fendley (2017). A multi-measure approach for connecting cognitive workload and automation. International Journal of Human-Computer Studies 97, pp. 182–189.
- Freihaut et al. (2021). Tracking stress via the computer mouse? Promises and challenges of a potential behavioral stress marker. Behavior Research Methods 53 (6), pp. 2281–2301.
- Freihaut and Göritz (2021). Using the computer mouse for stress measurement–an empirical investigation and critical review. International Journal of Human-Computer Studies 145, pp. 102520.
- Fridman et al. (2018). Cognitive load estimation in the wild. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, pp. 1–9.
- Gazzaley and Nobre (2012). Top-down modulation: bridging selective attention and working memory. Trends in Cognitive Sciences 16 (2), pp. 129–135.
- Grimes and Valacich (2015). Mind over mouse: the effect of cognitive load on mouse movement behavior. In 2015 International Conference on Information Systems: Exploring the Information Frontier, ICIS 2015.
- Hart and Staveland (1988). Development of NASA-TLX (Task Load Index): results of empirical and theoretical research. In Advances in Psychology, Vol. 52, pp. 139–183.
- He et al. (2019). High cognitive load assessment in drivers through wireless electroencephalography and the validation of a modified n-back task. IEEE Transactions on Human-Machine Systems 49 (4), pp. 362–371.
- Jaeggi et al. (2010). The relationship between n-back performance and matrix reasoning—implications for training and transfer. Intelligence 38 (6), pp. 625–635.
- Jo et al. (2020). ROSbag-based multimodal affective dataset for emotional and cognitive states. In 2020 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp. 226–233.
- Lau and Passingham (2007). Unconscious activation of the cognitive control system in the human prefrontal cortex. Journal of Neuroscience 27 (21), pp. 5805–5811.
- Lawlor-Savage and Goghari (2016). Dual n-back working memory training in healthy adults: a randomized comparison to processing speed training. PloS one 11 (4), pp. e0151817.
- Maes et al. (2014). Action-based effects on music perception. Frontiers in Psychology 4, pp. 1008.
- Maruya et al. (2007). Voluntary action influences visual competition. Psychological Science 18 (12), pp. 1090–1098.
- Morsella and Bargh (2011). Unconscious action tendencies: sources of “un-integrated” action. pp. 335–347.
- Peruzzini et al. (2019). Transdisciplinary design approach based on driver’s workload monitoring. Journal of Industrial Information Integration 15, pp. 91–102.
- Rabby et al. (2019). An effective model for human cognitive performance within a human-robot collaboration framework. In 2019 IEEE International Conference on Systems, Man and Cybernetics (SMC), pp. 3872–3877.
- Rheem et al. (2018). Use of mouse-tracking method to measure cognitive load. In Proceedings of the Human Factors and Ergonomics Society Annual Meeting, Vol. 62, pp. 1982–1986.
- Sampei et al. (2016). Mental fatigue monitoring using a wearable transparent eye detection system. Micromachines 7 (2), pp. 20.
- Analysis of repeated measurement data in the clinical trials. Journal of Ayurveda and Integrative Medicine 4 (2), pp. 77.
- Witte et al. (2021). Measuring cognitive load for adaptive instructional systems by using a pressure sensitive computer mouse. In International Conference on Human-Computer Interaction, Cham, pp. 209–218.
- Zimmermann et al. (2003). Affective computing—a rationale for measuring mood with mouse and keyboard. International Journal of Occupational Safety and Ergonomics 9 (4), pp. 539–551.