Human operator cognitive availability aware Mixed-Initiative control

by   Giannis Petousakis, et al.
University of Birmingham

This paper presents a Cognitive Availability Aware Mixed-Initiative Controller for remotely operated mobile robots. The controller enables dynamic switching between different levels of autonomy (LOA), initiated by either the AI or the human operator. The controller leverages a state-of-the-art computer vision method and an off-the-shelf web camera to infer the cognitive availability of the operator and inform the AI-initiated LOA switching. This constitutes a qualitative advancement over previous Mixed-Initiative (MI) controllers. The controller is evaluated in a disaster response experiment, in which human operators have to conduct an exploration task with a remote robot. MI systems are shown to effectively assist the operators, as demonstrated by quantitative and qualitative results in performance and workload. Additionally, some insights into the experimental difficulties of evaluating complex MI controllers are presented.




I Introduction

In this paper we follow a Variable Autonomy (VA) approach in order to improve Human-Robot Systems (HRS) that are being increasingly used in high risk, safety-critical applications such as disaster response. VA in this context refers to systems with autonomous capabilities of varying scales.

The task of manually controlling the robot, combined with various operator-straining factors, increases the cognitive workload of the operator [1]. Moving towards robotic systems that can actively assist the operators and reduce their workload can lead to reduced errors and improved performance [1, 2]. The advantage of such HRS lies in the complementary capabilities of humans and Artificial Intelligence (AI). In a VA system, control can be traded between a human operator and a robot by switching between different Levels of Autonomy (LOAs) [3], e.g. the robot can navigate autonomously while the operator is multitasking. LOAs refer to the degree to which the robot, or any artificial agent, takes its own decisions and acts autonomously [4]. This work addresses the problem of dynamically changing LOA during task execution using a form of VA called Mixed-Initiative control. Mixed-Initiative (MI) refers to a VA system where both the human operator and the robot’s AI have authority to initiate actions and to change the LOA [5].

In many VA systems found in the literature, the LOA is chosen during the initialization of the system and cannot change on the fly [6, 7]. Other systems allow only the operator to initiate actions, although these are based on the system’s suggestions [8]. Hence, the agents lack the ability to override each other’s actions. Additionally, the robot’s AI initiative is often limited to a specific LOA, e.g. safe mode [7]. Research on MI systems that are able to switch LOA dynamically is fairly limited. Moreover, some of the proposed MI systems are not experimentally evaluated, e.g. [9].

Our previous work [10] proposed a novel AI “expert-guided” MI controller and identified two major challenges: the conflict for control between the operator and the robot’s AI, and the design of context-aware MI controllers. In this paper we extend the MI controller proposed in [10] to tackle these challenges by using information on the cognitive availability of the human operator and by improving the controller’s assumptions. We call our approach Cognitive Availability Aware Mixed-Initiative (CAA-MI) control. Cognitive availability indicates whether the operator’s attention is available to focus on controlling the robot.

In our work, cognitive availability inference is based on the operator’s head pose estimation provided by a state-of-the-art deep learning computer vision algorithm. A low-cost, off-the-shelf web camera mounted on the Operator Control Unit (OCU) streams video to the computer vision algorithm in real time. The head pose estimation provides input to the fuzzy CAA-MI controller, where the cognitive availability inference and the LOA switching decisions are made. Somewhat related to our paper is the work of Gateau et al. [8], which uses cognitive availability to inform decisions on asking the operator for help. However, Gateau et al. [8] use a specialized eye tracker and their work does not involve cognitive availability informed LOA switching.

This work explicitly makes use of the operator’s cognitive availability in the MI controller’s LOA switching decision process. It contributes by: a) extending the MI controller in [10] to explicitly include cognitive availability and active LOA information; b) showing that including those parameters provides a qualitative advancement that tackles assumptions of the original controller; c) providing proof of concept on using state-of-the-art computer vision methods for informing MI control on the operator’s status.

II Expert-guided Mixed-Initiative control with cognitive availability inference

The MI control problem this paper addresses is allowing dynamic LOA switching by either the operator or the robot’s AI towards improving the HRS performance. In this work we assume an HRS which has two LOAs: a) teleoperation, where the human operator has manual control over the robot via a joystick; b) autonomy, where the operator clicks on a desired location on the map and the robot autonomously executes a trajectory to that location.

Operator-initiated LOA switches are based on the operator’s judgment. Previous work [3, 11] showed that humans are able to determine when they need to change LOA in order to improve performance. AI-initiated LOA switches are based on a fuzzy inference engine. The controller’s two states are: a) switch LOA; b) do not switch LOA. The controller uses four input variables: a) an online task effectiveness metric for navigation, the goal-directed motion error; b) the cognitive availability information provided via the deep learning computer vision algorithm; c) the currently active LOA; d) the current speed of the robot. Please refer to Figure 1 for the block diagram of the system.

Fig. 1: The block diagram of the CAA-MI control system.

II-A Goal-directed motion error

The CAA-MI controller uses an expert-guided approach to initiate LOA switches. It assumes the existence of an AI task expert that, given a navigational goal, can provide the expected task performance. Comparing the run-time performance of the system with the expected expert performance yields an online task effectiveness metric. This metric expresses how effectively the system performs the navigation task.

The controller’s task effectiveness metric is the goal-directed motion error (referred to as “error” from now on), which is the difference between the robot’s current motion (i.e. speed in this case) and the motion of the robot required to achieve its goal according to an expert. To provide more context, we extract the expert motion performance from a concurrently active navigation system which is given an idealized (free of unmapped obstacles and noise) view of the robot’s world. In essence, the expert provides an idealized model of possible robot behavior that can be seen as an upper bound on system performance.

Additionally, our controller encodes expert knowledge, learned using machine learning techniques from human operator data gathered in a previous experiment [3], on: a) what is considered to be a large enough error to justify a LOA switch; and b) the time window in which that error needs to accumulate. For detailed information the reader is encouraged to read our previous work [10].
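As a rough illustration of the metric described above, the following Python sketch tracks the difference between an expert speed and the robot’s actual speed over a sliding window. The class name, the use of a windowed mean as the accumulation rule, and the window size and threshold values are all assumptions made for illustration; the actual values are learned from operator data as described in [10].

```python
from collections import deque


class GoalDirectedMotionError:
    """Tracks the gap between expert and actual speed over a sliding window.

    Hypothetical helper: the windowed mean is an assumed accumulation rule,
    and window_size / threshold are placeholders, not the learned values.
    """

    def __init__(self, window_size: int, threshold: float):
        self.errors = deque(maxlen=window_size)
        self.threshold = threshold

    def update(self, expert_speed: float, robot_speed: float) -> float:
        """Store the latest instantaneous error and return the windowed mean."""
        self.errors.append(expert_speed - robot_speed)
        return sum(self.errors) / len(self.errors)

    def is_large(self) -> bool:
        """True once the window is full and the mean error exceeds the threshold."""
        window_full = len(self.errors) == self.errors.maxlen
        return window_full and sum(self.errors) / len(self.errors) > self.threshold


# Illustrative speeds in m/s: the robot slows down while the expert does not.
metric = GoalDirectedMotionError(window_size=3, threshold=0.2)
for expert, actual in [(0.5, 0.5), (0.5, 0.1), (0.5, 0.0), (0.5, 0.0)]:
    mean_error = metric.update(expert, actual)
```

Requiring a full window before flagging a large error reflects the idea that the error must accumulate over time rather than trigger on a single bad sample.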

II-B Cognitive availability via head pose estimation

In this work we consider whether or not the operator is attending to the Graphical User Interface (GUI) to be directly relevant to their current cognitive availability. The CAA-MI controller presented here infers the operator’s cognitive availability by monitoring the operator’s head pose via a computer vision algorithm. By introducing cognitive availability perception capabilities to the HRS, we make a qualitative change to the robot’s ability to perceive information about the operator and their availability to control the robot.

By using a computer vision technique for inferring the cognitive availability of the operator, we avoid intrusive methods such as biometric sensors (e.g. EEG) and the use of cumbersome apparatus. Our aim is to keep the approach generalizable and versatile. The latest head pose estimation techniques are robust in real-time use and do not require specialized image capturing equipment [12, 13], contrary to most state-of-the-art gaze tracking implementations (e.g. [14]). We chose head pose estimation over gaze tracking, preferring the perception of more pronounced natural cues.

Our focus is on using a robust computer vision (CV) algorithm to provide cognitive availability input to the controller rather than on evaluating the various CV algorithms. The Deepgaze algorithm [12] is used in our system for head pose estimation. It makes use of state-of-the-art methods such as a pre-trained Convolutional Neural Network (CNN), is open source and is actively maintained. In our implementation, Deepgaze measures the operator’s head rotation about the vertical axis (yaw). The head rotation is filtered through an exponential moving average (EMA) [15] in order to reduce noise in the measurements. The parameters of the EMA were calculated by analyzing data regarding the attention of the operator to the secondary task in previous experiments. Lastly, baseline data on what values of rotation constitute attending or not attending to the GUI were gathered in a standardized way in a pilot experiment. These were mapped into the fuzzy input for cognitive availability.
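The filtering step described above can be sketched as follows. This is a minimal illustration only: the smoothing factor `alpha`, the 30 degree yaw band and the crisp `is_attending` check are hypothetical stand-ins for the EMA parameters, the pilot baseline data and the fuzzy input used in the paper.

```python
def ema_filter(samples, alpha):
    """Exponential moving average: s_t = alpha * x_t + (1 - alpha) * s_{t-1}."""
    smoothed = []
    state = None
    for x in samples:
        # The first sample initializes the filter state.
        state = x if state is None else alpha * x + (1 - alpha) * state
        smoothed.append(state)
    return smoothed


def is_attending(yaw_deg, max_abs_yaw_deg):
    """Crisp stand-in for the fuzzy input: facing the GUI if yaw is within a band."""
    return abs(yaw_deg) <= max_abs_yaw_deg


# Hypothetical yaw readings in degrees: the operator turns away from the screen.
raw_yaw = [0.0, 2.0, 40.0, 45.0, 50.0, 48.0]
smoothed_yaw = ema_filter(raw_yaw, alpha=0.5)
attending = [is_attending(y, max_abs_yaw_deg=30.0) for y in smoothed_yaw]
```

With these illustrative numbers the first large reading (40 degrees) is smoothed to 20.5 degrees and does not yet count as looking away; only a sustained head turn crosses the band, which is the noise-rejection behavior the EMA is meant to provide.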

II-C The fuzzy rule base

A fuzzy bang-bang controller is used with the Largest of Maxima defuzzification method. We have introduced the active LOA and the cognitive availability of the operator as fuzzy inputs. A hierarchical fuzzy approach is used for the CAA-MI controller, meaning that the first rule activated, in order, has priority over the others. The cognitive availability of the operator and the active LOA take precedence in determining which fuzzy rule is activated. The rules added to the original MI controller ensure that when the operator is not attending to the screen the robot will operate in autonomy, with the rest of the rule base architecture being similar to the previous MI controller.
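To make the rule hierarchy concrete, the sketch below shows one possible way to combine priority rules with Largest of Maxima defuzzification over a two-valued (bang-bang) output. The rule set is an assumed simplification written for illustration: it omits the robot speed input, treats cognitive availability as a crisp boolean, and takes the membership of “error is large” as a given number in [0, 1]. It is not the rule base of the released controller.

```python
NO_SWITCH, SWITCH = 0, 1


def largest_of_maxima(memberships):
    """Return the largest output value whose aggregated membership is maximal."""
    peak = max(memberships.values())
    return max(value for value, mu in memberships.items() if mu == peak)


def decide(attending, active_loa, error_large):
    """Assumed, simplified rule hierarchy.

    attending:   bool, crisp stand-in for the cognitive availability input
    active_loa:  "teleoperation" or "autonomy"
    error_large: membership of "error is large", in [0, 1]
    """
    if not attending:
        # Priority rules: an unavailable operator means the robot should be
        # in autonomy, regardless of the error.
        if active_loa == "teleoperation":
            memberships = {SWITCH: 1.0, NO_SWITCH: 0.0}
        else:
            memberships = {SWITCH: 0.0, NO_SWITCH: 1.0}
    else:
        # Lower-priority rule: switch when the error is large.
        memberships = {SWITCH: error_large, NO_SWITCH: 1.0 - error_large}
    return largest_of_maxima(memberships)


decisions = [
    decide(False, "teleoperation", 0.0),  # operator away, robot teleoperated
    decide(False, "autonomy", 1.0),       # operator away, robot autonomous
    decide(True, "teleoperation", 0.8),   # operator present, large error
    decide(True, "autonomy", 0.2),        # operator present, small error
]
```

In this sketch an unavailable operator overrides the error entirely: the second case keeps the robot in autonomy even though the error membership is at its maximum.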

III Experimental evaluation

This experiment evaluates the CAA-MI controller and investigates its potential advantages over the previously built MI controller [10] (referred to simply as the MI controller for the rest of the paper). The Gazebo simulated robot was equipped with a laser range finder and an RGB camera. The software was developed in the Robot Operating System (ROS) and is described in more detail in [10]. The ROS code for the CAA-MI controller is provided under the MIT license in our repository [16]. The robot was controlled via an Operator Control Unit (OCU) (see Figure 2) including a GUI (Figure 3). The experiment’s test arena was approximately . The software used for the secondary task was OpenSesame [17] and the images used as stimuli were previously validated for mental rotation tasks in [18].

Fig. 2: (a) The experimental apparatus, composed of a mouse and a joystick, a laptop and a desktop computer, a screen showing the GUI, a web camera mounted on the screen, and a laptop presenting the secondary task. (b) A typical example of the secondary task.
Fig. 3: The GUI: left, video feed from the camera, the LOA in use and the status of the robot; right, the map showing the position of the robot, the current goal (blue arrow), the AI planned path (green line), the obstacles’ laser reflections (red) and the walls (black). The letters denote the waypoints to be explored.

The participants were tasked with a primary exploration task and a cognitively demanding secondary task. The primary task was the exploration of the area along a set of waypoints in order to identify the number of cardboard boxes placed on those waypoints. In the secondary task the operators were presented with pairs of 3D objects and were required to verbally state whether the two objects were identical or mirrored.

Artificially generated sensor noise was used to degrade the performance of the autonomous navigation. The secondary task was used to degrade the performance of the operator. Each of these performance-degrading situations occurred once; they occurred concurrently (i.e. overlapping) and at a random point in time.

A total of 8 participants took part in a within-groups design in which every participant performed two identical trials, one with each controller: the CAA-MI controller and the MI controller. The order of the two controllers was counterbalanced across participants. Each participant underwent extensive standardized training ensuring a minimum skill level and understanding of the system. Participants were instructed to perform the primary task as quickly and safely as possible. They were also instructed that, when presented with the secondary task, they should do it as quickly and as accurately as possible. They were explicitly told that they should give priority to the secondary task over the primary task and should only perform the primary task if the workload allowed. Lastly, after every trial, participants completed a raw NASA-TLX workload questionnaire.
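For readers unfamiliar with the raw NASA-TLX, its overall score is the unweighted mean of the six subscale ratings. The sketch below shows this computation; the function name, the dictionary keys and the example ratings are illustrative assumptions and are not data from this experiment.

```python
TLX_SUBSCALES = (
    "mental_demand",
    "physical_demand",
    "temporal_demand",
    "performance",
    "effort",
    "frustration",
)


def raw_tlx(ratings):
    """Raw NASA-TLX: unweighted mean of the six subscale ratings (0-100 each)."""
    missing = [name for name in TLX_SUBSCALES if name not in ratings]
    if missing:
        raise ValueError(f"missing subscale ratings: {missing}")
    return sum(ratings[name] for name in TLX_SUBSCALES) / len(TLX_SUBSCALES)


# Made-up ratings for one hypothetical trial; not data from the paper.
example_ratings = {
    "mental_demand": 60,
    "physical_demand": 20,
    "temporal_demand": 50,
    "performance": 40,
    "effort": 55,
    "frustration": 45,
}
example_score = raw_tlx(example_ratings)
```

The frustration subscale can also be read on its own, as is done in the discussion of the results.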

IV Results

IV-A Statistical analysis

The CAA-MI controller displayed a trend, though not statistically significant (Wilcoxon signed-rank tests and t-tests), of improved performance over the MI controller in terms of fewer LOA switches, more correct answers in the secondary task and lower perceived workload (NASA-TLX), as shown in Table I.


metric | descriptive statistics
primary task completion time | MI: ,
secondary task no. of correct answers | MI: , Baseline: ,
secondary task accuracy (% of correct answers) | MI: , Baseline: ,
NASA-TLX (lower score means less workload) | MI: ,
number of LOA switches | MI: ,
number of AI LOA switches | MI: ,
TABLE I: Descriptive statistics for all the metrics.
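The Wilcoxon signed-rank test mentioned in the statistical analysis compares paired measurements, here one per controller for each participant. The pure-Python sketch below computes only the rank sums of the test, with zero differences dropped and average ranks for ties; it does not compute a p-value. The function name and the paired scores are made up for illustration and are not the experiment’s data.

```python
def wilcoxon_rank_sums(x, y):
    """Return (W_plus, W_minus) for paired samples x and y.

    Zero differences are dropped and tied absolute differences receive
    the average of the ranks they span.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        # Extend j to the end of the run of equal absolute differences.
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return w_plus, w_minus


# Made-up paired scores for 8 hypothetical participants; NOT the paper's data.
scores_a = [6, 5, 7, 4, 6, 8, 5, 7]
scores_b = [4, 5, 6, 5, 3, 5, 4, 4]
w_plus, w_minus = wilcoxon_rank_sums(scores_a, scores_b)
```

The test statistic is usually taken as the smaller of the two rank sums and compared against tabulated critical values or an exact distribution; with only eight pairs, a trend can easily fail to reach significance.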

IV-B Discussion

Performance in the primary task was on the same level for both controllers. This can be attributed to the varied exploration strategies of the operators, as supported by anecdotal evidence. Additionally, it is possible that the performance degradation periods did not last long enough to amplify the advantages of the CAA-MI controller compared to the MI one. Increasing the secondary task duration and complexity could provide clearer insight into the advantages of the CAA-MI controller. Given the above evidence, a trade-off must be found between having a meaningful exploration task (i.e. one that can be used for system evaluation) and minimizing variance in operator strategies.

During the MI controller trials, while the secondary task overlapped with the sensor noise, the robot would switch from autonomy to teleoperation. This negatively affected the operators’ focus on the secondary task, with some of them expressing their frustration. In contrast, this was not observed with the CAA-MI controller, since the robot could perceive that the operator was unavailable and hence maintained, or switched to, autonomy. Supporting evidence can be found in the NASA-TLX frustration scale which, while not statistically significant, showed lower frustration in CAA-MI () compared to MI (). Participants also performed better on the secondary task. In conjunction with the above, a trend towards fewer LOA switches was observed with the CAA-MI controller. This is a possible indication of more efficient LOA switching compared to the MI controller, as the operators relied more on the CAA-MI system’s capabilities.

Lastly, note that our previous work [3, 10] has already demonstrated that dynamic LOA switching (e.g. MI control) is significantly better than pure teleoperation and/or pure autonomy. Hence, it can be inferred that the CAA-MI controller is also advantageous over pure teleoperation or pure autonomy.

V Conclusions and future work

This paper presented a Cognitive Availability Aware Mixed-Initiative (CAA-MI) controller and its experimental evaluation. The CAA-MI controller’s advantage lies in incorporating the operator’s cognitive availability status into the AI’s LOA switching decision process.

Research in MI control often requires integrating different advanced algorithms in robotics and AI. The use of a state-of-the-art deep learning computer vision algorithm demonstrated proof of concept that such algorithms can provide reliable and low-cost advancements to MI control.

Experimental results showed that the CAA-MI controller performed at least as well as the MI controller in the primary exploration task. Also, trends found in the results (e.g. NASA-TLX frustration scale, number of LOA switches) and anecdotal evidence indicate more efficient LOA switching in certain situations compared to the MI controller. We attribute this to the qualitative advantage that the operator’s cognitive status information gives to the CAA-MI controller compared to the MI controller.

In conclusion, we identify the current experimental paradigms as a limitation in testing the full potential of more complex MI systems. This is despite the recent advances in experimental frameworks [3]. Evaluating MI Human-Robot Systems in realistic scenarios involves intrinsic confounding factors that are hard to predict or overcome. Hence, further research should devise a more complex experimental protocol along the lines proposed in the discussion, i.e. finding a trade-off between realism and meaningful scientific inference.


This work was supported by the following grants of UKRI-EPSRC: EP/P017487/1 (Remote Sensing in Extreme Environments); EP/R02572X/1 (National Centre for Nuclear Robotics); EP/P01366X/1 (Robotics for Nuclear Environments). Stolkin was also sponsored by a Royal Society Industry Fellowship.


  • [1] J. L. Casper and R. R. Murphy, “Human-Robot Interactions During the Robot-Assisted Urban Search and Rescue Response at the World Trade Center,” IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 33, no. 3, pp. 367–385, 2003.
  • [2] H. A. Yanco, A. Norton, W. Ober, D. Shane, A. Skinner, and J. Vice, “Analysis of Human-robot Interaction at the DARPA Robotics Challenge Trials,” Journal of Field Robotics, vol. 32, no. 3, pp. 420–444, 2015.
  • [3] M. Chiou, R. Stolkin, G. Bieksaite, N. Hawes, K. L. Shapiro, and T. S. Harrison, “Experimental analysis of a variable autonomy framework for controlling a remotely operating mobile robot,” IEEE International Conference on Intelligent Robots and Systems, pp. 3581–3588, 2016.
  • [4] T. B. Sheridan and W. L. Verplank, “Human and computer control of undersea teleoperators,” MIT Man-Machine Systems Laboratory, 1978.
  • [5] S. Jiang and R. C. Arkin, “Mixed-Initiative Human-Robot Interaction: Definition, Taxonomy, and Survey,” in IEEE International Conference on Systems, Man, and Cybernetics (SMC), 2015, pp. 954–961.
  • [6] C. W. Nielsen, D. A. Few, and D. S. Athey, “Using mixed-initiative human-robot interaction to bound performance in a search task,” in IEEE International Conference on Intelligent Sensors, Sensor Networks and Information Processing, 2008, pp. 195–200.
  • [7] D. J. Bruemmer, D. A. Few, R. L. Boring, J. L. Marble, M. C. Walton, and C. W. Nielsen, “Shared Understanding for Collaborative Control,” IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans, vol. 35, no. 4, pp. 494–504, 2005.
  • [8] T. Gateau, C. P. C. Chanel, M.-h. Le, and F. Dehais, “Considering Human’s Non-Deterministic Behavior and his Availability State When Designing a Collaborative Human-Robots System,” in IEEE International Conference on Intelligent Robots and Systems (IROS), 2016, pp. 4391–4397.
  • [9] D. J. Bruemmer, J. L. Marble, D. D. Dudenhoeffer, M. O. Anderson, and M. D. Mckay, “Mixed-initiative control for remote characterization of hazardous environments,” in 36th Annual Hawaii International Conference on System Sciences, 2003, pp. 9 pp.–.
  • [10] M. Chiou, N. Hawes, and R. Stolkin, “Mixed-Initiative variable autonomy for remotely operated mobile robots,” arXiv preprint 1911.04848, 2019. [Online]. Available:
  • [11] M. Chiou, G. Bieksaite, N. Hawes, and R. Stolkin, “Human-Initiative Variable Autonomy: An Experimental Analysis of the Interactions Between a Human Operator and a Remotely Operated Mobile Robot which also Possesses Autonomous Capabilities,” AAAI Fall Symposium Series: Shared Autonomy in Research and Practice, pp. 304–310, 2016.
  • [12] M. Patacchiola and A. Cangelosi, “Head pose estimation in the wild using Convolutional Neural Networks and adaptive gradient methods,” Pattern Recognition, vol. 71, pp. 132–143, 2017.
  • [13] N. Ruiz, E. Chong, and J. M. Rehg, “Fine-grained head pose estimation without keypoints,” IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, vol. 2018-June, pp. 2155–2164, 2018.
  • [14] J. Lemley, A. Kar, A. Drimbarean, and P. Corcoran, “Convolutional neural network implementation for eye-gaze estimation on low-quality consumer imaging systems,” IEEE Transactions on Consumer Electronics, vol. 65, no. 2, pp. 179–187, 2019.
  • [15] R. G. Brown, Smoothing, forecasting and prediction of discrete time series.   Courier Corporation, 1963.
  • [16] G. Petousakis and M. Chiou, “Cognitive availability aware mixed-initiative controller code,”, 2019.
  • [17] S. Mathôt, D. Schreij, and J. Theeuwes, “OpenSesame: An open-source, graphical experiment builder for the social sciences,” Behavior Research Methods, vol. 44, no. 2, pp. 314–324, 2012.
  • [18] G. Ganis and R. Kievit, “A New Set of Three-Dimensional Shapes for Investigating Mental Rotation Processes: Validation Data and Stimulus Set,” Journal of Open Psychology Data, vol. 3, no. 1, 2015.