Teleoperating a robot allows operators to carry out tasks remotely, with the robot as a medium, while viewing its live video feedback. This indirect interaction brings many advantages, including increased motion precision and strength and remote access to work fields that might be inaccessible or hazardous to the operator [30, 2, 14]. However, successfully teleoperating a robot for a task is often difficult and complex because of indirect and often oriented visualization, indirect manipulation through the robot, and physical discrepancies between a human hand and a robot hand [27, 9]. Because of these difficulties, the operator easily feels lost in the visual feedback of the work field and has a difficult time figuring out how to operate the control interface to achieve the desired robot motions. Moreover, this control process is often associated with high mental and physical workload. Ways to reduce teleoperation complexity are being actively investigated [24, 19, 33].
Increasing robots’ intelligence and autonomy level so that they can generate (semi-)autonomous behaviors and proactively assist in achieving the operator’s goal has demonstrated great advantages [6, 12, 18]. This shared control promotes the robot from a passive motion follower or executor to a collaborative partner that shares in the control of the physical components of the system. Shared control leverages both the human’s adaptability for decision making in dynamic, uncertain environments and the robot’s automation capability for accomplishing a task faster and more easily while decreasing the physical and mental demands on the human [22, 11, 5].
Shared control allocates the relative amount of control power between the operator and the robot based on a predefined arbitration policy, and defining that policy has always been one of the fundamental problems. A proper arbitration policy gives the proper amount of control power to each party at the correct time to maximize their advantages and minimize their disadvantages. Because of the lack of theoretical support, most arbitration policies have been defined by researchers based on subjective intuition, which has resulted in a great variety of policies, including conflicting policies and results.
Here, we attribute the conflicting policy principles and the variety of policies to the lack of comprehensive consideration of the multi-source uncertainty in the (semi-)autonomous robotic system. The two main types of uncertainty are uncertainty in human intent understanding and uncertainty in automation execution. The first type results from ambiguous human motion, a cluttered environment, and an imperfect intent inference algorithm, while the second type is a combination of sensing uncertainty, control uncertainty, and hardware uncertainty. Leaving this uncertainty unconsidered can result in misestimating the autonomous system’s capability of providing effective assistance, which leads to inappropriate control allocation between the human and the robot and, in turn, to task failures, decreased performance, and human resistance.
Effective shared control requires allocating the appropriate amount of control power to the human and the robot according to the various uncertainty conditions. For practical deployment of effective shared control, we model the multi-source uncertainty of various types and levels and investigate how those types of uncertainty influence the allocation process. In particular, our major contributions are:
A general uncertainty model that models the multi-source uncertainty at various magnitudes in a shared-control robotic system.
A general arbitration model that copes with multi-source uncertainty and leverages the strengths of human operators and robotic agents.
Two objective and quantitative metrics that evaluate a robotic agent’s helpfulness and friendliness at both micro and macro levels during human-robot shared control.
II Related Work
Shared control lies between manual control by human operators and fully autonomous control by a sufficiently intelligent robotic agent. Its introduction and popularity result from the great need to assist the human operator and reduce the complexity and difficulty of manually teleoperating a robotic platform. Effectively allocating the control power between the operator and the robotic agent has always been a key research question. Many researchers have followed a principle that assigns more control power to the robot when it is closer to the target [6, 1, 29]. This policy assumes that as the robot gets closer to the target, the likelihood that the robot is approaching the correct target increases. In this case, the researchers believe that giving more control power to the robot can relieve the operator’s control workload. The major reason for this principle is the uncertainty of the user’s approaching intent in the unstructured task. We refer to the policies that follow this principle as positive policies, as the robotic agent’s control power is positively related to the decreased distance to the inferred target. This principle has resulted in various policy profiles obtained by adjusting the power increase ratio, inflection points, or minimum and maximum control power [20, 31, 10]. Moreover, conflicting results about performance and preference have been reported by various researchers [21, 32, 13].
In contrast to these positive policies, some researchers have argued that the robotic agent’s control power should decrease as the robot approaches the target because of the unavoidable uncertainty in automation execution, which is a combination of sensing uncertainty, algorithm uncertainty, and hardware uncertainty. This principle questions the robotic system’s accuracy and suggests increased failure when the robot is close to the target. We refer to a policy that follows this principle as a negative policy, as the robot’s control power is negatively related to the decreased distance to the target. Little work has been reported about this negative policy, and the uncertainty modeling and arbitration modeling are still open questions.
Another open problem with these positive and negative policies is that they do not consider the various uncertainty levels. Different human intent inference mechanisms, work environments, and robotic systems could result in various levels of uncertainty. Existing work attempts to fit all uncertainty conditions with the same arbitration policy, which inevitably results in suboptimal performance outcomes.
Besides the great inconsistencies among arbitration policies, the subjectively defined arbitration policies have not been well evaluated. After an arbitration policy was implemented, only measures such as the task success rate, task completion time, and subjective surveys were available to evaluate it [26, 7, 23]. Those metrics measure a policy’s overall performance at the macro level. As stated earlier, an appropriate arbitration policy gives the appropriate amount of control power to each party at the correct time, and those macro metrics cannot reveal how a policy performs dynamically. As a result, after evaluating the policies, researchers cannot quantitatively explain how a given performance is achieved or which takeaways are useful for other researchers defining their own arbitration policies. Moreover, the success rate and completion time are performance-oriented rather than measures of how well the robotic agent cooperates with the human operator. Novel metrics that can quantitatively explain how a robotic agent cooperates with the human operator at the micro level would be beneficial in studying shared control policies.
This section introduces the formulation of the shared-control model in detail. There are three main modules:
human intent inference for the approaching target,
uncertainty modeling in the intent inference and robotic autonomy, and
formulation of the shared-control model.
III-A Complementary Intent Inference
Knowing the human intent is the prerequisite for providing timely and appropriate assistance. A multimodal intent inference method is developed here based on natural eye-hand cooperation, as human eyes lead the hands to the manipulation target during natural human manipulation. This multimodal method thus takes advantage of the complementary spatial-temporal information of the eye and hand modalities. Eye information gives the robotic agent earlier access to the manipulation target, but it can be spatially unstable because of the nature of eye movements. The hand motion can strongly imply whether an object is the manipulation target, but it has considerable temporal lag and can be ambiguous when passing by an irrelevant object. Fusing the two modalities allows the robotic agent to recognize the human intent earlier and with high confidence. Fig. 1 demonstrates the advantage of fusing the eye-hand modalities. If a single modality is used for the intent inference, the robotic agent may easily mistake or as the target. Fusing the two modalities enables the robotic agent to avoid this mistake completely or to recognize and correct it as early as possible.
III-A1 Intent Inference based on the Eye Modality
We formulate the human intent inference using the eye modality as (1), similar to our previous work [17, 15, 16], and is a set of accessible objects that could be the target and is a sequence of eye-gaze data measured since the eyes began dwelling. In other words, the robotic agent infers the human intent that maximizes the posterior probability given a sequence of eye-gaze data. The eye-gaze data include eye dwelling time, gaze speed, pupil dilation, and gaze concentration. Multiple eye-gaze features are used to reduce the influence of visual distractions and thus improve the accuracy. The eye-gaze data also pass through a sliding window filter to remove high-frequency involuntary eye movements.
III-A2 Intent Inference based on the Hand Modality
The human intent inference from the hand modality employs a trajectory-based inference method [6, 8]. As shown in Fig. 1, given the historical trajectory , from the starting point to the current location , the robotic agent maximizes the following posterior probability (2). The probability uses the principle of maximum entropy as the formula, where the probability of an object being the intent decreases exponentially as the cost of approaching it increases with the given trajectory.
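As a minimal sketch of the maximum-entropy idea described above, the following Python snippet scores each candidate object by the extra cost that the observed hand trajectory incurs relative to a direct approach, then normalizes the exponentiated scores. The cost function (Euclidean path length), the temperature `beta`, and all function names are assumptions made for illustration; they are not taken from (2) or from [6, 8].

```python
import math

def path_length(points):
    """Total Euclidean length of a polyline given as a list of coordinate tuples."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))

def hand_intent_posterior(trajectory, objects, beta=1.0):
    """Return a normalized score per object (uniform prior assumed).

    trajectory: list of points from the start to the current location.
    objects: dict mapping an object name to its position.
    beta: hypothetical temperature; larger values sharpen the distribution.
    """
    start, current = trajectory[0], trajectory[-1]
    travelled = path_length(trajectory)
    scores = {}
    for name, pos in objects.items():
        # Extra cost of reaching `pos` via the observed path versus going straight.
        extra_cost = travelled + math.dist(current, pos) - math.dist(start, pos)
        scores[name] = math.exp(-beta * extra_cost)
    total = sum(scores.values())
    return {name: s / total for name, s in scores.items()}

# A hand moving straight toward A makes A more probable than B.
posterior = hand_intent_posterior([(0.0, 0.0), (1.0, 0.0)],
                                  {"A": (2.0, 0.0), "B": (0.0, 2.0)})
```

With this scoring, an object that lies along the observed motion incurs no extra cost and receives the largest share of probability.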
III-A3 Inference Fusion
The two modalities are fused through a Bayesian inference approach (3). Given a sequence of eye-gaze data and the historical approaching trajectory, the intent inference maximizes the fused posterior probability. The inferred target is annotated as the nominal target . This nominal target could be the same as the true target but could also be different, as this complementary inference method, like any other inference method, can at best reduce the chance of mistaking a wrong target for the true one but cannot eliminate this mistake.
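One common way to realize such a Bayesian fusion is sketched below. It assumes that the eye and hand observations are conditionally independent given the target, so the fused posterior is proportional to the product of the per-modality posteriors divided by the prior. That independence assumption, the uniform default prior, and the function name are illustrative choices and may differ from the form of (3).

```python
def fuse_posteriors(p_eye, p_hand, prior=None):
    """Fuse two per-object posteriors and pick the nominal target.

    p_eye, p_hand: dicts mapping each object name to a posterior probability.
    prior: optional dict of prior probabilities (uniform if omitted).
    Assumes conditional independence of the two modalities given the target.
    """
    objects = list(p_eye)
    if prior is None:
        prior = {o: 1.0 / len(objects) for o in objects}
    unnormalized = {o: p_eye[o] * p_hand[o] / prior[o] for o in objects}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("modalities assign zero probability to every object")
    fused = {o: v / total for o, v in unnormalized.items()}
    nominal = max(fused, key=fused.get)  # object with the highest fused posterior
    return fused, nominal

fused, nominal = fuse_posteriors({"A": 0.6, "B": 0.4}, {"A": 0.3, "B": 0.7})
```

In this toy case the hand evidence outweighs the weaker eye evidence, so the nominal target is B.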
III-B Uncertainty Modeling
III-B1 Confidence under Intent Inference Uncertainty
Because of the uncertainty introduced by the cluttered environment, the human’s ambiguous motion, and the imperfect inference algorithm, mistaking a close-by object for the target is inevitable. The uncertainty from various sources propagates for individual modalities. We model the propagated uncertainty of the intent inference using the eye and hand modalities each as a three-dimensional (3D) Gaussian distribution (4)-(5), which describes the true target as distributed around the inferred target following a Gaussian distribution. This Gaussian distribution has the inferred target as the mean, and the variance is obtained by evaluating a modality’s historical inference performance as the mean of squared deviations from the inferred targets to the true targets. In the equations, and are variances of eye and hand modalities respectively, and and , are eye and hand modalities’ variances in three axes, respectively. We assume there is no correlation between any two axes in the distribution for problem simplification. The intent inferred through fusing the eye-hand modalities consequently has a Gaussian uncertainty distribution (6) with as the mean and , as the variances on three axes . , are computed as (7). The scaling factor is a constant computed upon , , and .
We assume the uncertainty of the inferred intent has the same variance, on three axes, and this can be achieved by re-evaluating the deviation between the inferred intent and the true intent on three axes. The uncertainty distribution function can then be rewritten as (8), and thus the probability of one object being the true target depends only on its distance to the inferred target and the pre-known distribution variance.
Following the same principle that the closer the robot gets to the inferred target, the more likely it is the true target, we define the intent confidence as a function of the distance to the inferred target, regulated by the uncertainty variance (9)-(10). is a constant threshold that defines the function range of shared control. Outside this range, there is too much uncertainty in the system and the robotic agent does not contribute to the control of the robot; within this range, the robotic agent shares the control with the human operator following the defined arbitration function. is a carefully defined regulation function of the and , and is a constant. Fig. 2 shows samples of the confidence in human intent at various uncertainty levels; the confidence gradually increases as the end effector approaches the target.
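The text states that the fused distribution is Gaussian up to a constant scaling factor, which is consistent with the standard product of two Gaussian densities. Assuming that reading (the exact form of (7) is not reproduced here), the per-axis fused mean and variance can be sketched as follows; the function and argument names are hypothetical.

```python
def fuse_gaussian_axis(mean_eye, var_eye, mean_hand, var_hand):
    """Per-axis product of two 1D Gaussians (standard identity).

    Returns the mean and variance of the Gaussian proportional to
    N(mean_eye, var_eye) * N(mean_hand, var_hand).
    Assumption: this matches the paper's fusion only if (7) follows
    the usual product-of-Gaussians rule.
    """
    var_sum = var_eye + var_hand
    fused_var = var_eye * var_hand / var_sum
    # Each mean is weighted by the other modality's variance.
    fused_mean = (mean_eye * var_hand + mean_hand * var_eye) / var_sum
    return fused_mean, fused_var

# Equal variances: the fused mean is the midpoint and the variance halves.
mean, var = fuse_gaussian_axis(0.0, 4.0, 2.0, 4.0)
```

Under this rule the fused variance is never larger than either input variance, which reflects the benefit of combining the two modalities.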
III-B2 Confidence under Autonomy Uncertainty
Intent uncertainty is associated with the question “which target,” while autonomy uncertainty concerns the question “where the robot should approach to reach that target.” This ubiquitous uncertainty comes mainly from the sensing accuracy of the target location and from hardware limitations or misalignment in reaching that location. Because of this uncertainty, the robotic agent faces potential failures when handling the task independently, and the higher the uncertainty, the lower the robotic agent’s confidence in handling the task. In addition, even at the same level of uncertainty, the failure chance increases and the confidence decreases as the end effector approaches the target. These changes characterize a robotic agent’s confidence in handling the task independently.
To mathematically model this correlation between the autonomy uncertainty and the robotic agent’s confidence, we assume the sensing uncertainty , and the hardware uncertainty , follow two 3D Gaussian distributions and , respectively (11)-(12). These two types of uncertainty can be estimated through a trial-and-error method. represents the distribution probability of target measure when the true target is at , and is the distribution’s variance matrix. represents the distribution probability of the end effector’s final location , when it attempts to approach , and is the distribution’s variance matrix. We also assume there is no correlation between any two axes in either distribution for problem simplification.
The distribution between and is (13), and this is the integral of the joint distribution of and . As there is no correlation between any two axes in or , the integral can be calculated along each axis separately (14)–(18). Thus, also follows a Gaussian distribution with as the mean and as the variance.
While the end effector approaches the target in an arbitrary manner, the probability of encountering the target at a distance can be related to the cumulative distribution function. This is the integral in infinite space, with a spherical hollow of radius . For simplification, we can re-evaluate the variances so that the three axes have the same variance . Thus, the encountering probability can be computed as (19) and (20). has a reversed sigmoid shape and is regulated by . In contrast, the failure probability of approaching can then be computed as .
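The two steps above can be illustrated numerically. The sketch below assumes that the sensing and hardware errors are independent, so their per-axis variances add, and it reads the "integral in infinite space, with a spherical hollow" as the probability mass of an isotropic 3D Gaussian lying outside a sphere of a given radius. Both readings are assumptions; the closed forms in (14)-(20) are not reproduced here, and all function names are hypothetical.

```python
import math

def combined_variance(var_sensing, var_hardware):
    """Per-axis variance of the sum of two independent Gaussian errors."""
    return var_sensing + var_hardware

def prob_within_radius(d, var):
    """P(||x|| <= d) for a zero-mean isotropic 3D Gaussian with per-axis variance var.

    This is the cumulative distribution function of the Maxwell distribution.
    """
    if d <= 0:
        return 0.0
    z = d / math.sqrt(var)
    return math.erf(z / math.sqrt(2)) - math.sqrt(2 / math.pi) * z * math.exp(-z * z / 2)

def prob_outside_radius(d, var):
    """Mass of the same Gaussian outside the sphere of radius d (the 'hollow' integral)."""
    return 1.0 - prob_within_radius(d, var)

var = combined_variance(0.5, 0.5)  # hypothetical per-axis variances
curve = [prob_outside_radius(d, var) for d in (0.0, 1.0, 2.0, 4.0)]
```

The resulting curve starts at 1 for zero distance and decays toward 0 as the radius grows, which matches the reversed sigmoid shape described in the text.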
We relate the robotic agent’s confidence in its autonomy to its failure probability, and we redefine this failure probability as (21) and (22). The is still the function range of the shared control, and is a constant selected to regulate the confidence decrease behavior. Fig. 3 demonstrates robotic confidence in autonomy as a function of the distance to the target. This new definition not only simplifies the relationship while retaining a certain degree of similarity to the original definition in (20) but also offers better controllability with the help of . The involvement of constrains the autonomy confidence to remain below when the end effector is within a distance to the target. For example, when the end effector is less than 2 cm away from the target, the autonomy confidence is less than 0.45 and gradually reduces to zero.
III-C Formulation of Shared Autonomy
The arbitration weight of the robotic agent can then be defined as (23) as a combination of the confidence in the intent inference and robotic autonomy. is a function of the distance to the inferred target and also regulated by the level of uncertainty in the gaze modality, hand modality, sensing, and robot hardware. The final motion command sent to the robot’s end-effector is a combination of the human motion input , and the robotic agent’s motion input (24).
Fig. 4 illustrates the effects of two types of confidence, and , on the arbitration weight assigned to the robotic agent. If the human operator’s weight is higher than the robotic agent’s, the human operator is dominant (likely in the region close to the start point). When both types of confidence are high, the robotic agent becomes dominant earlier and contributes more to the motion of the end effector. When either type of confidence is low, the robotic agent contributes less. We name this new arbitration model the bell-shaped policy because of its policy profile and for comparison with the positive and negative policies.
III-D Shared-Autonomy Framework
Based on the above discussion, an uncertainty-aware shared control framework can be developed as shown in Fig. 5. The multimodal intent inference module infers the human’s intended target by observing the human’s eye-hand movements. The robot trajectory planning module generates the assistive motion plans based on the inferred target and the robot’s current position. The autonomy uncertainty module evaluates the system’s confidence in independently handling the task based on the distance from the current robot position to the target position and the level of autonomy uncertainty. The confidence of assistance is calculated using and as in equations (9) and (21) and is fed into the control arbitration module, which dynamically allocates control power between the human and the autonomous robot in real time to regulate the robot’s action.
III-E Measures of Helpfulness and Friendliness
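To make the arbitration step concrete, the sketch below combines two confidence values into a robot weight and blends the human and robot motion inputs linearly. The product form of the weight, the linear blend, and the two placeholder confidence curves are assumptions for illustration only; they stand in for (23), (24), (9), and (21), whose exact forms are not reproduced here.

```python
def arbitration_weight(c_intent, c_autonomy):
    """Assumed combination: product of the two confidences, clipped to [0, 1]."""
    return max(0.0, min(1.0, c_intent * c_autonomy))

def blend(u_human, u_robot, alpha):
    """Linear blend of the motion inputs; alpha is the robot's share of control."""
    return tuple((1.0 - alpha) * h + alpha * r for h, r in zip(u_human, u_robot))

def placeholder_confidences(d, d_max):
    """Hypothetical linear stand-ins, NOT the paper's confidence functions."""
    c_intent = 1.0 - d / d_max   # grows as the end effector nears the target
    c_autonomy = d / d_max       # shrinks as the end effector nears the target
    return c_intent, c_autonomy

d_max = 1.0
weights = [arbitration_weight(*placeholder_confidences(d, d_max))
           for d in (0.0, 0.5, 1.0)]
command = blend((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), 0.25)
```

Even with these crude placeholders, the weight is zero at both ends of the range and peaks in between, because one confidence rises while the other falls as the distance shrinks. This is the qualitative reason a product of the two confidences yields a bell-shaped profile.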
Fig. 6 illustrates the definition of a robotic agent’s helpfulness and friendliness. Both the helpfulness and friendliness are defined at each timestep, and the average values over a complete trial can be calculated as an overall measure. At a certain timestep , the robot’s end effector is at , the true target is , and the robotic agent considers as the nominal target due to a combination of the intent uncertainty and autonomy uncertainty. The human operator attempts to approach the true target with a curved trajectory, and the instant motion input of the operator is , where is a unit vector indicating the motion direction and is the motion magnitude. In contrast, the robotic agent attempts to approach the nominal target in a straight line, and its instant motion input is , where is a unit vector indicating the motion direction and is the motion magnitude. is the vector pointing to the target from the current end-effector position, and it can be represented as , where is a unit direction vector pointing to the target and is the distance to the target.
The helpfulness , of the robotic agent is defined as the -weighted projection of onto , which is the weighted unit travel distance in the direction of while traveling (25). ranges from -1 to 1, where a positive measure means the end-effector moves closer to the true target with the motion assistance of the robotic agent, while a negative measure indicates the robotic agent interferes with the approach to the target. In extreme conditions when either or is zero, the robotic agent’s helpfulness is defined as zero since the end effector is not moving closer to or further from the target.
The friendliness , of the robotic agent is defined as the agreement between the human operator’s motion input and the final motion command to the end effector (26). It is the cosine of the angle between and . ranges from -1 to 1. A measure of 1 indicates the final motion is aligned with the human input, meaning the robotic agent compromises completely and is friendly to the human operator. In contrast, a measure of -1 indicates the final motion command is in the opposite direction from the operator’s input, where the robotic agent is arbitrary and unfriendly. When either or is zero, the robotic agent is defined to have a friendliness of -1. In addition, when both and are zero, the agent’s friendliness is 1.
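The two per-timestep measures can be sketched in a few lines. The helpfulness below is taken as the arbitration weight times the cosine between the robot's motion input and the direction to the true target, and the friendliness as the cosine between the human input and the final command, with the zero-vector cases handled as the text describes. The exact forms of (25) and (26) are not reproduced here, so the helpfulness formula in particular is an assumed reading, and all names are hypothetical.

```python
import math

def _norm(v):
    return math.sqrt(sum(c * c for c in v))

def _cosine(a, b):
    return sum(x * y for x, y in zip(a, b)) / (_norm(a) * _norm(b))

def helpfulness(u_robot, to_target, alpha):
    """Assumed form: alpha-weighted cosine between robot input and target direction.

    Returns 0 when the weight or the robot input is zero, as stated in the text.
    A zero target vector also returns 0 (an extra assumption to avoid dividing by zero).
    """
    if alpha == 0 or _norm(u_robot) == 0 or _norm(to_target) == 0:
        return 0.0
    return alpha * _cosine(u_robot, to_target)

def friendliness(u_human, u_final):
    """Cosine between the human input and the final command, with the stated edge cases."""
    human_zero = _norm(u_human) == 0
    final_zero = _norm(u_final) == 0
    if human_zero and final_zero:
        return 1.0   # both zero: defined as fully friendly
    if human_zero or final_zero:
        return -1.0  # exactly one zero: defined as unfriendly
    return _cosine(u_human, u_final)

h = helpfulness((1.0, 0.0, 0.0), (2.0, 0.0, 0.0), 0.5)
f = friendliness((1.0, 0.0, 0.0), (-1.0, 0.0, 0.0))
```

Averaging these values over all timesteps of a trial gives the overall measures mentioned in the text.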
IV Simulations and Experiments
This uncertainty-aware arbitration model is validated in both simulations and real experiments. The tasks are the same in both testing setups, where one user teleoperates a MICO robotic arm with a screwdriver in its hand to approach six bolts. The testing setups are shown in Fig. 7. This setup mimics a manufacturing scenario with teleoperated robots. Six bolts are arranged in two parallel rows at two different heights, and the rows are rotated so that the row direction does not lie along an axis of either the camera’s or the robot’s frame. The home position of the robotic arm and the positions of the bolts relative to the arm are the same in both setups to achieve comparable results. For comparison, one traditional positive policy and one negative policy were also implemented and tested.
IV-A Simulation Setup
In the simulation, the human input was synthesized to have a curved approaching trajectory to the target. The simulated human input at each time step was synthesized as in (27)-(28), and the prime symbol indicates it was a synthesized human input. ’s magnitude was defined by a constant , and its direction was obtained by rotating the direction vector, which pointed to the target from the current robot location, by an angle . was the rotation matrix. gradually reduced proportionally to the robot-to-target distance, , and the initial angle, , was initialized randomly following a normal distribution, . In the simulation, and took values of 20 and 10 respectively. In contrast to the curved trajectory, the robot agent attempted to go straight to the nominal target with its motion input (29). The robot input’s magnitude was equal to the smaller of the human input magnitude and the distance to the nominal target, . The nominal target deviated from the target because of human intent inference error and robot perception error (the two uncertainties). It was assumed that the input from the human had a constant speed, and the simulation was run at 20 Hz.
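The synthesized inputs described above can be sketched in a planar setting. The snippet assumes a 2D workspace (the paper's rotation axis in 3D is not stated in the text), an offset angle that scales linearly with the remaining distance, and hypothetical parameter names (`theta0`, `d0`, `speed`). It is an illustration of the idea, not a reproduction of (27)-(29).

```python
import math

def human_input(pos, target, theta0, d0, speed):
    """Constant-speed input rotated away from the target direction.

    theta0: initial offset angle in radians at the initial distance d0.
    The offset shrinks in proportion to the remaining distance, which
    produces a curved approach that straightens near the target.
    """
    dx, dy = target[0] - pos[0], target[1] - pos[1]
    d = math.hypot(dx, dy)
    if d == 0:
        return (0.0, 0.0)
    theta = theta0 * d / d0
    c, s = math.cos(theta), math.sin(theta)
    ux, uy = dx / d, dy / d
    # Apply a 2D rotation matrix to the unit direction, then scale by speed.
    return (speed * (c * ux - s * uy), speed * (s * ux + c * uy))

def robot_input(pos, nominal, human_speed):
    """Straight-line input toward the nominal target, capped by the remaining distance."""
    dx, dy = nominal[0] - pos[0], nominal[1] - pos[1]
    d = math.hypot(dx, dy)
    if d == 0:
        return (0.0, 0.0)
    magnitude = min(human_speed, d)
    return (magnitude * dx / d, magnitude * dy / d)

u_h = human_input((0.0, 0.0), (2.0, 0.0), theta0=0.3, d0=2.0, speed=1.0)
u_r = robot_input((0.0, 0.0), (0.5, 0.0), human_speed=1.0)
```

Capping the robot input by the remaining distance keeps the robot from overshooting the nominal target in a single step.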
Both the human intent inference uncertainty and the robot autonomy uncertainty were simulated at various levels. For the intent uncertainty, we assumed that while approaching a bolt target, there was a certain time period during which the robot considered another bolt to be the target and generated motion assistance toward the wrong target. After that period, the robot realized this mistake by re-inferring the intent with newly observed eye-hand data and started to assist in approaching the correct target. Thus, the effect of the intent uncertainty can be represented as a time period during which the robot treats a wrong object as the target. Six levels of intent uncertainty were simulated, reflected by the length of the time period spent approaching the wrong target. This time period ranged from 0 seconds to 10 seconds. For the autonomy uncertainty, an offset was added to the actual location of the bolts in a randomized direction. Although the offset’s direction was random, its magnitude was fixed at each level. Six levels of autonomy uncertainty were simulated, ranging from 0 cm to 10 cm. The intent uncertainty affected the robotic agent only during the defined short time period, while the autonomy uncertainty affected the robotic agent all the time. Six levels of intent uncertainty and six levels of autonomy uncertainty gave 36 uncertainty combinations.
Three arbitration policies (the proposed arbitration model and the traditional positive and negative policies) were tested with the same initial condition under the 36 uncertainty combinations. In one simulation trial, the human operator and the robotic agent cooperated to approach the six bolts as six independent approaching tasks. The inputs from the human operator and the robotic agent were blended using one of the three policies. Three simulation trials, one with each arbitration policy, comprised a set of simulation trials. The human initial offset angle, , was initialized first in a simulation set so that the three policies were tested with the same initial condition for a fair comparison. Approaching one bolt was successful if the robot’s end-effector was close enough to the target bolt’s head, and it was a failure when the end effector was stuck at the nominal target. The success rate and the completion time of successful runs were recorded. The friendliness and the helpfulness were calculated at each time step during the approaching process, and the mean value was used to summarize the whole approaching process. One hundred simulation sets were performed for statistical analysis, resulting in 600 approaching tasks for each policy under each uncertainty combination.
IV-B Experiment Setup
In the experiment, the human user sent the control command through a Geomagic Touch haptic joystick to approach three of the six bolts as one approaching trial. The robotic agent’s input still pointed to the nominal target and took a magnitude equal to the smaller of the human input magnitude and the distance to the nominal target. To reduce the difficulty of the task, the approach was performed only in the camera’s X-Y plane, and the depth control was left free. This reduced the task difficulty, as previous research concluded that control in the depth direction was the most difficult in teleoperation. To create an identical testing environment across all participants and the three arbitration policies, the intent inference uncertainty was simulated in the same way as in the simulation but with two uncertainty levels of 5 seconds and 10 seconds. The autonomy uncertainty was simulated in the same way as in the simulation with two levels of 1 cm and 3 cm offsets.
Each participant performed one approaching trial with each arbitration policy under each test setting. The testing order was randomized to minimize order effects. The success rate and completion time were recorded. The inputs of the operator and robotic agent were recorded to compute the friendliness and helpfulness. After each trial, a short questionnaire was provided to obtain the subjects’ positive and negative opinions on the assistance provided by the robotic agent. The questionnaire consisted of eight assessment statements about the robotic agent’s performance, and the subjects marked their level of agreement with each statement.
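The uncertainty grid and the fixed-magnitude random offset can be sketched as follows. The text gives only the ranges (0 to 10 seconds and 0 to 10 cm) and the number of levels, so the evenly spaced level values below are an assumption, as are the function name and the seed.

```python
import itertools
import math
import random

def random_offset(magnitude, rng):
    """3D offset with a uniformly random direction and a fixed magnitude.

    A normalized Gaussian sample gives a direction that is uniform on the sphere.
    """
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        n = math.sqrt(sum(c * c for c in v))
        if n > 0:
            return tuple(magnitude * c / n for c in v)

# Assumed evenly spaced levels; only the end points are stated in the text.
intent_delays_s = [0, 2, 4, 6, 8, 10]
autonomy_offsets_cm = [0, 2, 4, 6, 8, 10]
combinations = list(itertools.product(intent_delays_s, autonomy_offsets_cm))

rng = random.Random(0)  # seeded for repeatability
offset = random_offset(3.0, rng)
```

Each pair in `combinations` corresponds to one uncertainty condition, and the sampled offset would be added to a bolt's true position to obtain the location the robot perceives.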
V-A Results of the Simulations
Four measures were taken from the simulation tests: the task completion time, task success rate, and the robotic agent’s helpfulness and friendliness. The task completion time was counted only for the successful trials, but the helpfulness and friendliness were computed for all trials. Three of these measures (completion time, helpfulness, and friendliness) are presented with boxplots to display their distributions. The exact values of completion time, friendliness, and helpfulness are not displayed in the plots, as they are subject to the pre-set human input and the trend of those measures across uncertainty conditions is more meaningful. Similarly, the exact uncertainty settings are abstracted to six uncertainty levels, Level 0 (L0) to Level 5 (L5). Level 0 means there is no uncertainty, and Level 5 is the highest uncertainty. The Mann-Whitney U test was performed on the measures to statistically compare the positive and negative policies to the bell-shaped policy. Two significance levels were used, with p < 0.001 as high significance (solid dots) and p < 0.01 as moderate significance (circles). No statistical significance is denoted with a cross.
V-A1 Success Rate and Completion Time
Fig. 8 summarizes the task completion times and task success rates of each policy under each uncertainty condition. If a policy achieved a success rate of less than 20%, its completion time distribution is discarded, as there were insufficient data for analysis and discussion, and if a policy achieved a success rate of 100%, the success rate is not displayed to keep the plot concise. Otherwise, the success rates are displayed as percentages under the distribution plots.
The success rates of the three arbitration policies varied greatly with uncertainty conditions; however, the bell-shaped policy had the best success rate in all conditions. When no autonomy uncertainty was present and the intent uncertainty was not high (less than L5), all three policies achieved a success rate of 100%. The negative and bell-shaped policies continued to have a success rate of 100% when the autonomy uncertainty increased from L1 to L5. However, the positive policy left the robot’s end-effector stuck at the nominal target because of the autonomy uncertainty and had success rates lower than 20%. The L5 intent uncertainty undermined all three policies’ success rates, but the bell-shaped policy was consistently more successful than the other policies. Moreover, at L5 intent uncertainty, the success rates of the negative and bell-shaped policies increased as the autonomy uncertainty increased. This could be because of the mutual effects of the two types of uncertainty.
The completion times of the three policies increased gradually as the uncertainty levels increased. The bell-shaped policy was more efficient overall. The negative policy and the bell-shaped policy are mainly compared here, as the positive policy either had the same completion time or had too low a success rate when autonomy uncertainty was absent or present, respectively. When both types of uncertainty were low, the negative policy accomplished the task more quickly, and the difference was statistically significant. However, the negative policy’s advantage became smaller as the uncertainty in either intent or autonomy increased. The completion time of the negative policy became comparable with that of the bell-shaped policy and eventually exceeded it. In 22 out of 36 uncertainty conditions, the bell-shaped policy’s completion time was shorter. It was also noted that both policies’ completion times had larger increases when the intent uncertainty rose from L3 to L4 and from L4 to L5.
Fig. 9 summarizes the helpfulness of the robotic agent in various testing conditions with three arbitration policies. Generally, the negative policy had the highest helpfulness across all uncertainty conditions, and the bell-shaped policy had the lowest helpfulness in most of the conditions.
When no autonomy uncertainty was present, the positive policy and the bell-shaped policy had the same helpfulness, which was significantly lower than that of the negative policy. As autonomy uncertainty was added, the helpfulness of the three policies became lower. Meanwhile, the difference between the negative policy and the bell-shaped policy decreased, and overlaps between their distributions appeared and grew larger. When both uncertainties were low (less than L2), the helpfulness of the bell-shaped and positive policies was not statistically different. However, the helpfulness of the bell-shaped policy became lower than that of the positive policy as the uncertainties increased. It was also noted that the autonomy uncertainty greatly increased the distribution variance of the helpfulness.
Fig. 10 summarizes the friendliness of the robotic agent in various uncertainty conditions with three arbitration policies. Across all the uncertainty conditions, the bell-shaped policy had the highest friendliness, and its friendliness was statistically different from that of the other policies. When the uncertainties were low, the friendliness advantage of the bell-shaped policy was small, but it grew larger when the intent uncertainty was high.
When no autonomy uncertainty was present, the positive policy and the bell-shaped policy had the same friendliness. Their friendliness was slightly higher than that of the negative policy, and this difference was statistically significant. The presence of autonomy uncertainty caused a large drop in the positive policy’s friendliness; however, further increases in autonomy uncertainty led only to mild decreases. Moreover, the friendliness decreases of the negative and bell-shaped policies were hardly observable when only the autonomy uncertainty was increased at lower levels of intent uncertainty (L4 or lower). As a result, the positive policy was significantly less friendly than the other policies. All three policies’ friendliness also declined more sharply when the intent uncertainty rose from L3 to L4 and from L4 to L5.
V-B Results of the Experiments
Data from 370 approaching trials were collected from ten participants in the experiments. Four objective measures and one subjective questionnaire were recorded. The four testing conditions are denoted by the combination of uncertainty levels (for example, LL indicates low intent and low autonomy uncertainty, and LH indicates low intent but high autonomy uncertainty). The Mann-Whitney U test was performed on the measures to compare the positive and negative policies separately against the bell-shaped policy. Two significance levels were selected, with p < 0.01 as high significance (solid dots) and p < 0.05 as moderate significance (circles).
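The pairwise comparison described above can be illustrated with a minimal sketch. It assumes a two-sided Mann-Whitney U test computed with the normal approximation and tie correction (no continuity correction), which may differ from the exact procedure or software used in the experiments. The function names and the sample completion times are hypothetical and are not data from this study.

```python
import math

def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U, p). This is an illustrative implementation, not the
    authors' analysis code.
    """
    n1, n2 = len(x), len(y)
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    tie_term = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = avg_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    r1 = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u1 = r1 - n1 * (n1 + 1) / 2.0
    u = min(u1, n1 * n2 - u1)
    n = n1 + n2
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u, 1.0
    z = (u - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p-value
    return u, min(p, 1.0)

def significance_marker(p):
    """Map a p-value to the two plotting markers described in the text."""
    if p < 0.01:
        return "solid dot"   # high significance
    if p < 0.05:
        return "circle"      # moderate significance
    return "none"

# Hypothetical completion times in seconds (made up for illustration).
bell_times = [1.0, 2.0, 3.0, 4.0, 5.0]
negative_times = [6.0, 7.0, 8.0, 9.0, 10.0]
u_stat, p_value = mann_whitney_u(bell_times, negative_times)
marker = significance_marker(p_value)
```

With small samples, an exact test is usually preferable to the normal approximation; the approximation is used here only to keep the sketch dependency-free.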
V-B1 Success Rate and Completion Time
Fig. 11 summarizes the task completion times and task success rates of the experiments. The success rates are displayed as percentages under the corresponding boxplot, and the time median is displayed inside the box.
It is evident that the bell-shaped policy was the most effective arbitration policy across all uncertainty conditions. Overall, the bell-shaped policy had the highest success rate, which was much higher than that of the positive policy and, on average, 6% higher than that of the negative policy. In contrast, the positive policy failed frequently during the experiment, and in two settings its success rate was as low as 37%. Even though the successful trials with the positive policy had a very low completion time, it is still reasonable to conclude that the positive policy functioned worst in the experiment. The success rate of the negative policy was close to that of the bell-shaped policy, but its completion time was much longer, and the difference was statistically significant.
The bell-shaped policy functioned stably and well in all four uncertainty conditions, as its accuracy and completion time had small variances. This suggests that the bell-shaped policy was robust to variations in uncertainty. In contrast, both the positive and negative policies were affected by the uncertainty changes. The positive policy had a lower success rate when the autonomy uncertainty was high, while the negative policy was sensitive to increases in intent uncertainty.
Fig. 12 summarizes the helpfulness of the robotic agent in the experiments when the three arbitration policies were applied in various uncertainty conditions. It shows that the negative policy had the highest helpfulness in three uncertainty conditions; however, the differences were not statistically significant in most uncertainty conditions. The helpfulness of the positive and bell-shaped policies was close and not statistically different.
When comparing the three policies’ helpfulness distributions in various conditions (the portion between the 25th percentile and the 75th percentile), the bell-shaped policy’s helpfulness was relatively stable, while the negative policy’s varied the most. The negative policy’s helpfulness also decreased as either type of uncertainty increased, and reached its lowest value when both types of uncertainty were high. In addition, the data suggested that the positive policy’s helpfulness was higher when the autonomy uncertainty was low.
Fig. 13 summarizes the friendliness of the robotic agent in the experiments when the three arbitration policies were applied in various uncertainty conditions. It shows that the bell-shaped policy was the most friendly across all uncertainty conditions, and its friendliness advantage was apparent and statistically significant. In contrast, the negative policy was the least friendly.
The distribution of the bell-shaped policy’s friendliness did not change much with varying uncertainty conditions, which suggests its robustness in terms of friendliness. The positive policy’s friendliness decreased when the autonomy uncertainty was increased. Interestingly, the positive policy’s friendliness increased when the intent uncertainty was higher, and the negative policy’s friendliness increased slightly when the autonomy uncertainty was higher.
Fig. 14 summarizes the subjective evaluation of the robotic agent when the three arbitration policies were applied in various uncertainty conditions. The cumulative scores from the positive and negative assessment portions are plotted separately; the higher the score, the better the subjective assessment. It is apparent that the bell-shaped policy had the highest scores, and its score distribution was moderately different from that of the negative policy and significantly different from that of the positive policy. In addition, the positive policy had the lowest score in all the uncertainty conditions.
The score distributions of the bell-shaped policy were similar across the uncertainty conditions, which suggests that the bell-shaped policy functioned consistently and consistently received positive evaluations from the participants. The positive policy had consistently low scores in three uncertainty conditions, and its score in the condition of high intent and high autonomy uncertainty was the lowest. The negative policy appeared to be evaluated higher when the autonomy uncertainty was low.
VI-A Policy Illustration
Fig. (a)-(d) demonstrate how each arbitration policy worked and affected the motion of the end effector using a mini 2D simulation. The true target, the nominal target, and the robot’s end effector all lay on one plane, within which the end effector moved. The end effector started from the origin and approached a target 200 mm away on the Y-axis. The human inputs were synthesized to have a curved trajectory, as with the robotic agent in the simulation tests. This mini simulation considered only the autonomy uncertainty, which was simulated in the same way as in the simulations and experiments, and the uncertainty was set at a moderate level to show distinguishable differences between the policies. The trajectories of the end effector resulting from the blended human and robotic agent inputs are shown in Fig. (a); the three arbitration policies resulted in different motion trajectories. With the positive policy, the approaching task failed, with the end effector stuck at the nominal target. In contrast, the negative and bell-shaped policies successfully accomplished the task, and the negative policy (8.1 s) had a shorter completion time than the bell-shaped policy (8.8 s). Along the three trajectories, nine points were taken, at which the end effector was at the starting location (diamond markers), 110 mm away from the target (circle markers), and 50 mm away from the target (square markers), respectively. At those locations, the directions of the human input motion, the robot input motion, the final motion input to the end effector, and the direction pointing to the target from the current position of the end effector are shown in Fig. (b) in different colors. Fig. (c) and (d) plot the robotic agent’s helpfulness and friendliness, respectively, and the average helpfulness and friendliness are shown in the legends.
When the end effector was at the start location, the motion inputs of the human and the robotic agent were the same across the three policies. However, because the arbitration policies differed, different arbitration weights were assigned to the robotic agent, and the final motions of the end effector were different. The positive and bell-shaped policies both assigned a weight of 0 to the robotic agent; thus the final motion command was along the human input. Consequently, the robotic agent was completely friendly to the human operator, as it compromised completely and provided no help. In contrast, the negative policy placed the final motion command along the robotic agent’s input, which resulted in helpfulness near 1, as the intersection angle between the final motion and the direction to the target was near 0. However, due to the large offset between the human input and the final motion, the robotic agent’s friendliness was around 0.5.
While approaching the target, the arbitration weights varied with the changing spatial relationship between the target and the end effector, which resulted in varying helpfulness and friendliness. From the start point to the point 110 mm away, the positive and bell-shaped policies assigned more control power to the robotic agent. Consequently, the robotic agent became more helpful, as it was dragging the end effector closer to the target even though it was heading toward the nominal target. Meanwhile, the end effector followed the human operator less closely, and friendliness gradually decreased. In contrast, the robotic agent’s helpfulness under the negative policy decreased, but its friendliness increased, because the robotic agent received a smaller arbitration weight and compromised more to the human operator.
When the end effector was 50 mm away from the target, it followed the robotic agent completely toward the nominal target under the positive policy. As the end effector drifted far from the target, the helpfulness and friendliness of the robotic agent both started to decrease rapidly. In contrast, the negative and bell-shaped policies had returned a majority of the control power to the human operator at this close range. The robotic agent’s helpfulness decreased because its arbitration weight was reduced and because, at this close range, motion toward the nominal target contributed less to moving closer to the target. In addition, the robotic agent became more friendly as it compromised more.
When the end effector was far from the target, moving toward the nominal target also contributed greatly to approaching the target. For this reason, the negative policy had the shortest completion time and higher helpfulness. However, introducing intent uncertainty could weaken this advantage or even turn it into a disadvantage of the negative policy, since the robotic agent could drag the end effector somewhere else.
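The behavior illustrated above can be reproduced qualitatively with a minimal sketch. It assumes linear blending of unit-direction inputs, a straight-line human input (not the curved one used in the illustration), and three hypothetical weight curves driven by the distance to the nominal target. The target offset, speed, tolerance, and curve shapes are illustrative assumptions and are not the parameters used in this paper.

```python
import math

TRUE_TARGET = (0.0, 200.0)      # mm, the operator's real goal
NOMINAL_TARGET = (25.0, 190.0)  # mm, hypothetical target perceived by the agent
START_DIST = 200.0              # mm, distance used to normalize the weight

def unit(vx, vy):
    """Return the unit vector, or (0, 0) for a zero-length input."""
    n = math.hypot(vx, vy)
    return (0.0, 0.0) if n < 1e-9 else (vx / n, vy / n)

def weight(policy, dist):
    """Hypothetical arbitration weight given to the robotic agent."""
    s = min(max(dist / START_DIST, 0.0), 1.0)  # 1 far away, 0 at the target
    if policy == "positive":   # more robot control when closer
        return 1.0 - s
    if policy == "negative":   # more robot control when farther
        return s
    if policy == "bell":       # most robot control at mid range
        return math.exp(-((s - 0.5) ** 2) / (2 * 0.15 ** 2))
    raise ValueError(policy)

def simulate(policy, speed=25.0, dt=0.05, max_time=30.0, tol=2.0):
    """Blend human and robot directions until success or timeout."""
    x, y, t = 0.0, 0.0, 0.0
    while t < max_time:
        if math.hypot(TRUE_TARGET[0] - x, TRUE_TARGET[1] - y) <= tol:
            return True, t
        hx, hy = unit(TRUE_TARGET[0] - x, TRUE_TARGET[1] - y)        # human input
        rx, ry = unit(NOMINAL_TARGET[0] - x, NOMINAL_TARGET[1] - y)  # robot input
        a = weight(policy, math.hypot(NOMINAL_TARGET[0] - x, NOMINAL_TARGET[1] - y))
        x += speed * dt * ((1.0 - a) * hx + a * rx)
        y += speed * dt * ((1.0 - a) * hy + a * ry)
        t += dt
    return False, t

results = {p: simulate(p) for p in ("positive", "negative", "bell")}
```

Under these assumptions the positive policy is captured near the nominal target and times out, while the other two policies reach the true target. The completion times depend on the assumed parameters and should not be compared with the values reported above.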
VI-B Policy Evaluation
The three arbitration policies are comprehensively evaluated by combining their results in simulations and experiments. Firstly, the bell-shaped policy had the best success rate among the policies across the various uncertainty conditions, and it also resulted in a shorter completion time. The negative policy achieved a comparable but lower success rate than the bell-shaped policy, and its completion time was much longer. The positive policy resulted in the worst success rates, as it was highly sensitive to the autonomy uncertainty. Secondly, with the bell-shaped policy, the robotic agent was less helpful yet more friendly. In contrast, with the negative policy, the robotic agent provided the most help but was less friendly. With the positive policy, the robotic agent’s helpfulness was lower than with the negative policy and comparable with that of the bell-shaped policy, and its friendliness was lower than with the bell-shaped policy. Together, these two measures indicate how intrusive a robotic agent is in helping the human operator. Thirdly, the bell-shaped policy had the highest subjective scores, followed by the negative policy and then the positive policy. This subjective score was the operator’s comprehensive evaluation of the combination of success rate, completion time, helpfulness, and friendliness.
In summary, the bell-shaped policy evidently arbitrates control power between the human operator and the robotic agent more effectively, leading to better cooperation and better performance. The bell-shaped policy achieved higher success rates and shorter completion times regardless of the uncertainty variations in both simulations and experiments, which demonstrates its robustness in coping with system uncertainty. Moreover, the bell-shaped policy was subjectively preferred by the human operators. The bell-shaped policy had lower helpfulness and higher friendliness, which indicates that it regulated the robotic agent’s help to be less intrusive yet effective.
VI-C Friendliness and Helpfulness
The newly developed friendliness and helpfulness metrics were validated as quantitative and objective measures of how friendly an assistive robotic agent was to the human operator and how effective it was in helping the human operator accomplish the task. The robotic agent’s friendliness and helpfulness in simulations and experiments shared many characteristics, which is strong evidence that the metric definitions were reasonable and valid. Compared with the existing success rate and task completion time, which measure overall performance (defined as the macro level in this paper), friendliness and helpfulness can be either microscopic or macroscopic, measuring a robotic agent’s behavior at each timestep or over an entire trial. Moreover, these two metrics are cooperation-oriented, quantifying how well the two agents cooperate, whereas success rate and completion time are performance-oriented. The two metrics enable researchers to explain explicitly how an arbitration policy affects the subjective and objective outcomes, and thus provide novel insights into the robotic agent and the shared-control paradigm.
One example of the new findings concerns the belief, held by many researchers, that it is better for the robot to provide more help in the collaboration. The helpfulness measures from the simulations and experiments showed that this is not always valid. The robotic agent using the negative policy had the highest helpfulness but achieved lower success rates and took longer to accomplish the tasks than with the less helpful bell-shaped policy. The robotic agent using the negative policy was also less friendly. High helpfulness meant that the robotic agent was strongly pulling the robot’s end effector toward its perceived target; however, this could result in competition for control of the end effector, as the robotic agent was following a different trajectory plan that was unnatural for the human operator. This finding motivates reconsidering how a robotic agent should provide its assistance in order to achieve better performance.
Even though the helpfulness and friendliness metrics revealed valuable insights into the robotic agent’s assistance, their definitions could be further improved. The current definitions do not take the magnitudes of the motion inputs into consideration. The input magnitude is certainly critical, but how it should be incorporated effectively needs more investigation. The two metrics’ values at extreme conditions were defined discretely, which raises questions about their legibility. Both metrics may also need to be scaled to increase their separating capability. Currently, both metrics mostly range from 0 to 1, as shown in Fig. (d). In addition, even though the negative policy aggressively dragged the end effector onto a different path while the bell-shaped policy matched the operator’s desired trajectory better, the friendliness difference between the bell-shaped policy and the negative policy was often numerically small. For example, the bell-shaped policy’s trajectory was much closer to the human’s than the negative policy’s in Fig. (a), but their friendliness difference was only 0.1. Increasing the separating capability would make the policies’ measures more distinct and facilitate their comparison.
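To make the limitation concrete, the following sketch uses a hypothetical direction-only score, 1 minus the angle between two vectors divided by pi, as a stand-in for the metrics. This mapping is an assumption chosen only because it yields 1 for aligned directions and 0.5 for perpendicular ones; it is not the definition used in this paper, and the discretely defined extreme cases are deliberately left out.

```python
import math

def direction_agreement(a, b):
    """Hypothetical score in [0, 1] based only on the angle between a and b."""
    na, nb = math.hypot(*a), math.hypot(*b)
    if na == 0.0 or nb == 0.0:
        # Extreme cases (stuck, standstill) are not modeled in this sketch.
        raise ValueError("undefined for zero-length input in this sketch")
    cos_t = (a[0] * b[0] + a[1] * b[1]) / (na * nb)
    cos_t = max(-1.0, min(1.0, cos_t))  # guard against rounding error
    return 1.0 - math.acos(cos_t) / math.pi

def trial_scores(final_motions, human_inputs, target_dirs):
    """Per-timestep (micro) scores and their trial averages (macro)."""
    helpful = [direction_agreement(f, t) for f, t in zip(final_motions, target_dirs)]
    friendly = [direction_agreement(f, h) for f, h in zip(final_motions, human_inputs)]
    return helpful, friendly, sum(helpful) / len(helpful), sum(friendly) / len(friendly)

# Magnitudes are ignored: a strong and a weak input along the same
# direction receive the same score.
strong = direction_agreement((0.0, 5.0), (0.0, 1.0))
weak = direction_agreement((0.0, 0.1), (0.0, 1.0))
perpendicular = direction_agreement((1.0, 0.0), (0.0, 1.0))
```

Because the score depends only on direction, any magnitude-aware refinement would have to add a separate term or weighting, which is one way to read the open question raised above.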
VI-D Simulations and Experiments
The simulations and experiments were complementary in evaluating the three policies. Simulations enabled more extensive tests of the three policies with less effort to reveal each policy’s characteristics in various uncertainty conditions, and real experiments were essential to verify the simulation results and the policies’ practical effectiveness, given the great difficulty of simulating a real physical environment and human intuition. Even though the experiments were not at the same scale as the simulations, the high consistency between the simulation results and the experiment results gave us high confidence in the effectiveness of the bell-shaped policy and the findings derived from the results.
There were three minor inconsistencies between the simulation results and the experiment results, which did not undermine the derived conclusions but deserve further investigation. The first was the positive policy’s success rate. In the experiments, the positive policy still had the lowest success rates, but they were not as low as in the simulations. The second was the statistical difference between the policies’ helpfulness. In the simulations, the bell-shaped policy’s helpfulness was statistically different from that of the others, whereas the experiment results matched the simulation results only in general and did not show the same statistical significance. The third was that, in the experiments, the positive policy’s friendliness was higher than the negative policy’s. Regarding the first inconsistency, the human operators’ intuition could have greatly contributed to the success of the positive policy, as they managed to avoid the nominal target and reach the true target. Regarding the second, the randomness in the human operators’ approaching trajectories could have reduced the margins between the three policies’ helpfulness distributions. Regarding the third, the extra-long stuck time before ending the simulation and the greater number of standstill moments in the experiments were the major reasons. In the simulation of the positive policy, the human operator would attempt to escape from the stuck condition before the trial was declared a failure, and this attempt time was set longer than necessary. During this stuck time, the friendliness kept measuring -1 according to the definition, as shown by the friendliness trail in Fig. (b), and this extra-long trail lowered the positive policy’s friendliness. In the experiments, however, the experimenter terminated the approaching trial shortly after the end effector became stuck. In addition, there were more standstill moments when using the positive policy in the experiments, and during a standstill moment the robotic agent’s friendliness was continuously measured as 1.
Fine-tuning the simulation termination criteria and implementation and refining the friendliness definition are desirable for future investigation.
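The effect of the termination criterion on the trial-average friendliness can be seen with simple arithmetic. The sketch below assumes the trial average is a plain mean over timesteps, uses the values -1 (stuck) and 1 (standstill) stated above, and uses made-up step counts and a made-up moving-phase score of 0.5; none of these numbers come from the study.

```python
def mean_friendliness(moving_scores, n_stuck=0, n_standstill=0,
                      stuck_value=-1.0, standstill_value=1.0):
    """Trial-average friendliness with extra stuck or standstill timesteps."""
    samples = (list(moving_scores)
               + [stuck_value] * n_stuck
               + [standstill_value] * n_standstill)
    return sum(samples) / len(samples)

moving = [0.5] * 100  # hypothetical per-step friendliness while moving

short_stuck = mean_friendliness(moving, n_stuck=20)         # early termination
long_stuck = mean_friendliness(moving, n_stuck=200)         # long escape attempt
with_standstill = mean_friendliness(moving, n_standstill=100)
```

In this example the same moving phase averages 0.25 with a short stuck period, -0.5 with a long one, and 0.75 when standstill steps are added, which illustrates why the termination criterion and the standstill handling matter for comparing simulation and experiment.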
A common limitation of both the simulations and the experiments was that the task was simplified for the human operator. Operating a robot can be a very difficult task for the human operator due to disembodiment and physical discrepancy, and this is the main reason why robotic assistance is desirable. However, this difficulty was not simulated in the simulations and was greatly simplified in the experiments. It would be necessary to test the bell-shaped and negative policies in an experimental setup closer to the practical application.
In this paper, we investigated the arbitration relationship between a human operator and a robotic agent in shared-control teleoperation. We believe that the lack of consideration of the multiple types of uncertainty in the human-robot system was one reason for the great inconsistency among arbitration policies. To fill this gap, we modeled the multi-source uncertainty from the human intent inference process and the robotic automation system. Different types of uncertainty affect the control arbitration differently. We then developed an arbitration model that comprehensively fused the uncertainty and regulated the control arbitration. The developed uncertainty model was based on a 3D Gaussian distribution, which is general enough to easily incorporate more types of uncertainty. The arbitration model is likewise general and extendable to other types of uncertainty. The arbitration model was then evaluated in simulations and experiments in comparison with the existing arbitration policies. The new arbitration model outperformed or performed equivalently to the current policies in all the uncertainty combinations across all the measures. In addition, we developed helpfulness and friendliness as two new objective and quantitative metrics to reveal how well a robotic agent cooperated with the human operator under an arbitration policy and to explain how the policy functioned and dynamically influenced the motion commands at the micro level. The two new metrics can better analyze arbitration policies to uncover their limitations or strengths. With the work in this paper, we expect to advance shared control in teleoperation toward practical deployment.
This material is based on work supported by the US NSF under grant 1652454. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect those of the National Science Foundation.