I Introduction
We consider scheduling a human operator who processes a sequence of tasks of similar difficulty over a fixed duration. The performance of a human operator is closely related to his/her workload. The Yerkes-Dodson law [1] states that human operators perform worse when their workload is too high or too low. We use the utilization ratio to keep track of the operator's past workload. The utilization ratio increases when the operator processes a task, and decreases when the operator idles (rests). To keep the performance of the human operator high, we enforce explicit constraints in the form of minimum and maximum allowable utilization ratios.
This problem is intimately related to a general class of problems in communication and information theory. In the communication and information theoretic treatment of certain modern applications, the channel can no longer be modeled as static or i.i.d. over time. In such applications, the characteristics of the communication channel change as a function of its past utilization. Examples include channels that die [2], channels that heat up [3, 4, 5], channels that wear out over time [6], binary symmetric channels where the crossover probability changes over time due to usage [7], channels that get biased over time [8], and queuing systems where the service quality of the queue depends on the queue length [9]. In particular, we will see a remarkable similarity between scheduling a human operator subject to utilization ratio constraints in this paper and scheduling a communication channel subject to temperature constraints in [4, 5].
When the operator processes a task, the system receives a certain reward (utility). We model the utility function, , as a monotonically increasing concave function of the processing time. Examples of such utility functions are observed, for instance, in speed-accuracy tradeoff (SAT) studies [10], where the utility function is modeled as an exponential growth to a saturation point, as ; in rate-distortion [11, Eqns. (1), (3), (4)], where the time required to achieve an outcome with a certain distortion under a fixed energy is ; and in many scenarios where more time spent on a task results in diminishing (sublinear) returns over time. For example, a runner can make considerable improvement at the initial stages of training, a student can quickly answer the easier parts of the questions, and a monitor can quickly determine the general area where a target is. However, in each of these examples, achieving a higher running performance, solving the difficult parts of the questions, or detecting the target with more precision requires much more time, and the returns become sublinear. Another simple such sublinear function is .
Our work is most closely related to [12, 13, 14, 15]. In [15], the authors consider sigmoid functions for the utility function, whereas here, we consider monotone increasing concave functions. In addition, [15] imposes minimum processing times for the tasks, prioritizes the tasks, and considers the case where some tasks are mandatory. In our paper, there is no minimum time allocation for the tasks, and all the tasks are identical in importance and difficulty. Thus, our model can be viewed as a simplified version of [15]. Our goal for this simplification is to obtain general and structural results, as we discuss next.
In this paper, we consider a scheduling problem for a single human operator who performs tasks similar in difficulty over a given fixed time. The number of tasks and the total duration are known a priori. The structural solution for this problem consists of two major subpolicies: In the first subpolicy, the operator starts processing tasks, and continues to process tasks without idling until he/she reaches the allowable upper bound for his/her utilization ratio. In the second subpolicy, which starts after the operator reaches the allowable upper bound for the utilization ratio, the operator must idle (rest) during each task. We show that the operator should allocate equal time for each task he/she performs in the same subpolicy. However, the times allocated for the tasks performed during different subpolicies may be different even though the tasks are identical in difficulty. We note that the structure of the utilization ratio here is similar to the evolution of the temperature in the case of single energy arrival in [4].
II System Model and the Problem
We consider a system where a human operator processes tasks over a duration of units of time, see Fig. 1. We model the utilization ratio, , where , as [15],
(1) 
where if the operator is working at time , and if the operator is idling at time , and (which is denoted as in [15]) is a constant that depends on the resistance of the operator to the workload. (Footnote 1: Note the similarity between the utilization-workload equation in (1) and the temperature-power equation in [4, Eqn. (3)]. In particular, (1) here is the same as [4, Eqn. (3)] when , and in [4, Eqn. (3)], with the mapping of utilization ratio () and workload () here to temperature () and transmit power (), respectively, in [4].) Here, increases when the operator is working, and decreases when the operator is idling. After resting for and working for , evolves as,
(2) 
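The dynamics in (1) and the rest-then-work update in (2) can be illustrated numerically. The following is a minimal sketch, not the paper's own formulation: it assumes a first-order model of the form du/dt = (w(t) - u(t))/tau, with w = 1 while working and w = 0 while idling, which is consistent with the mapping to the temperature dynamics described above. The names `rest`, `work`, `rest_then_work`, `simulate_euler`, and the symbol `tau` are hypothetical.

```python
import math


def rest(u, r, tau):
    """Utilization after idling for r time units (w = 0): decays toward 0."""
    return u * math.exp(-r / tau)


def work(u, p, tau):
    """Utilization after working for p time units (w = 1): rises toward 1."""
    return 1.0 - (1.0 - u) * math.exp(-p / tau)


def rest_then_work(u, r, p, tau):
    """Closed-form update for one task: idle for r, then process for p."""
    return work(rest(u, r, tau), p, tau)


def simulate_euler(u, r, p, tau, dt=1e-4):
    """Forward-Euler integration of the assumed ODE, used as a cross-check."""
    for duration, w in ((r, 0.0), (p, 1.0)):
        steps = int(round(duration / dt))
        for _ in range(steps):
            u += dt * (w - u) / tau
    return u
```

Under this assumption, the closed-form update and the numerical integration should agree up to the discretization error, and the utilization ratio is monotone within each idling or processing period.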
According to the Yerkes-Dodson law [1], the performance of the operator will be worse if the utilization ratio is too low or too high. Therefore, we aim to keep between a prespecified minimum, , and maximum, . For each task , the operator works (processes the task) for seconds and rests (idles) for seconds. Without loss of generality, we assume that the operator idles first (if any) before processing a task. We denote by the utilization ratio right after the operator finishes processing task . We denote by the utilization ratio right before the operator starts processing task , i.e., right after the operator finishes resting (if any) for task ; see the top part of Fig. 1. Thus,
(3)  
(4) 
Due to the monotonicity of during processing and idling periods, if the initial utilization ratio is between and , it suffices to check the utilization ratio only at the ends of idling and processing times, i.e., at and , to make sure that it is between and at all times. The reward acquired from task is . Thus, we formulate the problem,
s.t.  
(5) 
which we solve in the rest of this paper.
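The constraints of the problem can be checked for any candidate schedule by tracking the utilization ratio only at the ends of the idling and processing periods. The sketch below is illustrative and rests on assumptions that are not taken from the paper: the first-order dynamics du/dt = (w - u)/tau, the caller-supplied concave `utility`, and the hypothetical name `evaluate_schedule`.

```python
import math


def evaluate_schedule(rests, works, u0, u_min, u_max, total_time, tau, utility):
    """Return (feasible, reward) for per-task rest and processing times.

    Assumes first-order utilization dynamics (an assumption of this sketch).
    The lower bound is checked after each idling period and the upper bound
    after each processing period, which suffices by monotonicity when u0 is
    already within the bounds.
    """
    eps = 1e-12
    feasible = (u_min - eps <= u0 <= u_max + eps)
    feasible = feasible and (sum(rests) + sum(works) <= total_time + eps)
    u = u0
    for r, p in zip(rests, works):
        u = u * math.exp(-r / tau)                # end of idling
        feasible = feasible and (u >= u_min - eps)
        u = 1.0 - (1.0 - u) * math.exp(-p / tau)  # end of processing
        feasible = feasible and (u <= u_max + eps)
    reward = sum(utility(p) for p in works)
    return feasible, reward
```

For instance, `evaluate_schedule([0, 0], [1, 1], 0.2, 0.1, 0.9, 2.0, 5.0, math.log1p)` evaluates a two-task schedule with no idling under an assumed logarithmic utility.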
III Structure of the Optimal Solution
In this section, we identify some important properties of the optimal solution for the problem given in (5). First, the following lemma states that, in the optimal solution, if the total time is not completely utilized, then the operator must have hit the minimum and maximum allowable utilization ratios for every task by resting and working as much as possible.
Lemma 1
In the optimal policy, if , then and , for all .
Proof: We prove this by contradiction. Assume that and one of the following cases is true: i) and , ii) and , or iii) and . Consider case i). Since the total time constraint is inactive, we can increase without violating any other constraints, and then, increase the corresponding . The resulting new policy gives strictly higher reward. In this case, we can increase the reward until either , or and . Thus, if , then cannot be optimal. In case ii), we can increase and so that the policy is still feasible and gives higher reward. We can continue to increase until either , or and . Thus, if , then cannot be optimal. In case iii), we can apply the procedure in ii) first to make , which will bring the setting to the case in i), and we can apply the process in i) next.
Therefore, in the remainder, we focus on the case where the allowed time is completely utilized. Then, at time , the utilization ratio will either reach its maximum allowed value or not. The following lemma identifies the optimal solution when the utilization ratio does not reach at .
Lemma 2
In the optimal policy, when : if , then and , for all .
Proof: We prove this by contradiction. Assume that and for some . Choose the maximum task index, , such that . Since the operator idles before processing a task, the operator completes the remaining tasks after idling for , and without idling for the rest of the tasks. Since , we can decrease and increase the processing times of the remaining tasks. The new policy is still feasible and gives a larger reward. We continue to apply this process until either or . If and , then we choose the next highest task index such that and apply the same procedure. At the end, either or if , then , for all . Thus, if at the end , we have all in this case, and the transition from to can be expressed as,
(6) 
Note that in this case, transition of utilization ratio is independent of the time allocated to each task. Since the reward function is a symmetric sum of concave functions, allocating equal amount of time for each task gives the highest reward. Thus, for all is optimal, if .
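The last step of this argument can be checked numerically: when there is no idling, the final utilization ratio depends only on the total processing time, and an equal split maximizes a symmetric sum of concave rewards. The sketch below is a hedged illustration under assumed first-order dynamics and an assumed utility 1 - exp(-p); neither choice is taken from the paper, and all function names are hypothetical.

```python
import math
import random


def utility(p):
    """Assumed increasing concave utility (a hypothetical choice)."""
    return 1.0 - math.exp(-p)


def total_reward(times):
    """Symmetric sum of the per-task utilities."""
    return sum(utility(p) for p in times)


def final_utilization(u0, times, tau):
    """Utilization after processing all tasks back to back without idling."""
    u = u0
    for p in times:
        u = 1.0 - (1.0 - u) * math.exp(-p / tau)
    return u


def random_split(total, n, rng):
    """Split `total` into n nonnegative parts using sorted uniform cut points."""
    cuts = sorted(rng.uniform(0.0, total) for _ in range(n - 1))
    edges = [0.0] + cuts + [total]
    return [b - a for a, b in zip(edges, edges[1:])]
```

Comparing `total_reward(random_split(T, n, rng))` against `total_reward([T / n] * n)` for several random splits shows that the equal split is never worse, while `final_utilization` is the same for both up to floating-point error.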
Thus, in the remainder, we focus on the case where the allowed time is completely utilized and the utilization ratio at the end reaches , that is . The following lemma states that, in this case, if the operator does not reach for a task, then he/she should not idle for that task.
Lemma 3
In the optimal policy, when and : for any given task , if , then .
Proof: We prove this by contradiction. Assume that there is an optimal policy such that there exists an index where and . From Lemma 2, we know that if for some , then . Thus, is satisfied at least once at . Let be the largest such that and , and choose the smallest such that and . We know that exists since satisfies the condition. Then, we construct a new feasible policy such that the difference of is decreased by and the difference of is increased by , by decreasing the resting time of task . We denote the new policy with primes. The original and new policies are shown in Fig. 2. Then,
(7)  
(8) 
where is the resting time for the th task in the new policy, and , . Since is increased by , is also changed to be where . Then,
(9)  
(10) 
Since , we have which implies . Thus, we can decrease the time for idling by an amount of in the new policy, and utilize the extra time for the processing times of the task(s) in between and . Thus, the new policy will give strictly higher reward. We can continue to apply this procedure until either or . If , then we determine a new among the remaining tasks with the highest index such that and . Then, we apply the same procedure: If , we choose the smallest task index with and . We continue to apply this procedure until for all such that .
Lemma 3 implies that, for a given , if , then , i.e., if the operator rests during processing a task , then he/she must reach at the end of processing that task. This also implies that once the operator reaches for the first time, since he/she needs to rest to continue, the utilization ratio should hit the upper bound after processing each and every task after this point on. The next lemma states that the operator should allocate equal amount of time for each task he/she processes after reaching the maximal allowable utilization ratio once.
Lemma 4
After the point where the utilization ratio reaches for the first time (or the point where processing another task would increase the utilization ratio beyond ), the operator spends equal amount of time for processing each remaining task.
Proof: After reaching , the operator needs to idle in order to process another task. From Lemma 3, we know that once the operator idles, his/her utilization ratio needs to reach again. Consider tasks and where and are both nonzero. Assume for contradiction that . Without loss of generality, assume , which also implies . Then, we have . Consider a new policy where , , , and . We can choose them in such a way that . Let , , , and denote the utilization ratios of tasks and for the original and new policies such that and ; see Fig. 3. Then, for task ,
(11)  
(12)  
(13) 
Similarly, for task ,
(14)  
(15)  
(16) 
Since , we have . Thus, due to . Also, . Note that the total task processing time is increased and the difference between time allocations is decreased. This new policy will give strictly larger utility due to the monotonicity and concavity of the utility function . Thus, we reached a contradiction and cannot be optimal.
IV The Optimal Solution
The optimal solution is composed of two major policies: Policy 1, where the operator processes tasks without idling until either he/she reaches for the first time or processing another task would force him/her to exceed the allowed so he/she needs to stop without reaching ; and policy 2, which starts either when the operator reaches for the first time, or when processing another task would force him/her to exceed . After reaching for the first time, the operator alternates between resting (idling) and processing tasks in equal amounts. We define as the number of tasks processed in policy 1. We define and to be the processing times for tasks processed in policy 1 and policy 2, respectively. We define to be the idling time right before the operator reaches for the first time, and to be the idling time after the operator reaches . Note that might not always exist. Next, we describe the optimal solution in terms of these.
When the operator starts with an initial utilization ratio , there are two options: either is high enough that the operator needs to rest before beginning to process any tasks (an example of this is given in Fig. 4), or is small enough that, from Lemma 3, the operator processes some number of tasks without idling until he/she reaches . If is sufficiently small, then the operator can process all of the tasks without any idling as described in Lemma 2. In this special case, which means that all tasks are processed in policy 1 and and for all tasks. An example of this particular case is shown in Fig. 5. In the case when is high, the optimal policy is: , , , and , . Note that for this special case which means that all the tasks are processed in policy 2. An example of this particular case is shown in Fig. 4.
The two cases described above are special cases where all the tasks are processed either in policy 1 or in policy 2. In general, some of the tasks are processed in policy 1 and the remaining tasks are processed in policy 2. These cases correspond to . For this, there are two possibilities: In the first possibility, the operator can reach for the first time without idling. An example of this is shown in Fig. 6. In this case the optimal policy is: , , , and and , . Note that there is no in this case. In the second possibility, the operator will need to rest just before he/she reaches for the first time. An example of this is shown in Fig. 7. In this case, the optimal solution is: , , and , and and , . Note that we can determine from . Thus, in general, in order to completely characterize the optimal solution, we need to solve for , and . In the following lemma, we further characterize and .
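In policy 2, each task starts and ends at the maximum allowable utilization ratio, so the idling time before a task is tied to its processing time. The sketch below illustrates this coupling under an assumption that is not taken from the paper, namely first-order dynamics du/dt = (w - u)/tau. Under that assumption, resting for r and then working for p returns the utilization to u_max exactly when u_max * exp(-r/tau) = 1 - (1 - u_max) * exp(p/tau). The name `policy2_rest_time` is hypothetical.

```python
import math


def policy2_rest_time(p, u_max, u_min, tau):
    """Rest time r such that idling r and then working p maps u_max to u_max.

    Assumes first-order utilization dynamics (an assumption of this sketch).
    Returns None when the required pre-task utilization would be nonpositive
    or would fall below u_min, i.e., when p is too long for a single cycle.
    """
    u_before = 1.0 - (1.0 - u_max) * math.exp(p / tau)
    if u_before <= 0.0 or u_before < u_min:
        return None
    return -tau * math.log(u_before / u_max)
```

Given a common processing time for the policy 2 tasks, this yields the matching common idling time, or signals that the chosen processing time cannot be sustained within the utilization bounds.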
V Numerical Results
In this section, we give simple numerical examples for the optimal solution. In the first example, we take , , , and . The optimal policy for this case is to process all the tasks without idling. This example corresponds to the special case described in Lemma 2, where . Therefore, the optimal policy is to allocate and , for all .
In the second example, we take , , , and . The optimal solution is and , , . The evolution of in this case is as in Fig. 6, where there is no .
In the third example, we take , , , and . The optimal solution is and , , . The evolution of in this case is as in Fig. 7, where there is an . In the second and third examples, we observe that .
References
 [1] R. Yerkes and J. Dodson. The relation of strength of stimulus to rapidity of habit-formation. Journal of Comparative Neurology and Psychology, 18(5):459–482, November 1908.
 [2] L. R. Varshney, S. K. Mitter, and V. K. Goyal. Channels that die. In Allerton Conference, September 2009.
 [3] T. Koch, A. Lapidoth, and P. Sotiriadis. Channels that heat up. IEEE Transactions on Information Theory, 55(8):3594–3612, July 2009.
 [4] O. Ozel, S. Ulukus, and P. Grover. Energy harvesting transmitters that heat up: Throughput maximization under temperature constraints. IEEE Transactions on Wireless Communications, 15(8):5440–5452, April 2016.
 [5] A. Baknina, O. Ozel, and S. Ulukus. Energy harvesting communications under explicit and implicit temperature constraints. IEEE Transactions on Wireless Communications, 16(10):6680–6692, October 2018.
 [6] T. Wu, L. Varshney, and V. Tan. Communication over a channel that wears out. In IEEE International Symposium on Information Theory, June 2017.

 [7] D. Ward and N. Martins. Optimal remote estimation over use-dependent packet-drop channels. IFAC-PapersOnLine, 49(22):127–132, November 2016.
 [8] D. Ward, N. Martins, and B. Sadler. Optimal remote estimation over action dependent switching channels: Managing workload and bias of a human operator. In American Control Conference, July 2015.
 [9] A. Chatterjee, D. Seo, and L. Varshney. Capacity of systems with queue-length dependent service quality. IEEE Transactions on Information Theory, 63(6):3950–3963, March 2017.
 [10] N. Mulligan and E. Hirshman. Speed-accuracy tradeoffs and the dual process model of recognition memory. Journal of Comparative Neurology and Psychology, 34(1):1–18, 1995.
 [11] A. Arafa and S. Ulukus. Near optimal online distortion minimization for energy harvesting nodes. In IEEE International Symposium on Information Theory, June 2017.
 [12] M. Donohue and C. Langbort. Tasking human agents: A sigmoidal utility maximization approach for target identification in mixed teams of humans and UAVs. In AIAA Guidance, Navigation, and Control Conference, August 2009.
 [13] K. Savla and E. Frazzoli. A dynamical queue approach to intelligent task management for human operators. Proceedings of the IEEE, 100(3):672–686, November 2012.
 [14] V. Srivastava, R. Carli, C. Langbort, and F. Bullo. Task release control for decision making queues. In American Control Conference, June 2011.
 [15] V. Srivastava, A. Surana, and F. Bullo. Adaptive attention allocation in human-robot systems. In American Control Conference, June 2012.