I Introduction
The drive for higher computational power has established multicore platforms as a compelling solution, first in general-purpose computing and now also in the embedded real-time systems arena. Rather than relying on increases in single-processor throughput, the multicore paradigm offers the ability to perform a greater number of simultaneous computations, but it also raises a new challenge: system designers must exploit the hardware facilities and use parallel algorithms in order to complete computationally demanding tasks within a predefined time window. This implies a subtle difference in the way schedulability conditions are posed, since parts of the workload of the same task are allowed to execute concurrently; each such task is referred to as a parallel or Directed Acyclic Graph (DAG) task. This paper presents a framework to address this issue for fully preemptive global fixed task priority (GFTP) schedulers and homogeneous multicores, in which all cores have the same computing capabilities and are interchangeable. It is worth mentioning that GFTP schedulers are commonly adopted and supported out of the box by several industry-grade real-time operating systems, such as VxWorks [11].
Related Work
Valuable works such as [5, 12, 1, 9, 4] addressed the scheduling problem of DAG tasks upon homogeneous multicores. Saifullah et al. [12] presented a method to decompose a generic DAG task into a set of virtual sequential tasks; after the decomposition, the popular global earliest deadline first (GEDF) density-based schedulability test is applied. Andersson and Niz [1] presented an analysis for GEDF in which an upperbound on the workload that each task may execute in a given time window is computed. Nevertheless, this upperbound is computed for a special case of DAG tasks, namely the "fork-join" tasks. For such a task: (i) the parallel workloads have the same execution requirement; (ii) they are spawned after a common point; and (iii) they join again at a common point. Note that while a fork-join task is executing a section of its workload in parallel, no further forks can occur. Chwa et al. [5] provided a method to compute the interference that each task would suffer in a system of so-called "synchronous parallel" tasks, where each task is composed of multiple, potentially contiguous regions of parallel workloads with distinct parallelism levels. In more than one respect, DAG tasks cover a broader area, as they allow parallel workloads to have distinct execution requirements and each node to have a different immediate predecessor. Previous works using fixed task priority schedulers exist, but in a partitioned environment, i.e., tasks are assigned to cores at design time and no migration is allowed at runtime [6, 8]. For example, Lakshmanan et al. [8] presented a basic form of parallel tasks, namely "Gang tasks", in which all the parallel workloads have to be scheduled simultaneously on the processing platform.
This Research
In this paper, we present a sufficient schedulability test applicable to constrained deadline DAG tasks (see Section II for a formal definition) scheduled by using a GFTP scheduler on a homogeneous multicore platform.
II System Model
Task specifications
We consider a taskset composed of sporadic tasks. Each sporadic task is characterized by a DAG, a relative deadline and a minimum timespan (also called period) between two consecutive activations. These parameters are given with the following interpretation. The nodes of the DAG (also called subjobs in the literature) stand for a vector of execution requirements at each activation of the task, and the edges represent dependencies between the nodes. Each node has a given execution requirement. A direct edge from a node u to a node v implies that the execution of v cannot start unless that of u has completed. In this case, u is called a parent of v, while v is its child. For each node, we consider the set of all its children and the set of all its parents. If neither of two nodes is an ancestor of the other, then they may execute concurrently; in this case, we state that each node is concurrent to the other. A node without a parent is called an entry node, while a node without a child is called an exit node. We assume that a node can start executing only after all its parents have completed. For brevity's sake, we consider only DAG tasks with a single entry node and a single exit node. For each task, we assume the relative deadline is no larger than the period, which is commonly referred to as the constrained deadline task model. Figure 1 illustrates a DAG task. Note that the analysis presented in this paper is easily tunable to tasks with multiple entry and exit nodes. The total execution requirement of a task is the sum of the execution requirements of all its nodes. The taskset is said to be schedulable if the scheduler can schedule it such that all the nodes of every task meet its deadline.
Definition 1 (Critical path).
A critical path of a DAG task is a directed path that has the maximum total execution requirement among all paths in the DAG.
Definition 2 (Critical path length).
The critical path length of a DAG task is the sum of the execution requirements of the nodes belonging to a critical path of the DAG.
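The critical path length of Definition 2 can be computed in linear time by a longest-path pass over a topological order of the DAG. The following sketch uses an illustrative flat representation (a list of node execution requirements plus an edge list):

```python
def critical_path_length(wcet, edges):
    """Critical path length (Definition 2): the maximum, over all directed
    paths in the DAG, of the summed execution requirements of the path's
    nodes. wcet[v] is the execution requirement of node v; edges is a list
    of (u, v) pairs meaning v cannot start before u completes."""
    n = len(wcet)
    children = [[] for _ in range(n)]
    indeg = [0] * n
    for u, v in edges:
        children[u].append(v)
        indeg[v] += 1
    best_in = [0] * n    # best_in[v]: heaviest path ending just before v
    longest = [0] * n    # longest[v]: heaviest path ending at v, inclusive
    stack = [v for v in range(n) if indeg[v] == 0]   # entry nodes
    while stack:                                     # Kahn's topological order
        u = stack.pop()
        longest[u] = best_in[u] + wcet[u]
        for v in children[u]:
            best_in[v] = max(best_in[v], longest[u])
            indeg[v] -= 1
            if indeg[v] == 0:
                stack.append(v)
    return max(longest)
```

On the diamond DAG with node requirements 1, 2, 3, 1 (entry, two parallel nodes, exit), the critical path runs through the heavier parallel node, for a length of 5.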
Platform and scheduler specifications
We consider a platform consisting of identical unit-capacity cores, and a fully preemptive GFTP scheduler. That is: (i) a priority is assigned to each task at system design-time and then, at runtime, every node inherits the priority of the task it belongs to; (ii) different nodes of the same task may execute upon different cores; and finally (iii) a preempted node may resume execution upon the same or a different core, at no cost or penalty. We assume that each node may execute on at most one core at any time instant and that the lower the index of a task, the higher its priority.
III Timing Analysis and Self-Interference Extraction
Intrinsically, some nodes of a given DAG task may prevent other nodes of the same task from executing. This constitutes a form of self-interference. A DAG may be viewed as a set of paths, where each path represents a set of sequential nodes connected to one another via edges, i.e., from the viewpoint of any node of a path, the other nodes of that path are either children or parents. The complementary set of a path contains all the nodes of the DAG that do not belong to it. Note that the nodes in this complementary set are not necessarily concurrent to all the nodes of the path.
Let us consider the set of all partial paths connecting two given nodes, and a specific path among them. Each path has, by definition, a worst-case execution requirement, which is computed by summing up the execution requirements of all its nodes. Note that the set of partial paths also admits a critical path, defined as in Definition 1, i.e., the partial path with the largest execution requirement between the two nodes. It is fairly straightforward that if either of the two nodes is not part of the end-to-end critical path, then the critical partial path between them is not contained in the end-to-end critical path either. We can now quantify the maximum self-interference that a task may generate on a given subset of its nodes.
Let the worst-case response time of a task denote the largest response time over all its activations, where the response time of an activation is the timespan between its release and its workload completion. Towards the computation of an upperbound on this worst-case response time, there are some important intermediate results we must establish.
Definition 3 (Partial worst-case response time).
The partial worst-case response time of a set of partial paths connecting two nodes is the largest timespan between the release time of the first node and the completion time of the second node.
Lemma 1 (Critical Self-Interference Path).
Considering only self-interference, the partial path which leads to the worst-case response time is the critical partial path among all partial paths connecting the two nodes.
Proof (by contradiction).
Initially, assume that the worst-case response time is obtained for some other partial path. Baker and Cirinei [2] provided an upperbound on the interference of a Liu & Layland (LL) task on a multicore platform (in the LL model, each task generates a potentially infinite sequence of jobs and is characterized by a 3-tuple consisting of the worst-case execution time of each job, the relative deadline, and the minimum inter-arrival time between two consecutive jobs). In this work, we extend this result to compute, in the same manner, the interference that concurrent nodes induce on a partial path (see Eq. 1).
(1) 
Let us assume that, for some other partial path, we have:
(2) 
Then it follows that:
(3) 
Since
(4) 
By substituting Eq. (4) into Eq. (3), Eq. (2) leads us to:
(5) 
which contradicts the initial assumption. The lemma follows. ∎
Informally speaking, Lemma 1 states that, for any non-parallel pair of fringe nodes, an upperbound on the response time of the partial path between them is obtained by considering the critical partial path connecting them. At the same time, the nodes which do not belong to this path are assumed to induce the maximum interference over it. As this is proven for any pair of nodes, the result also holds for the entry and exit nodes.
Now we focus on deriving the critical path. For every node, we consider its earliest and latest release times. Note that these quantities can be computed through a breadth-first [10] traversal of the DAG. Starting from the entry node, the earliest release time of any node without any interference can be computed as follows.
(6)  
(7) 
where the minimum execution requirement of each node is used. In the same manner, a breadth-first traversal starting from the exit node provides the latest release time of each node as follows.
(8)  
(9)  
(10) 
Eq. (7) and Eq. (10) clearly represent a lowerbound and an upperbound on the best-case and worst-case start times of a node, respectively. This can be observed in the following two scenarios: (i) the node does not suffer any external interference and all its parents request their minimum execution requirements, which yields the earliest release time; (ii) the node suffers the maximum possible external interference and its parents request their maximum execution requirements, which yields the latest release time. With these equations, we can derive the worst-case response time of the task in isolation. To do so, we compute the critical path length of the DAG without explicitly referring to the release times, as the two problems can be addressed separately. From Eq. (8) and (9), and by starting from the exit node, the critical path length is obtained as follows.
(11)  
(12)  
(13) 
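The forward pass behind Eq. (6)-(7) — a node is released as soon as its slowest parent, running for its minimum execution requirement and with no interference, has completed — can be sketched as follows. The representation (a list of minimum execution requirements plus an edge list) and the function name are illustrative:

```python
from collections import deque

def earliest_release_times(min_exec, edges):
    """Breadth-first forward pass over the DAG: with no interference, a node
    is released once every parent has completed its minimum execution
    requirement; entry nodes are released at time 0 (the idea behind
    Eq. (6)-(7)). min_exec[v]: minimum execution requirement of node v;
    edges: list of (u, v) precedence pairs."""
    n = len(min_exec)
    children = [[] for _ in range(n)]
    indeg = [0] * n
    for u, v in edges:
        children[u].append(v)
        indeg[v] += 1
    release = [0] * n
    q = deque(v for v in range(n) if indeg[v] == 0)   # entry nodes
    while q:
        u = q.popleft()
        finish = release[u] + min_exec[u]             # earliest completion of u
        for v in children[u]:
            release[v] = max(release[v], finish)      # wait for the slowest parent
            indeg[v] -= 1
            if indeg[v] == 0:
                q.append(v)
    return release
```

The backward pass of Eq. (8)-(10) follows the symmetric scheme, traversing from the exit node and accounting for maximum execution requirements and maximum interference.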
For any path, the execution requirement of the nodes outside it is obtained by subtracting its length from the total execution requirement of the task. From Lemma 1, an upperbound on the response time of a task, including only its self-interference, is thus given by:
(14) 
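Self-interference bounds of this kind classically combine the critical path length with the remaining workload's fair share of the cores. The sketch below shows that usual shape; it is an illustration only, and the paper's Eq. (14) may differ in its exact terms:

```python
def self_interference_response_bound(total_wcet, crit_path_len, m):
    """Sketch of a classic self-interference response-time bound for a DAG
    task running alone on m identical cores: the critical path executes
    sequentially, and each of the remaining (total_wcet - crit_path_len)
    execution units can delay it by at most 1/m of a time unit.
    Illustrative only; not necessarily the exact form of Eq. (14)."""
    return crit_path_len + (total_wcet - crit_path_len) / m
```

For example, a task with total execution requirement 7 and critical path length 5 on two cores gets a bound of 5 + (7 - 5)/2 = 6 time units.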
IV Upperbound on the Interference and Schedulability Condition
In this section, we provide an upperbound on the interference of any task and we derive a sufficient schedulability condition. To this end, we distinguish between two scenarios: (i) the scenario where the task does not suffer any interference from higher priority tasks, and (ii) the scenario where it suffers the maximum possible interference. For brevity's sake, we assume at this stage that all tasks have carry-in, and will relax this assumption in Section V.
Regarding Scenario (i), we recall that the earliest release time is a lowerbound on the release time of each node. This leads to an upperbound function on the workload request of each node at any time (see Figure 3), defined as follows.
(15) 
Since the workload request of the task is the sum of the workload requests of all its nodes, an upperbound on the workload request of the task at any time (see Figure 4) is defined as follows.
(16) 
where and .
Regarding Scenario (ii), each node is assigned to a core at most a bounded number of time units after the task is released, and each node is released at most a bounded number of time units after its parent nodes have started execution. This leads to a lowerbound function on the workload request of each node at any time (see Figure 3), defined as follows.
(17) 
A lowerbound on the workload request of the task at any time (see Figure 4) is thus defined as follows.
(18) 
where .
Eq. (16) and Eq. (18) can be used to obtain an upperbound on the workload request of a task in a time window of a given length. To this end, we consider that an activation of the task occurs some offset prior to the beginning of the targeted window. Then, two situations can increase the workload request of the task in the window: (i) at the beginning of the window, the task suffers the maximum possible interference and its nodes are released as late as possible; (ii) at the end of the window, the task does not suffer any interference and its nodes are released as early as possible.
Lemma 2 (Upperbound on the Workload of a Task with Carry-in).
Assuming the task has carry-in, an upperbound on its workload request in a window of a given length is given by:
(19) 
Proof Sketch.
We consider an activation of the task occurring some time before the targeted window. The worst-case scenario for the task is when it is prevented from executing on any core by higher priority tasks in the initial interval. Let us assume this worst-case scenario, and let us further assume that all nodes are released at the beginning of the window but one specific node is released earlier. Since the lowerbound function underestimates the workload request of the task at any time, the workload executed after the beginning of the window, when the node is released at the beginning of the window, is greater than or equal to the workload request when the node is released earlier. Hence, on the left border of the window (i.e., at its beginning), if the nodes are assumed to be released as late as possible, then the workload request in the window is maximized. On the right border of the window (i.e., at its end), we assume the earliest release time of all the nodes but one specific node. By applying the same logic, it follows that the workload request in the window is again maximized, since the nodes are assumed to be released as early as possible and the upperbound function overestimates the workload request of the task at any time.
Now, consider the maximum number of parallel nodes in the task. We recall that the sum of the workload requests of all the nodes is a piecewise linear function whose segments have bounded first derivatives. In order to compute the maximum workload request of each task in an interval of a given length, we must evaluate the workload request over all windows of that length, for every possible offset. Since, on the one hand, the first derivative of the upperbound function (resp. of the lowerbound function) is clearly periodic with the task period (see Fig. 4), and on the other hand, the next activation of the task occurs only one period later, it is not necessary to check offsets beyond one period, as there is no extra workload after it by construction. The lemma follows. ∎
In order to obtain the solution of Eq. (19), instead of exhaustively testing all the values of the offset in the continuous interval, we derive hereafter the finite set of offsets which maximizes it.
As previously mentioned, both the upperbound and the lowerbound workload functions are piecewise linear. Hence, the set of points at which the first derivative of the lowerbound function increases and the set of points at which the first derivative of the upperbound function decreases should be considered at the left and at the right border of the targeted window, respectively. The points in these sets maximize the workload request in the window. For each node, the first derivative of its lowerbound function can increase only at finitely many points, and the first derivative of its upperbound function can decrease only at finitely many points. Therefore, the finite set of candidate offsets can be defined as follows.
(20) 
The computation of the maximum workload request of each task makes it easy to assess an upperbound on the interference it induces on the workload of the lower priority tasks in any given time window. It has been proven in [2] that every unit of execution of an LL task can contribute at most 1/m time units of interference on the workload request of any other LL task with a lower priority, where m is the number of cores. Thus, an upperbound on the interference suffered by a task in a window of a given size is provided as:
(21) 
where the summation ranges over the set of tasks with a higher priority than the task under analysis. A sufficient schedulability condition for a taskset is derived from Eq. (21) as follows.
Theorem 1 (Sufficient schedulability condition).
A taskset is schedulable on a homogeneous multicore platform using a GFTP scheduler if:
(22) 
where the response time is computed by the following fixed-point algorithm.
Note: This iterative algorithm stops as soon as the response time converges at some iteration, or exceeds the deadline. In the latter case, the taskset is deemed not schedulable.
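The fixed-point iteration behind such conditions can be sketched generically as follows. The function `interference` is a placeholder for the paper's upperbound of Eq. (21), and `self_bound` for the self-interference bound; both names and the exact update rule are assumptions, not the paper's exact algorithm:

```python
def fixed_point_response_time(self_bound, deadline, interference):
    """Generic fixed-point response-time iteration: start from the task's
    bound in isolation and repeatedly re-evaluate the interference a window
    of the current length may admit, stopping at a fixed point or once the
    deadline is exceeded. interference(L): assumed callable returning the
    interference bound for a window of length L (stand-in for Eq. (21)).
    Returns the converged bound, or None if the task is deemed unschedulable."""
    r = self_bound
    while True:
        r_next = self_bound + interference(r)
        if r_next > deadline:
            return None          # window outgrew the deadline: give up
        if r_next == r:
            return r             # fixed point reached
        r = r_next
```

With integer-valued, monotone interference bounds the iteration either converges or crosses the deadline, which matches the stopping rule stated in the note above.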
V Reduction of the Number of Tasks with Carry-in
Rather than considering that each task has carry-in, as in Section IV, the intuitive idea of this section consists of reducing the number of tasks with carry-in to at most (m-1) tasks, where m is the number of cores. Since it is usually the case that m-1 is smaller than the number of tasks, we thus obtain a tighter upperbound on the interference that each task may suffer at runtime, and finally a better schedulability condition for each task. To accomplish this, let us first recall some fundamental results regarding the "Liu & Layland (LL) task model".
Upperbound on the workload request of an LL task without carry-in. Let us consider an LL task with no pending workload at the beginning of a window of a given length. An upperbound on its workload request in this window is recalled as follows (see [3, 7]):
(23) 
Upperbound on the workload request of an LL task with carry-in. Let us consider an LL task with some pending workload at the beginning of a window of a given length. An upperbound on its workload request in this window is recalled as follows (see [3, 7]):
(24) 
Extra workload request of an LL task. The difference between the upperbounds, with and without carry-in, of an LL task in a window of a given length is thus recalled as:
(25) 
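The no-carry-in bound has a well-known closed form (complete jobs plus a partial last job). For the carry-in case, the sketch below deliberately uses a coarse variant, in which a pending job adds at most one full execution budget; the bounds recalled from [3, 7] are tighter, refining the carry-in term with the task's response time:

```python
def workload_no_carryin(C, T, L):
    """Eq. (23)-style bound for an LL task (WCET C, period T) with no
    pending workload at the window start: complete jobs plus a partial
    last job in a window of length L."""
    return (L // T) * C + min(C, L % T)

def workload_carryin_coarse(C, T, L):
    """Coarse stand-in for Eq. (24): a pending (carry-in) job contributes
    at most C extra execution units on top of the no-carry-in bound."""
    return workload_no_carryin(C, T, L) + C

def carryin_difference(C, T, L):
    """Coarse stand-in for Eq. (25): the extra demand carry-in may add."""
    return workload_carryin_coarse(C, T, L) - workload_no_carryin(C, T, L)
```

For instance, a task with C = 2 and T = 5 requests at most 6 units over a window of length 12 without carry-in, and at most 8 under this coarse carry-in variant.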
Upperbound on the interference of an LL task. Assume a GFTP scheduler and a taskset in which tasks are indexed in decreasing priority order. An upperbound on the interference that higher priority tasks induce on the execution of a task in a targeted window of a given length is recalled as follows (see [3, 7]):
(26) 
In Eq. (26), the selection operator returns the greatest values among the workloads of the tasks with a higher priority than the task under analysis. For an LL taskset, it has been proven in [7] that a worst-case scenario in terms of total workload request in a targeted window of a given length can be constructed by considering (m-1) tasks with carry-in. Therefore, it follows that the extra workload induced by these carry-in tasks in the window of concern cannot exceed the difference between (i) the workload assuming the carry-in scenario (see Eq. (24)) and (ii) the maximum workload assuming no carry-in (see Eq. (23)). Consequently, from the viewpoint of a given task, if it has no higher priority tasks, then it does not suffer any interference; otherwise, we can choose the (m-1) tasks among its higher priority tasks such that the difference between the workload assuming the carry-in scenario and the workload assuming the non-carry-in scenario is the largest possible for each selected task. By summing up these differences and the non-carry-in workloads of all the higher priority tasks, an upperbound on the workload that higher priority tasks induce in the window of a given length is computed.
Before we extend Eq. (26) to the scheduling problem of DAG tasks using a GFTP scheduler, let us present an alternative formal proof to the one provided by Guan et al. [7] for the analysis considering (m-1) tasks with carry-in.
Theorem 2 (Eq. (26) is an Upperbound for LL tasks [7]).
Let us consider a feasible LL taskset scheduled by using a GFTP scheduler on a homogeneous multicore platform, and a task of this set. Eq. (26) is an upperbound on the interference on this task in any window of a given length.
Proof.
Since the taskset is feasible, let us consider the latest time-instant at which at least one core is idle; then at most (m-1) tasks have carry-in workload at that instant. Let us consider the window of the targeted length starting at that instant. By considering the (m-1) tasks with the largest possible carry-in, we are conservative w.r.t. the workload request of the tasks with carry-in in the window. In the same vein, by considering the remaining tasks, without carry-in, to be simultaneously released at the beginning of the window, with the future activations of each of these tasks occurring as soon as legally permitted, we are also conservative w.r.t. the workload request of the tasks without carry-in in the window.
Now let us consider a window of the same length starting at a later time-instant, and assume that the beginning of this window triggers the first activation of the task under analysis. The earliest time-instant at which this task may start executing is constrained by construction: (i) the task cannot start executing before its activation time, and (ii) all the cores are busy executing higher priority tasks between the idle instant and the beginning of this window. As all the cores are busy executing higher priority tasks in this interval, triggering the first activation of the task at any earlier time-instant (i.e., by sliding the window towards the idle instant) can only increase the interference on the task (as the end of its execution remains unchanged). The maximum interference is obtained when the task is activated simultaneously with all higher priority tasks, i.e., at the idle instant, as we then have the largest possible carry-in as well as non-carry-in interference on its execution. The theorem follows. ∎
VI Extension to DAG-based Tasks
In this section, we extend the reduction of the number of tasks with carry-in, obtained in the framework of LL tasks, to the DAG task model. To this end, we distinguish between the upperbound on the workload request of the tasks with carry-in (see Eq. (19)) and without carry-in (detailed hereafter). These expressions will be considered when computing the interference of higher priority tasks on the execution of every task in a window of a given length.
Upperbound on the workload request of DAG tasks without carry-in. Let us assume a DAG task without carry-in. An upperbound on its workload request in a targeted window of a given length can be constructed by distinguishing between the same two scenarios as those which allowed us to derive Eq. (19) in Section IV.
Regarding Scenario (i), where the task does not suffer any interference from higher priority tasks, an upperbound on its workload request at any time is defined as follows.
(27) 
In a similar manner, regarding Scenario (ii), where the task suffers the maximum interference from higher priority tasks, a lowerbound on its workload request at any time is defined as follows.
(28) 
As in the carry-in case, Eq. (27) and Eq. (28) can be used to obtain an upperbound on the workload request of the task in a time window of a given length, as claimed in Lemma 3.
Lemma 3 (Upperbound on the Workload of a Task Without Carry-in).
Assuming no carry-in for the task, an upperbound on its workload request in a window of a given length is given by:
(29) 
Proof Sketch.
The proof sketch of this lemma follows the same reasoning as that of Lemma 2. ∎
From Lemma 2 and Lemma 3, it follows that the difference between the upperbounds, with and without carry-in, for a DAG task in a window of a given length can be written as:
(30) 
All the results presented so far enable us to derive a tighter upperbound on the interference of a task, together with the corresponding sufficient schedulability condition.
Tighter Upperbound on the Interference of a Task. Assume a GFTP scheduler and a taskset in which tasks are in decreasing priority order, as in Section V. An upperbound on the interference that higher priority tasks induce on the execution of a task in a targeted window of a given length is obtained as follows.
(31) 
Each term in Eq. (31) is explained analogously to the corresponding term in Eq. (26), and a tighter schedulability test follows.
Theorem 3 (Tighter Sufficient Schedulability Condition).
A taskset is schedulable on a homogeneous multicore platform using a GFTP scheduler if:
(32) 
where the response time is computed by the following fixed-point algorithm.
Note: This algorithm also stops as soon as the response time converges at some iteration, or exceeds the deadline. Again, in the latter case, the taskset is deemed not schedulable.
Proof.
The proof of this theorem is similar to that of Theorem 2. The difference here resides in the evaluation of the upperbound on the workload of the tasks without carry-in. Instead of considering a synchronous activation of these tasks at the beginning of the targeted window and assuming their subsequent activations to occur as soon as legally permitted, the upperbound has to be computed by using Eq. (29). ∎
VII Conclusions
In this paper, a sufficient schedulability test for fully preemptive DAG-based tasks with constrained deadlines is presented. A global fixed task priority (GFTP) scheduler and a homogeneous multicore platform are assumed. Under these settings, this work is, to the best of our knowledge, the first to address this problem. As future work, we intend to evaluate the properties of a task model where the nodes of each task may execute with different priorities, rather than directly inheriting the priority of the task they belong to.
References
 [1] B. Andersson and D. Niz. Analyzing global-EDF for multiprocessor scheduling of parallel tasks. In OPODIS, 2012.
 [2] T. Baker and M. Cirinei. A unified analysis of global EDF and fixed-task-priority schedulability of sporadic task systems on multiprocessors. Journal of Embedded Computing, 4(2):55–69, 2010.
 [3] M. Bertogna and M. Cirinei. Response-time analysis for globally scheduled symmetric multiprocessor platforms. In RTSS, 2007.
 [4] V. Bonifaci, A. Marchetti-Spaccamela, S. Stiller, and A. Wiese. Feasibility analysis in the sporadic DAG task model. In ECRTS, 2013.
 [5] H. S. Chwa, J. Lee, K. Phan, A. Easwaran, and I. Shin. Global EDF schedulability analysis for synchronous parallel tasks on multicore platforms. In ECRTS, 2013.
 [6] F. Fauberteau, M. Qamhieh, and S. Midonnet. Partitioned scheduling of parallel real-time tasks on multiprocessor systems. In WIP ECRTS, 2011.
 [7] N. Guan, M. Stigge, W. Yi, and G. Yu. New response time bounds for fixed priority multiprocessor scheduling. In RTSS, 2009.
 [8] K. Lakshmanan, S. Kato, and R. Rajkumar. Scheduling parallel real-time tasks on multicore processors. In RTSS, 2010.
 [9] J. Li, K. Agrawal, C. Lu, and C. Gill. Analysis of global EDF for parallel tasks. In ECRTS, 2012.
 [10] J. Marinho, V. Nélis, S. Petters, and I. Puaut. Preemption delay analysis for floating non-preemptive region scheduling. In DATE, 2012.
 [11] Wind River. VxWorks Platforms. http://www.windriver.com/products/productnotes/PN_VE_6_9_Platform_0311.pdf.
 [12] A. Saifullah, K. Agrawal, C. Lu, and C. Gill. Multi-core real-time scheduling for generalized parallel task models. In RTSS, 2010.