When you have eliminated the impossible, whatever remains, however improbable, must be the truth
Priority Inheritance is a widely used protocol for real-time applications involving shared resources, with a huge practical and theoretical impact. Its adoption is pervasive in the control and automation industry and in all other domains that rely on real-time systems.
The purpose of priority inheritance is to prevent unbounded priority inversion. With respect to other, more efficient protocols proposed in recent years to address the same problem, priority inheritance has a great advantage in its transparency, in the sense that its implementation does not require any information on the tasks involved. It has, however, a significant drawback, in that there are no known exact methods for computing the blocking time, and the only known method for bounding the blocking time is of exponential complexity.
Blocking time is an essential element in feasibility analysis, which is one key theoretical and practical aspect of real-time systems. While blocking time computation can be done exactly, efficiently and straightforwardly under many other resource access protocols, under priority inheritance even bounding the blocking time is nontrivial, because there are many possible causes of blocking, and jobs can be blocked multiple times, a phenomenon called chained blocking. The problem becomes particularly intricate when jobs are allowed to hold multiple resources at a time.
In this article we propose a polynomial method for bounding the blocking time, and an exact, optimally efficient method for blocking time computation under priority inheritance that applies without restrictions on the number of resources each job can hold.
We draw from results in operations research and artificial intelligence. In particular, we show how the bounding problem can be mapped onto an assignment problem, which is a well-studied problem in operations research. Then we define blocking time computation as a search problem in the space of possible assignments of resources, where the objective is to find the path that induces the worst-case scenario associated with the maximum blocking time. Search can also be seen as a process aimed at eliminating impossible resource assignments, corresponding to inadmissible paths. To that end, we provide a full characterization of the conditions that must be met in order for a resource assignment to be admissible. Moreover, we show that the polynomial bound can be used as an admissible heuristic in the search process. As a consequence, the search method we propose is both exact and maximally efficient, in the sense that it does not explore branches unnecessarily.
We build on work by Sha, Rajkumar and Lehoczky, who proposed and studied two priority inheritance protocols as solutions to unbounded priority inversion: the “basic” Priority Inheritance Protocol (PIP) and the Priority Ceiling Protocol (PCP). Blocking time is an essential element in feasibility analysis under resource constraints. While PCP’s blocking time is perfectly understood, and its computation straightforward, for PIP the literature only provides upper bounds. One such upper bound was proposed by Rajkumar. However, using an upper bound for feasibility analysis may be unnecessarily conservative and result in failure to identify perfectly feasible applications with an arbitrarily small processor utilization.
The following example introduces an application where the upper bound results in an overly conservative blocking time estimation.
Consider a job with priority , which uses resources , and a set of jobs, , with priority , which also use the same resources. Let the resource associated with a critical section be (all jobs access resources in the same order). Finally, let the duration of each critical section be:
for , for all (i.e., all the sections in the antidiagonal), and
an arbitrarily small in all other cases
as illustrated in Figure 1. With this setup, the upper bound obtained by applying Rajkumar’s method on ’s blocking time would be . However, for reasons we will discuss in the next sections, the exact is only , if is odd, or an even smaller, if is even. Since can be arbitrarily small, the exact value is times smaller than the estimated bound, with self-evident implications on feasibility analysis.
This apparent shortcoming of current feasibility analysis methods and the pervasive use of PIP motivate us to devise an exact procedure for blocking time computation under PIP. In order to do that, we start by introducing the notation, the definitions, and the scheduling model used in the literature [1, 3]. For ease of reference we summarize the notation in Table I.
A job is a sequence of instructions that will continuously use the processor until its completion if it is executing alone on the processor. That is, we assume that jobs do not suspend themselves, say for I/O operations.
A periodic task is a sequence of jobs of the same type occurring at regular intervals. denotes a job, i.e., an instance of a task . Each task is assigned a fixed priority, and every job of the same task is initially assigned that task’s priority. denotes ’s priority. We assume that jobs are listed in descending order of priority with having the highest priority, .
If several jobs are eligible to run, the highest-priority job will be run. Jobs with the same priority are executed in FCFS discipline. When a job is forced to wait for the execution of lower-priority jobs, is said to be blocked.
A binary semaphore guarding a shared resource is denoted by , usually with a subscript, and it provides the and indivisible operations. The -th critical section in is denoted by and corresponds to the code segment of between the -th operation and its corresponding operation. The semaphore that is locked and released by is denoted by . The resource guarded by is denoted by . The duration of a critical section , denoted , is the time to execute when executes on the processor alone. A job is said to be blocked by the critical section of job if and has to wait for to exit in order to continue execution. The sequence of all critical sections of a job is denoted by .
|-th periodic task|
|the priority associated with|
|-th job: an instance of|
|a set of jobs (application)|
|the set of all jobs in that can block|
|the set of jobs that can block when jobs can hold more than one resource at a time|
|-th critical section of , corresponding to the code segment of between the -th wait operation and its corresponding signal operation|
|is entirely contained in|
|the sequence of all critical sections of a job :|
|maximal sequence with respect to :|
|a chain of critical sections, or z-chain:|
|the semaphore associated with|
|the resource guarded by|
|the set of all resources whose semaphores can block when each job can hold at most one resource at a time|
|the set of all resources used by jobs in|
|the set of all resources whose semaphores can block when jobs can hold more than one resource at a time|
|set induced by () from|
|set induced by ()|
As in , we use a simplified scheduling model, as defined by the following assumptions.
All the tasks are periodic.
Each job in a periodic task has deterministic execution times for both its critical and noncritical sections and it does not synchronize with external events, i.e., a job will execute to its completion when it is the only job in the system.
The last assumption implies that the sequence of operations on semaphores by each individual job is known, and that the worst-case execution time of each critical section is also known. (Blocking time analysis typically considers only the longest critical sections. However, an exact computation of the worst-case blocking time under PIP requires more information. In particular, we will describe each job by the sequence and length of its critical sections.)
Current work on blocking time analysis under PIP typically assumes that a job can hold only one resource at a time. We instead accept that jobs can hold multiple shared resources at the same time. However, following a well-established convention, we assume proper nesting of critical sections. We shall write , or equivalently , if a critical section is entirely contained in .
We assume that critical sections are properly nested. That is, given any pair of critical sections and , if , then either , or . Moreover, we assume that a semaphore may be locked at most once in a single nested critical section, so .
Finally, we assume that resources are properly released.
Each job releases every resource it holds before terminating.
When convenient, we will use square brackets to denote critical sections, indicating in the brackets the name of the associated resources and the duration of the section.
The following notation:
describes a set of two jobs: with two critical sections, and , and with two critical sections, and . The duration of is , and the resource associated with is , guarded by semaphore . is entirely contained in , whereas follows .
We will call an ordered sequence of critical sections a z-chain, denoted as . The duration of a z-chain, denoted , is the sum of the durations of its elements:
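As a minimal sketch of how this model could be represented in code, the following Python fragment encodes properly nested critical sections and computes the duration of a z-chain. The class name `Section`, its fields, and the sample data are assumptions made for illustration; they are not part of the notation above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Section:
    """A critical section; all field names are illustrative assumptions."""
    resource: str                 # resource guarded by the section's semaphore
    duration: float               # execution time of the whole section, nested parts included
    nested: List["Section"] = field(default_factory=list)  # sections entirely contained in it


def zchain_duration(chain: List[Section]) -> float:
    """Duration of a z-chain: the sum of the durations of its sections."""
    return sum(section.duration for section in chain)


# Hypothetical data: a section on R1 (containing one on R2) and a section on R3,
# taken from two different jobs.
outer = Section("R1", 5, nested=[Section("R2", 2)])
other = Section("R3", 1)
print(zchain_duration([outer, other]))
```

Storing the full duration on the outer section mirrors the definition given earlier, where the duration of a section is the time needed to execute it when the job runs alone, so nested sections are not counted a second time.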
In this section we will identify and define all the elements that are necessary for an analysis of the blocking time computation under PIP.
Consider an application and a set of resources , each guarded by a distinct binary semaphore.
It is a known fact that, if each job can hold at most one resource at a time, includes all and only the resources used both by jobs with priority lower than and by jobs with priority higher than or equal to . We will use to denote the set of resources whose semaphores can cause blocking to if each job can hold at most one resource at a time:
Accordingly, we will use to denote the set of all jobs that can block , if each job can hold at most one resource at a time. In particular, includes all and only the jobs with priority lower than that use resources belonging to :
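The two sets just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: jobs are given as a dictionary mapping a job name to a pair (priority, set of resources used), larger numbers denote higher priorities, and the function name `base_blocking_sets` is hypothetical.

```python
def base_blocking_sets(jobs, i):
    """Resources and jobs that can block job i when no job nests critical sections.

    jobs: dict name -> (priority, set of resource names); larger = higher priority
    (an assumption of this sketch).
    """
    p_i = jobs[i][0]
    # Resources used by jobs with priority lower than that of job i.
    used_by_lower = set().union(*(res for prio, res in jobs.values() if prio < p_i))
    # Resources used by jobs with priority higher than or equal to that of job i.
    used_by_higher = set().union(*(res for prio, res in jobs.values() if prio >= p_i))
    resources = used_by_lower & used_by_higher
    # Lower-priority jobs that use at least one of those resources.
    blockers = {name for name, (prio, res) in jobs.items()
                if prio < p_i and res & resources}
    return resources, blockers


# Hypothetical application with four jobs.
jobs = {
    "J1": (3, {"R1"}),
    "J2": (2, {"R2"}),
    "J3": (1, {"R1", "R2"}),
    "J4": (0, {"R3"}),
}
print(base_blocking_sets(jobs, "J1"))
```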
The fact that critical sections can be nested, properly or otherwise, introduces the threat of deadlock. (Deadlock is not an issue when all sections are disjoint, because a deadlock requires the occurrence of the hold-and-wait condition, which cannot occur if all sections are disjoint.) Clearly, deadlocks must be prevented in real-time applications. A common way to do so is by preventing a necessary condition for deadlock, known as circular wait, in particular by imposing a strict order on resource acquisitions. Checking that a given application respects such a strict order is trivial. (One possibility is to map resources onto vertices of a directed graph, and the “entirely contained” relation onto edges between vertices. Then one can use a linear-time method such as Tarjan’s strongly connected components algorithm to verify that the graph has no strongly connected subgraphs with more than one vertex, i.e., the graph is a directed acyclic graph. If the graph is acyclic, deadlock cannot occur.) We will thus assume that deadlock is prevented by some external means, and in particular that semaphores are accessed in an order consistent with a predefined acyclic order :
We assume that the relation defined over nested critical sections induces a partial order over resources.
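The acyclicity check mentioned above can be sketched as follows. The sketch uses a plain depth-first cycle detection rather than Tarjan's strongly connected components algorithm; both run in linear time. The encoding of a section as a `(resource, children)` tuple and the function names are assumptions made for this illustration.

```python
def nesting_edges(sections, outer=None):
    """Yield (outer_resource, inner_resource) for each directly nested section."""
    for resource, children in sections:
        if outer is not None:
            yield outer, resource
        yield from nesting_edges(children, resource)


def is_deadlock_free(jobs):
    """True if the resource graph induced by nesting is acyclic.

    jobs: dict name -> list of top-level sections, each a (resource, children) tuple.
    """
    graph = {}
    for sections in jobs.values():
        for outer, inner in nesting_edges(sections):
            graph.setdefault(outer, set()).add(inner)
            graph.setdefault(inner, set())

    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on the current path, finished
    color = dict.fromkeys(graph, WHITE)

    def has_cycle(vertex):
        color[vertex] = GREY
        for successor in graph[vertex]:
            if color[successor] == GREY:  # back edge: a cycle exists
                return True
            if color[successor] == WHITE and has_cycle(successor):
                return True
        color[vertex] = BLACK
        return False

    return not any(color[v] == WHITE and has_cycle(v) for v in graph)


# Hypothetical applications: consistent and inconsistent acquisition orders.
consistent = {"A": [("R1", [("R2", [])])], "B": [("R2", [("R3", [])])]}
inconsistent = {"A": [("R1", [("R2", [])])], "B": [("R2", [("R1", [])])]}
print(is_deadlock_free(consistent), is_deadlock_free(inconsistent))
```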
Nesting also introduces a new phenomenon, called transitive priority inheritance . In particular, if a job is blocked by a job , and is blocked by a third job , then inherits ’s priority via . (It is well known that transitive priority inheritance is only possible in the presence of nested sections.)
An effect of transitive priority inheritance is the extension of the set of resources that can cause blocking to . In the absence of nested sections, when each job can hold at most one resource at a time, a resource can block only if its ceiling is at least , and it is used by a job with a priority lower than . This no longer holds. In the presence of nested sections, because of transitive inheritance, a job can inherit a priority higher than that of the job it is blocking. Therefore, a resource can cause blocking to even if its ceiling is lower than , but higher than or equal to the priority of the jobs that can inherit a priority greater than or equal to . The set of jobs that can block is thus, in general, a superset of .
Let us consider . Let jobs in access a set of shared resources , in the following way:
These jobs define the following sequences of critical sections: , , , and . We observe that , , and , which together with the fact that , , , , and , induces a resource ordering , thus is deadlock-free.
We have and , so if the critical sections were all disjoint, could not possibly cause blocking to , and we would have .
However, let us consider the sequence of events illustrated in Figure 2, where is released as soon as acquires and enters , is released as soon as acquires and enters , and finally is released as soon as acquires and executes . In that case, as soon as attempts to acquire (the semaphore guarding as well as ), will be blocked for the duration of the whole z-chain , that is, for 11 units of time. Interestingly, involves sections that are not directly associated with and : (not in ) has a section that belongs to , , which uses , also not in ; however, contributes to blocking because and , and in turn and , with, finally, . In the end, the set of resources that cause blocking to in this example is , and the set of jobs that block is .
The example above motivates the introduction of the set , which includes all and only the resources whose semaphores can cause blocking to when nested sections are allowed. Accordingly, denotes the set of jobs that can block when nested sections are allowed.
In particular, includes all and only the resources used both by jobs with priority lower than , and by jobs that have or can inherit a priority equal to or greater than (due to transitive priority inheritance). In order to characterize and we need to delve a bit deeper into such a phenomenon.
Transitive priority inheritance requires three distinct jobs, , , and . If these are the only jobs, then in order for to inherit through , the following conditions must hold: (1) defines two critical sections, and , such that , (2) is shared with and (3) is shared with .
More generally, we can say that a job can cause blocking to either because, independently of nested sections, , or because the following conditions hold: (1) a third job , with priority lower than , defines two critical sections, and , such that , (2) the resource associated with the outer section, , is a resource that can cause blocking to , and (3) defines a critical section that uses . Under such conditions, can cause blocking to . Notice that the blocking in question does not depend on and ’s relative priority, as long as ’s priority is lower than , and is other than . We then obtain the following characterization:
Accordingly, includes all and only the jobs with priority lower than , that use resources belonging to :
Example 4 (continued from 3)
We have , , , , , and .
defines the resources that in principle could block . However, blocking depends on the schedule, and not all schedules are possible. To illustrate, consider the following example.
Example 5 (continued from 4)
From previous analysis we know that corresponds to a possible schedule (illustrated in Figure 2), yielding an overall blocking time for of 11 time units. corresponds to the following allocation of resources in to jobs in : , , and . Let us now consider a different z-chain , also involving three different resources/jobs in /: , yielding a total duration . corresponds to the following allocation of resources in to jobs in : , , and . The jobs and resources are the same as before, but unlike , describes an impossible schedule. Indeed, may not obtain access to while holds , because in order to reach , should cross , meaning acquiring (and then releasing) .
Moreover, if we consider other possible allocations that could cause blocking to , we notice that each allocation where holds would inhibit any possible contribution of , , and towards blocking . As a matter of fact, and belong to only by virtue of potentially holding , and belongs to only by virtue of potentially holding even as holds .
As a result, the only possible allocation where all the resources in play a role towards is that corresponding to in Example 3. Another possible allocation of resources yielding the same duration would be , , and in that case may not hold any resource. Other possible allocations result in shorter z-chains, therefore the duration of the longest z-chain for this application, corresponding to a possible schedule, is units.
In general, whether a resource may or may not belong to a z-chain corresponding to an admissible schedule depends on the other resources in the same z-chain. We shall thus introduce the notion of an induced resource set. This will enable us to define an iterative characterization of equivalent to the recursive one given earlier. The idea is to obtain by initially computing and then iteratively applying the definition of induced set until a fixed point is reached. But before we go there, we need to introduce the notion of maximality with respect to a set of resources.
Definition 1 (Maximal section)
Given a set of resources, a section is maximal with respect to if and only if and .
Definition 2 (Maximal sequence)
Given a set of resources and a sequence , the corresponding maximal sequence with respect to , denoted , is the sequence of sections in that are maximal with respect to : .
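The following Python sketch illustrates one reading of Definitions 1 and 2, namely that a section is maximal with respect to a set of resources when its own resource belongs to the set and no section containing it is associated with a resource in the set. That reading, the `(resource, children)` encoding of sections, and the function name are assumptions of this sketch.

```python
def maximal_sequence(sections, resources):
    """Sections that are maximal with respect to `resources`, in program order.

    sections: list of (resource, children) tuples, children being nested sections.
    """
    result = []
    for resource, children in sections:
        if resource in resources:
            # Outermost section on a resource of interest: anything nested
            # inside it cannot be maximal, so the descent stops here.
            result.append((resource, children))
        else:
            # The section itself is not relevant, but nested ones may be.
            result.extend(maximal_sequence(children, resources))
    return result


# Hypothetical job: a section on R1 containing one on R2, then a section on R3
# containing another one on R2.
job_sections = [("R1", [("R2", [])]), ("R3", [("R2", [])])]
print(maximal_sequence(job_sections, {"R1", "R2"}))
```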
Definition 3 (Induced set)
Let be a set of resources, a job, and a maximal section with respect to , for some . The set induced by () from , denoted , is the set of resources that (1) are associated with a critical section contained in , (2) do not belong to , and (3) are associated with a critical section belonging to a job other than and with a priority lower than :
Example 6 (continued from 5)
Consider , , and , which is maximal with respect to . We have . Indeed, if enters while or are held by other jobs, will not be able to complete its execution of and thus release until it can get hold of and as well.
Induced sets can be used to compute . The straightforward way to do that is to initially set and then apply the induction operator until a fixed point is reached. Such a method, encoded by function Relevant-Resources in Figure 3, will necessarily reach a fixed point, because is a monotonically growing set of resources, and resources are finite. Moreover, its complexity is bounded by the number of resources outside of times the number of critical sections in jobs with a priority lower than .
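A fixed-point iteration of this kind can be sketched as follows. The code is not the Relevant-Resources function of Figure 3; it is an illustration that follows the three conditions stated earlier for transitive blocking. The data layout (a dictionary from job name to a pair of priority and nested `(resource, children)` sections, with larger numbers denoting higher priorities) and all function names are assumptions.

```python
def all_resources(sections):
    """Every resource used by a list of (resource, children) sections, at any depth."""
    found = set()
    for resource, children in sections:
        found.add(resource)
        found |= all_resources(children)
    return found


def containment_pairs(sections, ancestors=()):
    """Yield (outer, inner) resource pairs for every section contained in another."""
    for resource, children in sections:
        for outer in ancestors:
            yield outer, resource
        yield from containment_pairs(children, ancestors + (resource,))


def relevant_resources(jobs, i):
    """Resources that can block job i when nesting is allowed (fixed-point sketch)."""
    p_i = jobs[i][0]
    lower = {name: secs for name, (prio, secs) in jobs.items() if prio < p_i}
    used_lower = set().union(*(all_resources(s) for s in lower.values()))
    used_higher = set().union(*(all_resources(s)
                                for prio, s in jobs.values() if prio >= p_i))
    relevant = used_lower & used_higher          # starting point: the non-nested set
    changed = True
    while changed:                               # stop when no resource is added
        changed = False
        for name, secs in lower.items():
            for outer, inner in containment_pairs(secs):
                if outer in relevant and inner not in relevant:
                    # The inner resource must be used by another lower-priority job.
                    if any(inner in all_resources(other_secs)
                           for other, other_secs in lower.items() if other != name):
                        relevant.add(inner)
                        changed = True
    return relevant


# Hypothetical application: larger number = higher priority.
jobs = {
    "J1": (4, [("R1", [])]),
    "J2": (3, [("R1", [("R2", [])])]),
    "J3": (2, [("R2", [("R3", [])])]),
    "J4": (1, [("R3", [])]),
    "J5": (0, [("R4", [("R5", [])])]),
}
print(sorted(relevant_resources(jobs, "J1")))
```

Termination follows from the same argument given in the text: the set only grows, and the number of resources is finite.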
Example 7 (continued from 6)
. . . Maximal sections of , , and with respect to : , , , , and . . . . (fix point).
Definition 3 applies to a single section, but we can extend it to z-chains.
Let be a z-chain of sections that can cause blocking to . The set induced by (), denoted , is defined as .
We are now ready to characterize all the possible cases of blocking using the notion of admissibility and its necessary condition, induction compatibility. Intuitively, a z-chain is induction compatible if each resource associated with sections in contributes to blocking , given the other elements in , whereas it is admissible if it is induction compatible and corresponds to a possible schedule. In that case, describes a possible sequence of job activations leading to a situation where at a given time each relevant job executes inside its corresponding section in , whereby the total blocking to which is subject is . If, otherwise, is inadmissible, cannot cause a blocking , because it is impossible to schedule jobs so as to have, at any given time, all relevant jobs executing inside their corresponding sections in .
Definition 5 (Induction compatibility)
Consider a job and a z-chain of sections belonging to all-different tasks and associated with all-different resources. Then a section is induction compatible if either or for some such that , and is induction compatible.
Example 8 (continued from 7)
Consider from Example 5. is induction compatible because , while and are induction compatible because there are two sections contained in and associated with and . However, as we know from Example 5, models an impossible schedule. Consider now , which represents a perfectly possible job scheduling, where has reached and is holding and . alone is not induction compatible, because and there is no other induction compatible section in which contains a section associated with . Indeed, there is no reason why should cause any blocking to .
Admissibility uses and extends induction compatibility by laying out all the constraints that must be satisfied in order for a z-chain of duration to cause a blocking to a job . Admissibility is defined by induction. Figure 4 is meant as a reference to clarify the notation used in some constraints (FHO and FLO).
Definition 6 (Admissibility)
Admissibility is defined with respect to a job by induction:
The empty chain is admissible with respect to any job .
A non-empty z-chain is admissible with respect to a job if and only if is admissible and is an admissible extension to with respect to .
A section is an admissible extension to with respect to if and only if it satisfies all the following conditions:
(Novelty of Blocking Job): is a new job: ;
(Novelty of Blocking Resource): is associated with a new resource:
(Limited-Scope Maximality): is maximal with respect to :
(Freedom from Higher-priority job Obstruction): is not associated with, or contained in a section associated with, a section of a higher priority job that precedes a section :
(Freedom from Lower-priority job Obstruction): is not preceded by a section associated with a resource associated with a section , or with a section containing a section , of a lower priority job :
These definitions provide a complete characterization of the conditions for blocking in the absence of nested sections. In particular, NBJ and NBR are known from the literature: it should be self-evident that in order for to be blocked by two different critical sections of tasks in , these critical sections must refer to different resources and belong to different jobs.
LSM instead reflects the following observation: if already contains a section associated with a resource , then any other section contained in a section associated with cannot be an admissible extension to , since that section could not possibly be reached (the job would be blocked before). Therefore, we are only interested in maximal sections. Moreover, only considering resources belonging to the set induced by ensures induction compatibility. Hence limited-scope maximality (the scope being limited to ) rather than just maximality.
Finally, FHO and FLO are reachability conditions. On the one hand, the sections that already belong to should remain reachable, therefore new sections that extend should not obstruct them. On the other hand, these new extensions to must themselves be reachable. Notice that, because jobs can hold multiple resources at the same time, a resource can be either directly associated with a section , or it can be associated with a section that contains , and will thus be allocated to the job that executes . In particular, FHO stipulates that if a section is added to a chain that contains higher-priority sections, the latter must still be reachable, whereas FLO stipulates that must itself be reachable in spite of lower-priority sections that may already be in . Reachability is obstructed by sections in the higher-priority job that precede the higher-priority section and are associated with resources that are also associated with the lower-priority section, directly or otherwise.
It is worth noting that any 1-element z-chain is admissible if and only if its element is maximal with respect to . Any z-chain composed of first-only sections () satisfies FHO and FLO (as well as, trivially, NBJ), and is therefore admissible if and only if it satisfies NBR and is induction-compatible.
The bottom line
The model we introduce provides a complete characterization of the sequences of critical sections that can block a job. Given such a model, we propose the following methodology for computing the blocking time:
Because nested sections under PIP introduce the risk of deadlock, the first step is to establish that semaphores are accessed in an order consistent with a predefined acyclical order. If that is not the case, the blocking time is infinity. This can be done in linear time.
If there is no risk of deadlock, one proceeds to determine an upper bound. This, as we will see, can be done in polynomial time.
Next, one verifies that the upper bound found in the previous step corresponds to an admissible z-chain. If that is the case, the upper bound corresponds to the exact value. This verification procedure can also be carried out in polynomial time.
Finally, if the previous steps fail, one needs to search for an admissible resource allocation yielding the maximum blocking time. To that end, one could explore the space of admissible allocations using heuristic-based tree-search, which is a complete method able to compute the blocking time exactly, as well as to provide a proof, in the format of a z-chain.
In , Rajkumar proposes a branch-and-bound search technique to determine an upper bound on the blocking delay of each job under PIP, assuming that each job can hold at most one resource at a time. The method consists in summing the durations of the longest critical sections of jobs that can block , with the restriction that all jobs must be different and the critical sections must be associated with all different semaphores.
Such an approach has three main limitations:
it has an exponential complexity,
it only applies in the absence of nested sections, and
it is not an exact method, as it only provides an upper bound.
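To make the quantity being bounded concrete, the sketch below computes the same maximum by plain exhaustive enumeration: it picks at most one resource per blocking job, never reuses a resource, and keeps the largest sum of longest durations. It is not Rajkumar's branch-and-bound procedure, only an exponential-time illustration of what that bound measures; the input layout and the function name are assumptions.

```python
def bound_by_enumeration(longest):
    """Largest sum of longest section durations over distinct jobs and resources.

    longest: dict job -> dict resource -> duration of the job's longest critical
    section on that resource (hypothetical input layout).
    """
    jobs = list(longest)

    def best(k, taken):
        if k == len(jobs):
            return 0
        # Option 1: job k contributes no blocking at all.
        result = best(k + 1, taken)
        # Option 2: job k blocks through one resource not used by an earlier job.
        for resource, duration in longest[jobs[k]].items():
            if resource not in taken:
                result = max(result, duration + best(k + 1, taken | {resource}))
        return result

    return best(0, frozenset())


# Hypothetical data: always taking each job's longest section would be suboptimal here.
longest = {"J2": {"R1": 5, "R2": 4}, "J3": {"R1": 4}}
print(bound_by_enumeration(longest))
```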
In this section we address the first two limitations, by showing how the same upper bound can be computed using a polynomial complexity algorithm, called the Hungarian method [6, 7], and that such a method does not depend on the number of resources a job can hold at a time.
The Hungarian method is a combinatorial optimization algorithm that solves the assignment problem in polynomial time. Assignment is a minimization problem, described as finding an optimal assignment of tasks to workers, based on a square cost matrix. The problem we address can be considered as an assignment problem’s dual, where “tasks” are resources to be assigned to jobs (the “workers”), “costs” are defined by longest durations, and the objective is to maximize (as opposed to minimize, hence the “dual” problem) the total time spent by the jobs on these resources.
The method we propose consists in casting the problem into an assignment problem’s dual, and then applying the Hungarian method, which we can always do as long as we express the input data in the form of a square cost matrix.
Our algorithm for determining Rajkumar’s upper bound using the Hungarian method is shown in Figure 5. For generality, the algorithm is expressed as a function H with two arguments: a generic set of jobs, , and a generic set of resources, . When invoked with arguments and , H returns Rajkumar’s upper bound for ’s blocking time, . Moreover, when invoked with arguments and it provides an upper bound for ’s blocking time when each job can hold multiple resources at a time. Finally, in the next section we will see that H also serves a key purpose in the exact computation of ’s maximum blocking time, when applied to subsets of and . Hence our presentation of the algorithm in its parametric format.
Function H relies on two matrices as its main data structures: a blocking time matrix, denoted , whose cells contain the longest durations of critical sections, and an cost matrix constructed from , with .
For better readability, references to matrix elements are made via their corresponding jobs/resources. So identifies the matrix element corresponding to job and resource . The value of such an element in is denoted by , in by . Moreover, denotes the index of the column of corresponding to and the index of the row of corresponding to .
In order to obtain a square cost matrix, is first filled with the opposite of the homologous values in , increased by a constant to ensure that only contains positive values, and then padded with 0 rows or columns.
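The construction of the cost matrix can be sketched as follows. The choice of the constant (the largest blocking value plus one) is an assumption of this sketch, as is the function name; any constant that keeps the real entries positive would serve the same purpose. The input is assumed to be a non-empty rectangular list of rows (jobs) by columns (resources).

```python
def build_cost_matrix(blocking):
    """Square cost matrix for a jobs-by-resources matrix of longest durations.

    blocking[r][c] is the longest duration of job r on resource c (0 if unused).
    """
    n_rows, n_cols = len(blocking), len(blocking[0])
    size = max(n_rows, n_cols)
    # Assumed constant: large enough that every real entry stays positive.
    offset = max(max(row) for row in blocking) + 1
    # Start from an all-zero square matrix, so the padding is already in place.
    cost = [[0] * size for _ in range(size)]
    for r in range(n_rows):
        for c in range(n_cols):
            cost[r][c] = offset - blocking[r][c]   # negate, then shift
    return cost


# Hypothetical blocking matrix: two jobs, three resources.
print(build_cost_matrix([[5, 2, 0], [1, 4, 3]]))
```

Because the padded matrix is square, every complete assignment uses the same number of padding entries, so minimizing the total cost is equivalent to maximizing the total blocking duration.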
Once is set up, the Hungarian method is described by the following four steps:
Step 1. Subtract the smallest element in each row from all the elements of its row. Each row will contain at least one 0 element and no negative element.
Step 2. Subtract the smallest element in each column from all the elements of its column. Each column will contain at least one 0 element and no negative element.
Step 3. Check if an assignment is possible. An assignment is possible if and only if there is a collection of 0 elements in distinct rows and distinct columns. One way to check that is to proceed row by row selecting the row with the least number of 0 elements and mark the (unmarked) column intersecting the first 0 element of that row. If finding a 0 element in each row by only looking at unmarked columns is possible for every row, then the assignment is possible, and the return value, , is computed as the sum of the values of elements in the matrix corresponding to the 0 elements found in the matrix.
Step 4. If no assignment is possible, transform and go back to Step 1. To transform , first find a minimum set of rows and columns that covers all the 0s in . This can be done by applying the method described by Munkres in , not shown here, in the interest of brevity. Notice that, because no assignment is possible, . Then, let be the smallest entry in outside of the rows/columns in . Subtract from each element in outside of the rows/columns in , and add to each element in that sits at the intersection of rows/columns in .
The computational complexity of the Hungarian method is , which is much smaller than the complexity of the straightforward attack on the problem . Notice that, alongside computing , H also constructs a set of job/resource pairs , which will be needed later to check admissibility (see Section V).
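For readers who wish to experiment, the following self-contained Python sketch computes the bound and the associated job/resource pairs. It is not the function H of Figure 5: it uses the compact potential-based formulation of the Hungarian method instead of the four matrix-reduction steps described above, and the function names, the input layout, and the choice of constant are assumptions. The input is assumed to be a non-empty rectangular matrix of non-negative longest durations.

```python
def hungarian(cost):
    """Minimum-cost perfect assignment on a square matrix; returns (row, col) pairs."""
    n = len(cost)
    INF = float("inf")
    u = [0] * (n + 1)        # row potentials
    v = [0] * (n + 1)        # column potentials
    p = [0] * (n + 1)        # p[j]: row currently matched to column j (0 = none)
    way = [0] * (n + 1)      # previous column on the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [INF] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], INF, 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:       # reached a free column: augment
                break
        while j0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    return sorted((p[j] - 1, j - 1) for j in range(1, n + 1))


def blocking_upper_bound(blocking):
    """Upper bound on blocking time and the job/resource index pairs achieving it."""
    n_rows, n_cols = len(blocking), len(blocking[0])
    size = max(n_rows, n_cols)
    offset = max(max(row) for row in blocking) + 1     # assumed constant
    cost = [[0] * size for _ in range(size)]           # zero padding
    for r in range(n_rows):
        for c in range(n_cols):
            cost[r][c] = offset - blocking[r][c]
    pairs = [(r, c) for r, c in hungarian(cost)
             if r < n_rows and c < n_cols and blocking[r][c] > 0]
    return sum(blocking[r][c] for r, c in pairs), pairs


# Hypothetical blocking matrix (rows: jobs, columns: resources).
print(blocking_upper_bound([[5, 4], [4, 0]]))
```

On this sample input, assigning each job its own longest section would reuse the first resource, so the optimal assignment pairs the first job with the second resource and the second job with the first.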
In the examples that follow, we will use square brackets to signify relevant sections, with an indication of the associated resource and duration. For instance, with reference to a job , the expression denotes a sequence of two critical sections , where , and .
Let us consider a set