The ever-increasing demand for communication has resulted in unprecedented need for data transfer in essentially all settings, from local datacenters to planetary-scale WANs. A central challenge for network operators is to accommodate as much traffic as possible and to finish data transfers as quickly as possible. In order to make networks even more efficient, new technologies have been developed that allow for software-reconfigurable networks (usually just called reconfigurable networks). These technologies essentially allow software control over the network topology, rather than just over traditional control problems such as routing, scheduling, congestion control, etc. In other words, we are now able to dynamically reconfigure the network topology to respond to network demands in an online fashion.
There has been a significant amount of work on actually building these technologies and systems: see  for such a system for optical WANs, and see [13, 14, 25, 19, 9] for a small sample of reconfigurable datacenter networks (a survey of reconfigurable datacenters can be found at ). However, there has been less attention paid to the algorithmic problems raised by these technologies: we have the ability to dynamically reconfigure the network topology, but what should we reconfigure it to? How should we react to changing transfer and traffic demands? Most systems use a variety of heuristics, ranging from matching-based algorithms (maximum or stable) to simulated annealing.
The theoretical study of the algorithmic challenges arising from reconfigurable networks (particularly optical WANs) was recently initiated by Jia et al. , and this remains the state of the art on the theory of scheduling reconfigurable networks. In their setting, they assume a centralized controller that can dynamically reconfigure the network topology, with the only restriction being a degree constraint at every node (which could be different for different nodes, depending on the underlying machine represented by the node). (Footnote 1: Clearly this is not a fully realistic setting: in optical WANs there are optical restrictions on the topology which need to be accounted for, and in the datacenter setting there is still an underlying fixed network in addition to some reconfigurable links. But as discussed in , it is a reasonable starting point for developing algorithms.) There is a stream of transfer requests arriving at the system, where each request has a source, a destination, a transfer size, and a release time (the earliest time at which the transfer can start). The goal is to design a scheduling algorithm that decides, at each time slot, what topology to build and what jobs to transfer using that topology (under the additional restriction that multihop paths are not allowed). They provided both offline and (more interestingly) online algorithms for the makespan objective (minimizing the time at which all transfers are finished) and the sum of completion times (minimizing the sum over all jobs of the time at which they finished).
We work in the same model, but extend and improve the results of . Most importantly, we provide online algorithms and prove their competitive ratio for a more natural objective function: the (weighted) sum of flow times. The flow time of a job (also sometimes called the sojourn time, waiting time, or response time) is simply the time that it is in the system, i.e., its completion time minus its release time. If all release times are 0, then flow times and completion times are the same. But if jobs are released online, then not only are they extremely different, but moreover approximation guarantees on the completion times are not particularly meaningful. While both problems have the same optimal solution, in an approximation analysis one can make a job wait proportional to its release date with little penalty. When the time horizon is large, undesirable schedulers can therefore have a small (e.g. constant) approximation ratio.
Completion Time Versus Flow Time. To see the difference between completion time and flow time, consider an extremely simple example. Suppose that there are only two jobs, each of which has size . Job is released at time , and job is released at time . Then consider the schedule which schedules job at time and job at time . Clearly this is an undesirable schedule – we should have scheduled job at time and job at time . But if we look at the sum of completion times, the optimal solution has cost , while this horrible schedule has cost . So this horrible schedule looks pretty good with respect to completion times, since it is a -approximation! This is clearly ridiculous; we “cheated” by allowing job to have terrible performance but it balanced out with job ’s release date. On the other hand, if we look at the sum of flow times, the horrible schedule has cost (since job is in the system for time units while job is only in the system for time unit) while the optimal solution has cost (since jobs do not have to wait to be scheduled). Thus, the flow time objective will rule out such a schedule and accurately reflects the quality of a schedule.
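The contrast between the two objectives can be checked numerically. The following is a minimal sketch with illustrative numbers of our own choosing (the release time `T = 100`, the job names, and both schedules are assumptions, not the values of the example above): two unit-size jobs, one released at time 0 and one at time `T`, where the delayed schedule makes the first job wait almost until the second is released.

```python
def total_completion_time(completion):
    """Sum of completion times over all jobs."""
    return sum(completion.values())


def total_flow_time(release, completion):
    """Sum over jobs of (completion time - release time)."""
    return sum(completion[j] - release[j] for j in release)


T = 100  # illustrative release time of the second job (an assumption)
release = {"job1": 0, "job2": T}

# A unit-size job run in slot t completes at time t + 1.
prompt_schedule = {"job1": 1, "job2": T + 1}   # both jobs run as soon as released
delayed_schedule = {"job1": T, "job2": T + 1}  # job1 needlessly waits until slot T - 1

completion_ratio = (total_completion_time(delayed_schedule)
                    / total_completion_time(prompt_schedule))
flow_ratio = (total_flow_time(release, delayed_schedule)
              / total_flow_time(release, prompt_schedule))

print(f"completion-time ratio: {completion_ratio:.2f}")
print(f"flow-time ratio:       {flow_ratio:.2f}")
```

In this sketch the completion-time ratio is (2T + 1)/(T + 2), which stays below 2 for every T, while the flow-time ratio is (T + 1)/2 and grows linearly with T.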
Results: In this paper we initiate the study of reconfigurable network scheduling under the weighted flow time objective. In more detail, we prove the following results.
In the most general setting of  (arbitrary degree constraints, arbitrary job sizes, arbitrary release dates), and in addition where every job has a weight which multiplies its flow time in the objective, we give an algorithm which is -competitive as long as the algorithm is allowed to have speed , for any . Informally, this is a form of resource augmentation: we allow the algorithm to complete jobs at a rate that is faster than the optimal solution is allowed. From a networking perspective, this is equivalent to allowing higher throughput edges as resource augmentation. So, for example, our algorithm will have weighted flow time with Gbps links that is only times worse than the optimal solution with Gbps links. This can also be thought of as overprovisioning: if we want performance that is comparable to the optimum but without knowing in advance what the jobs will look like, then we can just overprovision by a factor.
We justify our previous results by showing that speed augmentation is necessary: we prove a polynomial lower bound on any online algorithm without speed augmentation. In particular, we prove that any online randomized algorithm without speed augmentation can have competitive ratio that is at best . This is a terrible lower bound, showing that without resource augmentation all algorithms perform poorly in the worst case. In settings like this, resource augmentation has been used so theory can differentiate between the performance of algorithms .
As a side effect of our techniques, we are also able to extend the results of  on completion times to a more general setting. While this work provided many algorithms and -competitive analyses, they did not give an -competitive algorithm for the most general case: general degree constraints, general job sizes, and general release times. They also did not give bounds on weighted completion times. A simple modification of our flow time algorithm gives an approximation without speed augmentation for the completion time objective in the most general setting.
Outline. In Section II we describe related work for both reconfigurable networks and flow time scheduling in other settings. In Section III we formally describe the problem setting. Section IV has our main upper bounds. We begin with a warm-up in Section IV-A where we assume that all weights are 1, all job sizes are 1, and all degree bounds are the same. This simplified setting allows us to demonstrate the intuition behind our more general techniques. We then give our algorithm and analysis for the general setting in Section IV-B, and show how this can be modified to give a bound on completion times in Section IV-C. Finally, in Section V we prove our lower bound implying that speed augmentation is necessary.
II Related Work
II-A Reconfigurable Networks
As discussed in the introduction, there has been a significant amount of work in the last decade on reconfigurable datacenters. For overviews, see a recent tutorial from SIGMETRICS 2019  and the related survey on reconfigurable datacenters . These have been enabled by a variety of technologies, including optical circuit switching [9, 23], 60GHz wireless , and free space optics [14, 13].
From an algorithmic point of view, these systems generally use a variety of heuristics without provable guarantees. The main line of work on understanding the theory behind reconfigurable datacenters is in the form of demand-aware networks [5, 4, 2, 3]. In this setting, we assume that we are given a traffic matrix, and are trying to design a network topology which will have good performance on that traffic matrix (i.e., since the network is reconfigurable we can measure demand and then build an appropriate network topology). Usually the notion of quality involves the (average) lengths of paths. Scheduling problems are not considered in this setting.
For non-datacenter contexts, reconfigurable optical WANs were introduced by . The scheduling algorithms used in  were based on heuristics (simulated annealing in particular), so in followup work, Jia et al.  introduced the theoretical study of scheduling algorithms for reconfigurable optical WANs. They worked in a model which is not a perfect match for optical WANs, but is close enough to be useful. We adopt this model, and extend  to a better objective function and slightly more general setting. Moreover, since their model ignores many of the real-world difficulties of optical WANs, it applies to more general reconfigurable networking settings.
We note that while WANs and datacenters are obviously extremely different settings, our goal is to understand the scheduling problems that arise from the power of reconfiguration. Hence we abstract out to a level which encompasses both of these settings, at the price of not being extremely realistic for either of them. However, this is the level of abstraction used in , so it is perhaps a reasonable setting for optical WANs. For datacenters, the main difference between our model and reality is the existence of an underlying fixed network: in our model we assume that the entire network is reconfigurable, while in most reconfigurable datacenter systems only a fraction of the links can be reconfigured. Analyzing this combined setting is an interesting future line of research, which was recently initiated in the context of routing [10, 11] but which is still entirely unexplored for scheduling.
II-B Flow Time Scheduling
Optimizing total weighted flow time is the most popular objective in online scheduling theory. We discuss related work on the problem of scheduling preemptive jobs that arrive over time on a single machine with the objective of optimizing the total weighted flow time. For a (slightly dated) survey see , and further pointers to relevant work can be found in . It is folklore that the algorithm Shortest-Remaining-Processing-Time (SRPT) is optimal for scheduling unweighted jobs on a single machine. When jobs have weights, it is known that no online algorithm can have a constant competitive ratio .
When there are non-constant lower bounds on the competitive ratio of any online algorithm, prior work has focused on a resource augmentation analysis. A -speed -competitive algorithm is one where the algorithm achieves a competitive ratio of and the algorithm is given a machine that is a factor faster than the optimal solution. The consensus in the community is that the best positive theoretical result one can show is an algorithm that is -speed -competitive for any constant where is a function only depending on . In particular, the competitive ratio is independent of , e.g., . Such an algorithm is known as scalable. Showing an algorithm is scalable gives strong evidence that the algorithm will work well in practice.
The most natural algorithm is highest-density-first when jobs have weights. This algorithm prioritizes jobs in order of their weight over processing time. This algorithm is known to be -speed -competitive for total weighted flow time on a single machine . The algorithm has been generalized to many environments .
III Definitions and Preliminaries
As discussed, we will be studying the same model as . The main difference is the objective function.
Model and Scheduling Definition
There is a set of nodes , each representing a node in our network. Each vertex comes with a degree bound . A request (job) is a tuple , where are the source and destination respectively, is the size, is the release time, and is the weight. Note that without loss of generality we assume sizes and release times are natural numbers, since we can always adjust the scale of a time slot. In each round , we can create a graph with vertex set which satisfies the degree constraints, and where each edge is labeled with a request such that and . The request is completed once it has appeared in at least of these graphs. Note that as in  we are allowing only direct links (we do not allow data to be transferred over longer paths) and allow preemption. See  for more justification of this model.
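To make the model concrete, the following is a minimal sketch of a request and of the feasibility check for a single round. The names `Request` and `round_is_feasible` are hypothetical, and representing a round as a list of requests (one entry per link built) is an assumption made for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """A transfer request: endpoints, size, release time, and weight."""
    source: str
    dest: str
    size: int
    release: int
    weight: float = 1.0


def round_is_feasible(links, degree_bound, t):
    """Check one round of a schedule.

    links: list of Request, one entry per direct link built in round t.
    degree_bound: dict mapping every node to its degree bound.
    A round is feasible if every labeled request has been released by
    time t and no node exceeds its degree bound.
    """
    degree = Counter()
    for req in links:
        if req.release > t:
            return False  # a request cannot be served before its release time
        degree[req.source] += 1
        degree[req.dest] += 1
    return all(degree[v] <= degree_bound[v] for v in degree)
```

For instance, two requests sharing an endpoint `u` fit in the same round only if the degree bound of `u` is at least 2.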
Online vs Offline
Clearly scheduling problems in this context make sense both online and offline. We will be concerned with the competitive ratio (the worst case cost of the algorithm divided by the optimal solution) of scheduling in the online setting. This is the same as the approximation ratio, except that we require the algorithm to be online.
Objective Function and Speed Augmentation
As discussed, Jia et al.  considered two objective functions: the makespan and the sum of completion times. We will mostly be concerned with a different measure of quality: the weighted sum of flow times. The flow time of a request is the time at which it completes minus its release time . That is, the flow time of a job is simply how long it is in the system before being completed. This is a more natural objective than the sum of completion times, but is also more difficult to optimize. We will consider the objective of the weighted flow time, where our goal is to minimize .
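Written out, the objective described above takes the following form. The symbol names are assumptions chosen for illustration: $C_j$ for the completion time of job $j$, $r_j$ for its release time, $w_j$ for its weight, and $F_j$ for its flow time.

```latex
\[
  F_j \;=\; C_j - r_j,
  \qquad
  \text{minimize} \quad \sum_{j} w_j F_j \;=\; \sum_{j} w_j \,(C_j - r_j).
\]
```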
Unfortunately, as we show in Section V, it is not possible to provide -competitive algorithm for the total flow time, even when all weights and sizes are unit. In the face of strong lower bounds we adopt the most popular form of analysis known as a resource augmentation analysis. Here we give the algorithm extra speed. An algorithm running with speed is able to process jobs at a rate that is times faster than the optimal solution. As discussed in Section I, this can be thought of as overprovisioning the network, and will allow us to design competitive algorithms for the flow time objective. Moreover, as discussed in Section II, this notion of speedup is relatively standard in the scheduling literature.
IV Upper Bounds
In this section we give our algorithms and corresponding upper bound results. We begin in Section IV-A with a simple setting that serves to demonstrate most of the main ideas behind our algorithm and analysis. In Section IV-B we move to the most general online setting to prove our main results.
IV-A Simple Setting
We will begin with the simplest possible setting: when all degree bounds are equal to 1, all job sizes are 1, and all weights are 1. Note that, in particular, since all degree bounds are 1, the set of jobs scheduled at any time forms a matching.
At time , let be the (multi)graph of all jobs that are in the system at time (i.e., all requests with release times at most which have not already been completed). Order the jobs by release time (breaking ties arbitrarily but consistently), and then construct a maximal matching using this ordering. These are the jobs scheduled at time . For each job , let be the completion time of job (the time at which it is scheduled by this algorithm).
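The algorithm just described can be sketched as follows, at speed 1 and under the assumptions of this subsection (unit sizes, unit weights, degree bound 1 at every node). The function name and the tuple representation of jobs are hypothetical; ties in release time are broken by input index, which is one consistent rule among many.

```python
def greedy_release_order_schedule(jobs):
    """Greedy maximal matching in release-time order, one matching per slot.

    jobs: list of (source, dest, release) tuples, all of unit size.
    Returns a list giving, for each job, the time slot in which it is scheduled.
    """
    # Release-time order, ties broken consistently by input index.
    pending = sorted(range(len(jobs)), key=lambda j: (jobs[j][2], j))
    scheduled_at = [None] * len(jobs)
    t = 0
    while pending:
        matched = set()  # nodes already used by the matching in slot t
        still_pending = []
        for j in pending:
            u, v, release = jobs[j]
            if release <= t and u not in matched and v not in matched:
                matched.update((u, v))  # add job j to the matching
                scheduled_at[j] = t
            else:
                still_pending.append(j)
        pending = still_pending
        t += 1
    return scheduled_at
```

On the path of three jobs `("a","b",0)`, `("b","c",0)`, `("c","d",0)`, the first and third are matched in slot 0 and the middle one, which conflicts with the first at node `b`, is scheduled in slot 1.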
While the algorithm itself is simple and combinatorial, we will analyze it through an LP relaxation, and in particular through the technique of dual fitting. Let denote the set of all jobs. Consider the following linear program.
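One formulation consistent with the feasibility argument that follows is sketched here. All symbol names are assumptions: $J$ is the set of jobs, $V$ the set of nodes, $x_{j,t}$ indicates that job $j$ is scheduled at time $t$, and $j \ni v$ means that $v$ is an endpoint of $j$. The cost coefficient $t - r_j$ is also an assumption; it would be $t - r_j + 1$ if a job scheduled in slot $t$ is counted as finishing at the end of that slot.

```latex
\begin{align*}
  \min \quad & \sum_{j \in J} \sum_{t \ge r_j} (t - r_j)\, x_{j,t} \\
  \text{s.t.} \quad
    & \sum_{t \ge r_j} x_{j,t} \ge 1 && \forall j \in J \\
    & \sum_{j \ni v} x_{j,t} \le 1   && \forall v \in V,\ \forall t \\
    & x_{j,t} \ge 0                  && \forall j \in J,\ \forall t \ge r_j
\end{align*}
```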
While technically this LP has infinite size (since we did not put an upper bound on ), it is easy to see that we can put an upper bound on of , so this LP has finite size. It is easy to show that this is a feasible LP relaxation.
If there is a schedule with sum of flow times at most , then there is a solution to the LP of cost at most .
Consider a schedule with sum of flow times . Since this is a feasible schedule, each is a matching. We create an LP solution as follows: if job is scheduled at time , then we set , otherwise we set . Since the original schedule is feasible, every job is scheduled in some so the first LP constraint is satisfied, and similarly since each is a matching the second LP constraint is satisfied. Thus this is a feasible LP solution. By the definition of the ’s, the flow time in the schedule is precisely , and thus the LP objective is the sum of the flow times, . ∎
The dual of this LP is the following.
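As a sketch under stated assumptions: suppose the primal minimizes $\sum_{j}\sum_{t \ge r_j} (t - r_j)\,x_{j,t}$ subject to $\sum_{t \ge r_j} x_{j,t} \ge 1$ for every job $j$ and $\sum_{j \ni v} x_{j,t} \le 1$ for every node $v$ and time $t$. Then, with a dual variable $\alpha_j$ for each covering constraint and $\beta_{v,t}$ for each matching constraint (all names are illustrative), its dual would read:

```latex
\begin{align*}
  \max \quad & \sum_{j \in J} \alpha_j \;-\; \sum_{v \in V} \sum_{t} \beta_{v,t} \\
  \text{s.t.} \quad
    & \alpha_j - \beta_{u,t} - \beta_{v,t} \le t - r_j
        && \forall j = (u,v) \in J,\ \forall t \ge r_j \\
    & \alpha_j \ge 0,\ \beta_{v,t} \ge 0
        && \forall j \in J,\ \forall v \in V,\ \forall t
\end{align*}
```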
We will analyze our algorithm by finding a feasible dual solution and relating this to the cost of the algorithm. However, due to the lower bound in Section V, we will need to allow resource augmentation. Let denote the total flow time of the algorithm when run with speedup , i.e., when the algorithm processes jobs at a speed of .
Let’s now define our dual solution. But first we need a little bit of notation: for every node and time , let denote the degree of in . Then for every , we let . Similarly, we will set .
We first show that this is a feasible dual solution.
for all and .
We prove this by induction on . For the base case, let . Then
as claimed. Now consider some . Note that since we allow speedup , the number of jobs scheduled at one time that have some fixed node as an endpoint is at most (rather than at most ). Thus
We will now prove two lemmas which will allow us to bound the cost of this dual solution.
We first claim that in the algorithm (with speedup ), the flow time of job is at most . To see this, let be the set of jobs with and that have not been completed by time . Note that by definition. Now consider some time after job has been released. If job has not yet been completed, and is not scheduled at time , then some job must be scheduled at time . This is because the algorithm sorts by release time and constructs a greedy maximal matching in this order. In particular, if no job in is scheduled at time , then we will schedule job . Thus the time that spends in the system before being scheduled is at most . Since we have speedup , the flow time of job is at most .
This now allows us to analyze the variables. We get that
This is essentially a straightforward calculation using the fact that the flow time of a job is equal (by definition) to the number of time steps in which the job is in the system. So we have that
as claimed. ∎
We can now prove our main theorem (about this simple setting).
for any .
IV-B General Online Model
This section considers the most general model. In this case each node has a degree bound denoting the maximum number of jobs involving that can be scheduled at any point in time. A job has size and a weight . We will assume there is no restriction on how much a job is scheduled, so long as the degree constraints are satisfied at the vertices. We will let be the density of job . The goal is to optimize the total weighted flow time .
This section is organized as follows. We first give our algorithm, which is simple and natural (highest-density-first). We then spend most of the section analyzing it. To do this, we show that we can focus on a different objective called weighted fractional flow time. We will call the original objective weighted integral flow to differentiate them. We show that if the algorithm performs well for the fractional objective then the algorithm performs well for the integral objective with slightly more speed up. Once we focus on the fractional flow objective, we can further show that we may assume all jobs are unit time in the analysis after scaling the weights. We note that both of these reductions are done to simplify the analysis – the algorithm itself does not change or make any of these assumptions, and could be analyzed directly (although doing so is more technical and complicated).
With these simplifications and reductions in place we perform a dual-fitting analysis of the algorithm. As in the simple case of Section IV-A, the intuition is that the dual variables correspond to the “extra cost” to the algorithm incurred by a job when it arrives. This is more complicated than in the simple setting due to the addition of weights and job size (or just weights after the reductions), but the ideas are the same.
IV-B1 Algorithm: Highest-Density-First
Recall that is the set of released but uncompleted jobs at time . When scheduling, we say a node is saturated if it schedules jobs adjacent to it. Order the jobs in in decreasing order of their density. In this order, schedule job if the two endpoints for are not saturated. We note that we schedule job as much as possible if its endpoints are not saturated, that is, we will create parallel links between the endpoints until one of them is saturated or the job is completely scheduled.
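One round of highest-density-first can be sketched as follows. The function name, the dictionary representation of jobs, and the choice to measure progress in whole links (one unit of the job per link, speed 1) are assumptions made for illustration; density is taken to be weight divided by size.

```python
def hdf_round(alive, remaining, degree_bound):
    """One round of highest-density-first.

    alive: dict job_id -> (source, dest, size, weight) for the released,
           uncompleted jobs.
    remaining: dict job_id -> remaining size (in units of one link-round).
    degree_bound: dict node -> degree bound.
    Returns dict job_id -> number of parallel links built for that job.
    """
    load = {v: 0 for v in degree_bound}  # links already incident to each node
    links = {}
    # Decreasing density (weight / size); sorted() is stable, so ties keep input order.
    order = sorted(alive, key=lambda j: alive[j][3] / alive[j][2], reverse=True)
    for j in order:
        u, v, _size, _weight = alive[j]
        # As many parallel links as both endpoints and the remaining size allow.
        k = min(degree_bound[u] - load[u], degree_bound[v] - load[v], remaining[j])
        if k > 0:
            links[j] = k
            load[u] += k
            load[v] += k
    return links
```

A job whose endpoints are not saturated receives parallel links until one endpoint is saturated or the job is fully scheduled; a lower-density job that meets a saturated endpoint simply receives nothing in this round.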
IV-B2 Reduction to the Unit Time Case
This section is devoted to proving the following lemma, stating that we may assume in the analysis that each job is restricted to only being unit size but arbitrary weight. This transformation is done only to simplify the analysis; the algorithm itself is unaffected.
If highest-density-first is -speed -competitive on unit size instances, then highest-density-first is -speed -competitive for arbitrary size and arbitrary weight instances.
To prove the lemma first consider a different objective called weighted fractional flow time. To make the distinction between these objectives, we call the original objective weighted integral flow time. Recall that is the released but uncompleted jobs at time . For each job let be the remaining size of job at time . Then we define the weighted fractional flow time to be . In this objective, each job pays at each time it is alive and unsatisfied. Note that the original weighted integral flow time objective is equivalent to , and hence the difference between the two objectives is that in the fractional objective the weight of a job is scaled by (the fraction of the job size that is uncompleted).
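The two objectives can be placed side by side as follows. The symbol names are assumptions chosen for illustration: $Q(t)$ is the set of released but uncompleted jobs at time $t$, $p_j$ is the size of job $j$, $p_j(t)$ its remaining size at time $t$, and $w_j$ its weight.

```latex
\[
  \underbrace{\sum_{t} \sum_{j \in Q(t)} w_j \,\frac{p_j(t)}{p_j}}_{\text{weighted fractional flow time}}
  \qquad \text{versus} \qquad
  \underbrace{\sum_{t} \sum_{j \in Q(t)} w_j}_{\text{weighted integral flow time}}
\]
```

Since $p_j(t) \le p_j$, each term of the fractional objective is at most the corresponding term of the integral objective.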
We now show that we can convert any algorithm for fractional flow to one for integral flow time (and thus in particular the highest-density-first algorithm).
Given any online algorithm with -speed that is -competitive for fractional flow time, for any there is an online algorithm that is -speed -competitive for integral flow time. Further if is highest-density-first, so is .
Consider the algorithm for fractional flow time. Each time schedules a job with speed the algorithm processes the same job with speed . If the job has already been completed in then can either be idle or work on some other job (e.g., the remaining with highest density). Clearly the schedule produced by algorithm is feasible if the schedule produced by algorithm is feasible, since no job is scheduled by before it is released. Notice that if is highest-density-first then can be highest-density-first. This is because highest-density-first has the property that if the algorithm is given more speed then the algorithm will either process the same job as the slower schedule or the algorithm will have completed the job.
Fix any job . Consider the first time where a fraction of is completed in . So for all . Thus every with contributes or more to the objective. Since schedules job at the same times or earlier as with speed a factor faster, will complete the job by time . So pays at most for each with , while pays at least . Hence the ratio between the two costs is at most .
This holds for all jobs. Further, the fractional optimal objective is only less than the integral optimal objective. This gives the lemma. ∎
The previous lemma shows that we may focus on the weighted fractional flow time objective. The next lemma shows that we can further restrict the instance to unit size jobs. Combining these two lemmas will allow us to focus on the unit size case.
For the fractional flow time objective, any instance can be transformed to a different problem instance such that (1) the objective for the highest-density-first algorithm is the same on both instances, (2) the optimal objective is only less on the new instance, and (3) in the transformed instance all jobs are unit size.
Fix any instance. Consider transforming any job into new jobs . Each new job has size and weight . Note that the density of the jobs are the same as for all .
Consider any schedule for the original instance. We create the analogous schedule for the new instance. Whenever a job is processed by for units at some time , jobs are processed by such that is the lowest index possible among unsatisfied jobs. Both schedules then are intuitively working on the same job at the same times. Notice that is highest-density-first on the original instance if and only if is the highest-density-first algorithm on the new instance, since the density of the jobs in are the same as .
The fractional flow time objective is the same for and because each time decreases by , the weight of in changes from to . Similarly in , there are jobs alive in and this decreases by . Their weight was and this decreases to . ∎
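The transformation to unit-size jobs can be sketched as follows, assuming (as the requirement that densities be preserved suggests) that a job of size p and weight w is replaced by p jobs of size 1 and weight w/p. The function name and tuple layout are hypothetical; `Fraction` keeps the split weights exact.

```python
from fractions import Fraction


def split_into_unit_jobs(source, dest, size, release, weight):
    """Replace one job by `size` unit-size jobs with the same density.

    Returns a list of (source, dest, size, release, weight) tuples, each of
    size 1 and weight weight/size, so the total weight is preserved.
    """
    unit_weight = Fraction(weight) / size
    return [(source, dest, 1, release, unit_weight) for _ in range(size)]


pieces = split_into_unit_jobs("u", "v", 4, 0, 6)
original_density = Fraction(6, 4)
for (_, _, piece_size, _, piece_weight) in pieces:
    # Each piece has density (w/p) / 1 = w/p, the density of the original job.
    assert piece_weight / piece_size == original_density
```

The weights of the pieces sum to the original weight, so processing one unit of the original job and completing one piece reduce the fractional objective by the same amount.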
Lemmas IV.7 and IV.8 imply that if highest-density-first is -speed -competitive for unit-size jobs with respect to weighted fractional flow time, then for any , highest-density-first is -speed -competitive for general size jobs with respect to weighted integral flow time. But for unit-size jobs, the fractional flow time is equal to the integral flow time. Thus we have proved Lemma IV.6.
As in the simple setting of Section IV-A, we perform a dual fitting argument. Lemma IV.6 ensures that it is sufficient for us to analyze highest-density-first on instances where all jobs have unit size. Notice that in this case, highest-density-first simply prioritizes jobs in order of largest weight. Consider the following linear program, where is a variable denoting how much is processed at time (in a true solution this will be either or ).
As before we do not solve this LP, but rather use it only for analysis purposes. Note that the objective is the integral flow time. The first set of constraints ensures each job is fully scheduled. The second set of constraints ensures that the degree constraints are satisfied.
If there is a schedule with weighted sum of flow times at most , then there is a solution to the LP of cost at most .
Consider a schedule with weighted sum of flow times . Since this is a feasible schedule, each satisfies the degree constraint at each vertex. We create an LP solution as follows: if job is scheduled at time then we set , otherwise we set . Since the original schedule is feasible, every job is scheduled at some point and thus the first LP constraint is satisfied. Similarly, since each satisfies the degree constraints, the second set of LP constraints are satisfied. Thus this is a feasible LP solution. The objective is the weighted flow time of the resulting schedule. ∎
The dual of this LP is the following.
We will analyze our algorithm (highest-density-first, equivalent to highest-weight-first) by finding a feasible dual solution and relating this to the cost of the algorithm using resource augmentation. Let denote the total flow time of the algorithm when run with speedup .
Let’s now define our dual solution. But first we need a little bit of notation: for every node and time , let denote the total weight of jobs adjacent to that have been released but are unsatisfied at time . Let (resp. ) be the jobs alive at time that share the end point (resp. ) with . Then for every , we set the variables as follows.
It is not hard to see that, as in the simple setting of Section IV-A, these dual variables correspond to an upper bound on the increase in the algorithm’s cost due to the existence of job . Indeed, consider the first two terms depending on jobs . The first term states that job will wait on all jobs in that have higher weight, and pay for each such time step. The second term states that all lower weight jobs than will now need to wait on job before they are completed. The last two terms are the same, but for jobs in . Note that this expression is more complicated than in the simple setting since now we order by weights rather than by release time, so earlier jobs can be “pushed back” due to job (unlike in the simple case).
Similarly, we will set , which is essentially the weighted version of the same dual variable in Section IV-A.
We first show that this is a feasible dual solution. Clearly all variables are nonnegative, so we just need to show the following lemma.
for all and .
Consider any time . We have the following.
We now bound the first and third terms by , that is, half of the right-hand side of the constraint. The second and fourth terms behave similarly. Together, this will show the constraint is satisfied.
We have the following.
Consider the last term. Let denote the set of jobs in that are completed (processed) by time . Then (2) is at most the following, with equality if no jobs arrive during .
Now we can use some of the jobs which appear in the last term to cancel out the same jobs in the second term, and then use the relationship in the summations between the weights of jobs and to rewrite everything in terms of . This gives that (IV-B3) is
Now we combine the first term with the second to get that (3) is equal to
We know that because the algorithm can process at most jobs at each time step adjacent to and are jobs processed at during . Thus (IV-B3) is at most . Putting this all together, we have that
This bounds the first and third term of equation . The second and fourth have the exact same analysis bounding them by . Putting them together implies that is bounded by , proving the lemma. ∎
We will now prove two lemmas which will allow us to bound the cost of this dual solution. Let denote the total weighted flow time of the online algorithm.
Recall that denotes all jobs that have not yet been processed by time which have as one endpoint (including job itself), and similarly for . Then we have that
The second equality rearranges the terms as follows. Fix job . The first term counts jobs that require node , have higher weight than , and are released and unsatisfied when arrives; this term comes from . The second term counts jobs with higher weight than , that require node , and arrive during when is released and unsatisfied; this term comes from each such . The last two terms are analogous for node .
The final inequality is because the ’th term in the sum of (5) is an upper bound on the weighted flow time of job . This is because the only jobs which can prevent job from finishing are higher-weight jobs that show up earlier than at (the first term), higher-weight jobs which show up at after before job has finished (the second term), and similarly for jobs which show up at (the third and fourth terms). Then we multiply these jobs by the rate at which they are processed ( or ). ∎
Next we bound the contribution of the variables.
This is essentially a straightforward calculation using the definition of weighted flow time and the fact that each job has two endpoints. Let be the completion time of in highest-density-first’s schedule. We have the following.
We can now prove our main theorem. In the following, let be the optimal solution (without speedup).
for any .
Finally we get our main theorem by combining the previous lemma with the reduction to the unit time instance in Lemma IV.6. Note that by setting to any appropriate constant (say, ), Theorem IV.14 gives an -competitive algorithm with -speedup.
Highest-density-first is -speed -competitive for the total flow time objective when jobs have arbitrary sizes and weights and the degree bounds are arbitrary for any .
IV-C Completion Times
We now claim that Theorem IV.14 implies there is a -competitive algorithm for the total (weighted) completion time objective function, even without any speedup. To see this, we argue that we can simulate speed-up for the total completion time objective by losing a factor in the competitive ratio. Given any online schedule using -speed, construct an online schedule using -speed as follows. Each job scheduled with speed at time in is scheduled during the interval in . This ensures a job completed at time in is completed at time in . Thus, each job pays an extra factor of at most in the completion time, so this extra factor goes directly into the competitive ratio.
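The stretching argument can be sketched as follows, under simplifying assumptions of our own: the speed `s` is an integer, a schedule is a list of rounds (each a list of job identifiers, one per link), and a link transfers `s` units per round in the fast schedule and 1 unit per round in the stretched one. Both function names are hypothetical.

```python
def stretch_schedule(fast_rounds, speed):
    """Replay round t of a speed-`speed` schedule in unit-speed rounds
    speed*t, ..., speed*t + speed - 1, keeping the same topology."""
    slow_rounds = []
    for links in fast_rounds:
        slow_rounds.extend(list(links) for _ in range(speed))
    return slow_rounds


def completion_times(rounds, sizes, units_per_link):
    """Completion time of each job: the end of the first round by which the
    job has received at least its full size."""
    received = {j: 0 for j in sizes}
    done = {}
    for t, links in enumerate(rounds):
        for j in links:
            received[j] += units_per_link
            if j not in done and received[j] >= sizes[j]:
                done[j] = t + 1
    return done


speed = 2
sizes = {"a": 2, "b": 4}
fast_rounds = [["a"], ["b"], ["b"]]
fast_done = completion_times(fast_rounds, sizes, units_per_link=speed)
slow_done = completion_times(stretch_schedule(fast_rounds, speed), sizes, units_per_link=1)

# Every completion time grows by at most a factor of `speed`.
assert all(slow_done[j] <= speed * fast_done[j] for j in sizes)
```

Because round t of the fast schedule is replayed no earlier than round t of the stretched one, no job is processed before its release time, and the degree constraints carry over unchanged since each stretched round reuses a topology of the fast schedule.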
More formally, we prove the following (where we make no attempt to optimize the constant).
There is a -competitive algorithm for the total completion time objective when jobs have arbitrary sizes and weights and the degree bounds are arbitrary.
Let denote the cost of the optimal schedule with respect to weighted completion times, and let denote the completion time of job in this schedule. Note that the total weighted flow time of this schedule is .
Let denote the completion time of job when we run highest-density-first with -speed. Then Theorem IV.14 implies that . Now by stretching out time as described earlier, we get a new schedule where job completes at time at most . Putting this together, we get that