A flow shop problem to minimize the makespan (also known as Johnson’s problem) is probably the first machine scheduling problem described in the literature [J53]. It can be set as follows.
Flow shop problem. Sets of machines and of jobs are given; each machine has to process each job , and such an operation takes time units. Each job has to be processed by the machines in the same order: first by machine , then and so on. No machine can process two jobs simultaneously. The goal is to construct a feasible schedule of processing all the jobs with the minimum makespan (that is, with the minimum completion time of the last operation). According to the traditional three-field notation of scheduling problems (see [LaLeRiSh93]), Johnson’s problem with a fixed number of machines is denoted as .
Problem can be solved to optimality by the well-known Johnson’s algorithm, which is essentially a sorting of the jobs according to Johnson’s rule [J53]. On the other hand, problem is NP-hard in the strong sense [GaJoSe76].
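Johnson’s rule is easy to state in code. The following sketch (function names and job data are ours for illustration; the paper gives no implementation) sorts the jobs by Johnson’s rule for the classical, networkless two-machine flow shop and evaluates the makespan of the resulting permutation schedule:

```python
def johnson_order(jobs):
    """Johnson's rule for the two-machine flow shop.

    jobs: list of (a, b) pairs, where a and b are the processing
    times on the first and second machine, respectively.
    Returns the job indices in an optimal processing order: jobs
    with a <= b first, by ascending a; then jobs with a > b, by
    descending b.
    """
    first = sorted((j for j, (a, b) in enumerate(jobs) if a <= b),
                   key=lambda j: jobs[j][0])
    last = sorted((j for j, (a, b) in enumerate(jobs) if a > b),
                  key=lambda j: jobs[j][1], reverse=True)
    return first + last


def makespan(jobs, order):
    """Completion time of the last operation in the permutation schedule."""
    t1 = t2 = 0
    for j in order:
        a, b = jobs[j]
        t1 += a                # the first machine is never idle
        t2 = max(t2, t1) + b   # the second operation waits for the first
    return t2
```

For example, for three jobs with operation times (3, 2), (1, 4) and (5, 1), the rule yields the order 1, 0, 2 with makespan 10.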
In classical scheduling problems (including the flow shop), it is assumed that the location of each machine is fixed, and that either there is no pre-specified delay between the processing of two consecutive operations of a job, or such a delay depends on the distance between the corresponding machines. However, this assumption often diverges from real-life situations. Imagine a company engaged in the construction or maintenance of country houses, cottages or chalets. The company has several crews which specialize, for example, in preparing the site for construction, pouring the foundation, building the house, or landscaping the site. The facilities are located in a suburban area, and each crew must move from place to place to carry out its work. The sequence of jobs performed by the various crews is fixed: e.g., one cannot start building a house before the foundation is poured.
To take into account the situation described above, we consider a natural combination of with the well-known traveling salesman problem, a so-called routing flow shop problem introduced in [AvBe96]. In this model, jobs are located at the nodes of a transportation network , while machines have to travel over the edges of the network to visit each job and perform their operations in the flow shop environment. All machines start from the same location (the depot) and have to return to the depot after performing all the operations. The completion time of the last machine action (either traveling or processing an operation of some job at the depot) is considered to be the makespan of the schedule () and has to be minimized. (See Sect. 2 for the detailed formulation of the problem.)
We denote the -machine routing flow shop problem as or , when we want to specify a certain structure of the transportation network.
Routing-scheduling problems model many real-world applications. Examples where machines have to travel between jobs include situations where parts are too big or heavy to be moved between machines (e.g., engine casings of ships), and the scheduling of robots that perform daily maintenance operations on immovable machines located in different parts of a workshop [AvBe99]. Another interesting application is related to the routing and scheduling of museum visitors traveling as homogeneous groups [YuLinChou10]. That model is embedded in a prototype wireless context-aware museum tour guide system developed for the National Palace Museum of Taiwan, one of the top five museums in the world.
The routing flow shop problem is still understudied. Averbakh and Berman [AvBe96] considered with exactly one job at each node, under the following restriction: each machine has to follow some shortest route through the set of nodes of the network (not necessarily the same for both machines). We will refer to this as the AB-restriction. They proved that for the two-machine problem the AB-restriction affects the optimal makespan by a factor of at most , and that this bound is tight. They also showed that, under this restriction, there always exists a permutation optimal schedule, in which the machines process the jobs in the same order (a permutation property). Using this property, they presented algorithms solving the problem under the AB-restriction to optimality when is a tree or a cactus ( is the number of jobs). These algorithms therefore provide a -approximation for the problem without the AB-restriction on a tree or on a cactus with a single job at each node. Later [AvBe99], they extended these results to the case of an arbitrary graph and an arbitrary number of machines by presenting a -approximation algorithm for the problem. Yu and Zhang [YuZh11] improved on the latter result by presenting an -approximation algorithm based on a reduction of the original problem to the permutation flow shop problem.
A generalized routing flow shop problem with buffers and release dates of jobs was also considered in [JoMa14]. The authors present a heuristic based on solving the corresponding multiple TSP.
Yu et al. [YuZhWaFa11] investigated the problem with a single job at each node further. They obtained the following results:
The permutation property also holds for the problem without the AB-restriction.
The problem is ordinary NP-hard, even if is a tree (moreover, if is a spider of diameter 4 with the depot in the center).
There is a -approximation algorithm that solves the problem in time.
Finally, the possibility of designing a polynomial-time algorithm for the special case of our problem, when the transportation network is symmetric, was claimed in [CheKoSe18] (although, without any proof).
In the present paper, we investigate the generalization of problem to the case of asymmetric travel times and an arbitrary number of jobs at any node. Thus, we have to consider a directed network in which the travel times along an edge may differ in the opposite directions. (We will denote such a problem by .) We prove that the permutation property holds for this version of the problem as well. We also establish another important property: there exists an optimal permutation schedule (with the same job processing order on both machines) such that for each node , the subsequence of consisting of all jobs from node obeys Johnson’s rule. These two properties allow us to design a dynamic programming algorithm which solves the problem in time , where is the number of nodes in . Thereby, we establish the polynomial-time solvability of the asymmetric two-machine routing flow shop problem with a constant number of network nodes. This result stands in contrast to the complexity of the two-machine routing open shop problem, which is known to be ordinary NP-hard even if consists of only two nodes (including the depot) [AvBeCh06].
The structure of the paper is as follows. Section 2 contains a formal description of the problem under investigation, as well as some notation and definitions. Properties of an optimal schedule are established at the beginning of Section 3, which also contains a description of the exact algorithm for solving the problem. The analysis of the algorithm follows in Section 4. Section 5 concludes the paper with some open questions for further investigation.
2 Problem setting, definitions and notation
Throughout the paper, an expression of the form (where and are integers, and is an integer variable) means that takes any integral value from this interval; . In this paper, we consider the following problem.
Problem . We are given jobs that are to be processed by two dedicated machines denoted as and . For each , job consists of two operations that should be performed in the given order: first the operation on machine , and then on machine . Processing times of the operations are equal to and , respectively. All jobs are located at nodes of a transportation network; the machines move between those nodes along the arcs of that network. At the beginning of the process, both machines are located at a node called a depot, and they must return to that very node after completing all the jobs.
Without loss of generality (and for convenience of the description and analysis of the algorithm presented in Section 3), we will assume that a reduced network is given, in which: (1) only active nodes are retained, i.e., the nodes containing jobs (referred to as job nodes) and two node-depots: the start-depot and the finish-depot; (2) there are no jobs at either depot (otherwise, we split the original depot into three copies at pairwise distance zero; one of the copies is treated as a job node, while the other two are job-free); the start-depot and the finish-depot get indices 0 and , respectively, while the job nodes get indices ( is the number of job nodes); thus, starting from the start-depot, each machine travels among the job nodes, and only after completing all the jobs may it arrive at the finish-depot; (3) is a complete directed graph in which each arc is assigned a non-negative weight equal to the shortest distance between the nodes corresponding to and in the source network in the given direction; therefore, the arc weights satisfy the triangle inequalities; note that symmetry of the weights is not assumed, i.e., the weights of the forward and backward arcs may differ. The objective function is the time when machine arrives at the finish-depot in schedule , and this time is to be minimized.
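The shortest-distance arc weights in item (3) can be computed by the standard all-pairs shortest path (Floyd–Warshall) procedure, which works unchanged for asymmetric weights. A sketch (function name and interface are ours):

```python
def metric_closure(w):
    """All-pairs shortest paths (Floyd-Warshall) on a directed network.

    w: n x n matrix of non-negative arc weights with w[u][u] == 0;
    asymmetric weights are allowed (w[u][v] may differ from w[v][u]).
    Returns the matrix of shortest distances, which by construction
    satisfies the (directed) triangle inequalities.
    """
    n = len(w)
    d = [row[:] for row in w]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d
```

Note that the procedure preserves asymmetry: the shortest distance from u to v may still differ from the shortest distance from v to u.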
Other notation: , where denotes the number of jobs located at job node ; denotes the 1-norm.
Given an integer , we define a partial order on the set of -dimensional real-valued vectors, such that for any two vectors , the relation holds, if and only if . By , we will denote the set of indices of jobs located at node .
By a schedule we mean, as usual, the set of starting and completion times of all operations. Since, however, such a schedule model admits a continuum of admissible values of its parameters, it will be more convenient to switch to a discrete model in which any schedule is determined by a pair of permutations specifying the orders in which machines and process the jobs, respectively. Each pair uniquely defines both the routes of the machines through the nodes of network and an active schedule of job processing, which is defined as follows.
A schedule is called active, iff: (1) it is feasible for the given instance of problem ; (2) it meets the precedence constraints imposed by permutations ; (3) the starting time of no operation in this schedule can be decreased without violating the above mentioned requirements.
An active schedule is called a permutation one, if .
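For a permutation schedule, the active schedule and its makespan can be computed by a single left-to-right pass: each machine travels and processes as early as possible, with the second machine waiting, where necessary, for the first machine to finish the corresponding first operation. A sketch under our own naming assumptions (the paper defines no such interface):

```python
def permutation_makespan(jobs_at, dist, route):
    """Makespan of the active permutation schedule in which both
    machines follow the same route through the nodes and process
    the jobs of each node in the stored order.

    jobs_at[v]: list of (a, b) operation-time pairs at node v
                (empty for the two depots);
    dist[u][v]: travel time from node u to node v;
    route:      node sequence from the start-depot to the finish-depot.
    """
    tA = tB = 0
    for u, v in zip(route, route[1:]):
        tA += dist[u][v]            # both machines travel the arc (u, v)
        tB += dist[u][v]
        for a, b in jobs_at[v]:
            tA += a                 # first operation, machine A never idles
            tB = max(tB, tA) + b    # second operation waits for the first
    return tB                       # arrival of machine B at the finish-depot
```

This earliest-start evaluation is exactly what makes a pair of permutations a complete (discrete) description of a schedule.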
For each , we define a priority vector of job , where , if , and , otherwise. We next define a strict linear order on the set of jobs: for two jobs the relation holds, iff (i.e., vector is lexicographically less than ). Clearly, for any two jobs , one and only one of two relations holds: either or .
We will say that a permutation of jobs and the corresponding permutation schedule meet the Johnson local property, if for each node the jobs from are sequenced in permutation properly, which means: in the lexicographically increasing order of their priority vectors. (Johnson [J53] showed that in the case of the networkless two-machine flow shop problem, such a job order provides the optimality of the corresponding permutation schedule.)
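One standard encoding of such a priority vector (our choice for illustration; any encoding whose lexicographic ascending order reproduces Johnson’s rule would do) is the following:

```python
def priority(a, b):
    """A priority vector whose lexicographic (ascending) order
    reproduces Johnson's rule: jobs with a <= b first, by ascending a;
    then jobs with a > b, by descending b.  (Our encoding; the paper's
    exact definition is analogous.)"""
    return (0, a, -b) if a <= b else (1, -b, a)


def johnson_local_sort(jobs_at_node):
    """Sequence the jobs of a single node 'properly', i.e., in the
    lexicographically increasing order of their priority vectors."""
    return sorted(jobs_at_node, key=lambda ab: priority(*ab))
```

Applying `johnson_local_sort` to the job set of each node yields a permutation meeting the Johnson local property.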
3 Properties of the optimal schedule and an algorithm for the exact solution of problem
The algorithm described in this section is based on two important properties of the optimal schedule established in the following theorems.
For any instance of problem there exists an optimal schedule which is a permutation one.
For any instance of problem there exists a permutation schedule which meets the Johnson local property and provides the minimum makespan on the set of all permutation schedules.
The proofs of these theorems are omitted due to volume limitations; they can be found in the Appendix. The two theorems above imply the following
For any instance of problem there exists an optimal schedule which is a permutation one and meets the Johnson local property.
The algorithm for computing the exact solution of problem is based on dynamic programming and on the two properties of optimal solutions stated in Corollary 1 (which enable us to restrict the set of schedules under consideration to job sequences meeting these properties). So, from now on, we consider only permutation schedules which meet the Johnson local property.
Let us number the jobs at each node properly, i.e., in the ascending order of the relation (see Definition 1, p. 1). Then, due to Theorem 3.2, jobs at each node should be processed in the order . According to this order, the jobs at node will be numbered by two indices: .
In the schedule under construction, we will highlight the time moments when a machine completes a portion of jobs at node and is preparing to move to another node. Each such moment will be called an intermediate finish point of the machine or, in short, an if-point of the machine. It follows from Theorem 3.2 that at each if-point of machine the set of jobs already completed by the machine is a collection of initial segments of the sequences . This collection can be specified by a -dimensional integral vector (denoted ), where denotes the number of jobs performed by machine at node by time .
By Theorem 3.1, machine completely reproduces the route of machine through the network nodes (as well as the order in which that machine processes the jobs) and, at some (later) point in time , it also finds itself at its if-point with the same set of completed jobs, defined by vector . Thus, a natural correspondence is established between the if-points of machines and : they are combined into pairs of if-points at which the sets of jobs completed by the two machines coincide and are defined by the same vector . The pairs of if-points divide the whole process of performing the jobs by machines and into steps (), each step being defined by two parameters: the node index () and the number of jobs () performed in this step at node .
The tuple consisting of a value of vector and a value of a node index determines a configuration of a partial schedule of processing the subset of jobs , with the final job at node . The set of admissible configurations is defined as the set including all basic configurations (with values , such that ), as well as two special configurations: the initial one and the final one .
Algorithm for constructing the optimal schedule does two things: (1) it enumerates all possible configurations of partial schedules, and (2) for each of them, it accumulates the maximum possible set of pairwise incomparable solutions (characterized by pairwise incomparable pairs of if-points with respect to the relation ). In other words, given a configuration , we consider a ‘‘partial’’ bi-criteria problem of processing the jobs from , with the final job at node . The objective is to minimize the two-dimensional vector function , where are the completion times of the jobs from on machines and , respectively. We compute the complete set of representatives of Pareto-optimal solutions of this problem.
For each solution , we define the parameter . The set for each configuration will be stored as a list sorted in ascending order of the component . (Along this list, the values of and strictly decrease.) The first element of each list is a solution with the value . This is either a dummy solution (added to each list at the beginning of its formation) or a real solution with the value (if one is found).
In the course of the algorithm, configurations are enumerated (in order to create the lists ) in non-decreasing order of the norm of the vectors . The whole algorithm is divided into three stages: the initial, the main, and the final one. Configurations with are considered in the initial and final stages only.
In the initial stage, list for the initial configuration is created. It consists of the single solution .
In the main stage, for each , vectors are enumerated in lexicographical ascending order; for each given vector , those values of are enumerated only for which holds.
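The enumeration order just described can be sketched as follows (a simplified version; `K[j]`, our name, stands for the number of jobs at node `j`): vectors are processed in non-decreasing 1-norm and, within one norm value, in lexicographically ascending order.

```python
from itertools import product


def enumerate_vectors(K):
    """All integral vectors k with 0 <= k[j] <= K[j], ordered by
    non-decreasing 1-norm and, within equal norms, lexicographically --
    a simplified sketch of the order in which the main stage creates
    its lists.  K[j] is the number of jobs at node j."""
    return sorted(product(*(range(Kj + 1) for Kj in K)),
                  key=lambda k: (sum(k), k))
```

For two nodes holding 1 and 2 jobs, the order is (0,0), (0,1), (1,0), (0,2), (1,1), (1,2).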
In the final stage, for the final configuration , we find its optimal solution by comparing variants of solutions obtained from the optimal solutions of configurations . For each configuration , its optimal solution (with the minimum value of the component ) is located at the very end of list . Having added to the distance from node to the depot, we obtain the value of the objective function of our problem for the given variant of schedule . Having chosen (from variants) the variant with the minimum value of the objective function, we find the optimum.
To create list for a given configuration of the main stage, we enumerate those values of the configuration obtained at the completion of the previous step of the algorithm (we will call that configuration a pre-configuration, or ‘‘p-c’’ for short) for which , and the vectors and differ in exactly one (th) component, so that . If , then , which means that is the initial configuration. If, alternatively, , then . (Clearly, there is no need for a machine to come to node without doing any job there.)
We note that for each configuration of the main stage, each variant of its p-c can be uniquely defined by the pair , where , and is the number of jobs processed in this step at node . The pairs are enumerated so that the loop on is the outer one with respect to the loop on .
For each given value of , we construct an optimal schedule in problem for the jobs from , and then compute three characteristics of that schedule: and , which are the total workloads of machines and on the set of jobs , and also , where is the length of schedule .
After that, we start the loop on in which we will adjust the current list of solutions for configuration . (Before starting the loop on , the list consists of the single dummy solution .) At each , for the p-c , we enumerate its Pareto-optimal solutions in the ascending order of (and the descending order of ). Given a solution and schedule , we form a solution for configuration as follows.
Case means that the component no longer affects the parameters of the resulting solution, and so considering further solutions (with greater values of and smaller values of ) makes no sense, since it is accompanied by a monotone increase of both and (between which a constant difference equal to is established). Thus, for any given p-c , a solution of ‘‘type (b)’’ can be obtained at most once.
For each solution obtained, we immediately check whether it should be added to the current list , and if so, whether some solutions (majorized by the new solution ) should be removed from list .
To answer these questions, we find the solution in list with the maximum value of the component such that . Such a solution always exists (we call it the control element of list ). Since the component monotonically increases in the loop on , the search for the control element matching can start not from the beginning of list , but from the current control element. Before starting the loop on , we assign the first item of list to be the current control element.
If the inequality holds, the current step of the loop on ends without including the solution in list (we pass on to the next solution ). Otherwise, if , we look through list (starting from the control element ) and remove from the list all solutions majorized by the new solution (which is expressed by the relations ). Here, the condition is sufficient for removing the current control element, while the inequality is sufficient for removing subsequent elements. The scanning of list stops as soon as either the first non-majorized item distinct from the control element is found (for this item and all subsequent items, the relations hold), or the list has been scanned to the end. We then include solution in list and assign it to be the new control element, which completes the current step of the c-loop.
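The list update just described can be sketched as follows. This simplified version (ours) locates the control element by binary search instead of the incremental scan, and the two tuple components play the roles of the two completion times; the list is kept sorted by ascending first component, so the second component strictly decreases along it:

```python
import bisect


def insert_solution(front, sol):
    """Try to insert sol = (tA, tB) into a Pareto front kept as a list
    sorted by ascending tA (hence strictly descending tB), removing the
    solutions majorized by sol.  A simplified sketch of the list update;
    the paper's control-element technique avoids repeated binary search
    by resuming the scan from the previous control element."""
    tA, tB = sol
    # the 'control element' is the last element with first component <= tA
    i = bisect.bisect_right(front, (tA, float('inf')))
    if i > 0 and front[i - 1][1] <= tB:
        return front            # sol is majorized and is not included
    # remove the elements majorized by sol: tA_e >= tA and tB_e >= tB
    lo = bisect.bisect_left(front, (tA,))
    hi = lo
    while hi < len(front) and front[hi][1] >= tB:
        hi += 1
    return front[:lo] + [sol] + front[hi:]
```

The amortized argument in Section 4 rests on exactly this behavior: each inserted solution either extends the list or pays for the removals it triggers.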
4 The analysis of algorithm
Algorithm finds an optimal solution of problem in time .
It remains to show the validity of the bounds on the running time of the algorithm; to that end, it is sufficient to estimate the running time () of the main stage of the algorithm.
In the Main stage, for each basic configuration , the set of all its Pareto-optimal solutions is found. Since this set is formed from the solutions obtained in the previous steps of the algorithm for various pre-configurations of configuration , the obvious upper bound on the value of is the product of the number of configurations (), of the number of pre-configurations () for a given configuration, and of the bound () on the running time of any step of the loop on configurations and pre-configurations (called a c-loop).
In each step of the c-loop, the list of solutions of a given p-c is scanned. From each such solution, a solution for configuration is generated, which is then either included or not included in list . The solutions included in the list in this step of the c-loop will be called ‘‘new’’; the solutions included in before this step will be called ‘‘old’’.
While evaluating a new solution that is a candidate for inclusion in , we scan some ‘‘old’’ solutions of list , which is performed in two stages. In the first stage, we look through the elements of , starting from the current control element, in order to find the new control element immediately preceding the applicant. In the second stage (in the case of a positive decision on including the applicant in the list), we check the (new) control element and the subsequent elements of for removal from the list (if they are majorized by the applicant). We continue this process until we find either the first non-removable element or the end of the list. How many views of items of list are required in total in one step of the c-loop? We claim that no more than , where is the maximum possible size of list in any step of the algorithm over all possible configurations .
To prove this claim, we first note that none of the ‘‘new’’ elements included in list in this step of the c-loop will be deleted within this step, since all ‘‘new’’ solutions included in the list are incomparable with respect to the relation . This follows from the facts that: (1) all applicants formed by type (a) are incomparable; (2) if the last solution is formed by type (b), then it is either incomparable with the previous applicant or majorized by it (and therefore not included in the list). Thus, only ‘‘old’’ elements are deleted from the list, and the total number of such deletions in a c-loop step does not exceed .
In addition, an element of is viewed upon receiving the status of ‘‘control element’’ at most once during each c-loop step, so the total number of such views in one step does not exceed . There may also be ‘‘idle views’’ of elements considered for the status of ‘‘control element’’. Such an idle view may happen only once for each applicant, and so the number of such idle views during one step of the c-loop does not exceed .
Next, the total (over a step of the c-loop) number of views of elements of considered for removal from the list does not exceed either. Indeed, viewing an element of together with its removal obviously occurs at most once for each element (in total, at most times over the whole step). A possible ‘‘idle view’’ of an element of (without its deletion) happens at most once for each applicant, which amounts in total (over the current step of the c-loop) to at most . Thus, the total number of views of items of , as well as the total running time of a c-loop step (), does not exceed . Let us now estimate the number itself.
We know that for any given configuration the solutions from list are incomparable with respect to relation . Thus, the number of elements in list does not exceed the number of different values of the component . The value of the component is the sum of the workload of machine and the total duration of its movement. (There are no idle times of machine in the optimal schedule.) Since the workload of machine (for a fixed configuration ) is fixed, the number of different values of the component can be bounded above by the number of different values that the length of a machine route along the nodes of network can take. As we know, each passage of the machine along the arc is associated with the performance of at least one job located at node . Thus, any machine route contains arcs entering node , and the same number of arcs () leaving the node.
Let us define a configuration of a machine route as a matrix of size , where specifies the multiplicity of passage of an arc in the route; . Thus, for any , the equality holds:
Clearly, for any closed route the following equalities are also valid:
Hence, the number of different values of the route length of a machine does not exceed the number of configurations of a closed route. The latter does not exceed the number of different matrices with properties (1) and (2). Let us (roughly) estimate from above the number () of such matrices, ignoring property (2).
The number of variants of the th row of matrix does not exceed the number of partitions of the number into parts, i.e., is not greater than
Since the value of depends only on , we obtain an upper bound of the form
Denote . Then , and the number of configurations () can be bounded above by . Finally, the number of pre-configurations is bounded by . Taking into account the above bounds, the bound , and the boundedness of the parameter by a constant, we obtain the final bound on the running time of the algorithm:
Theorem 4.1 is proved. ∎
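The row count used in the proof above is the standard ‘‘stars and bars’’ identity: the number of non-negative integral rows with a fixed sum k distributed over n + 1 arc-columns equals C(k + n, n). A quick sanity check (our notation and function names):

```python
from itertools import product
from math import comb


def count_rows(k, n):
    """Number of non-negative integral rows (c_0, ..., c_n) with sum
    equal to k -- the 'stars and bars' count behind the bound above."""
    return comb(k + n, n)


def count_rows_brute(k, n):
    """Brute-force verification by enumeration (for small k and n)."""
    return sum(1 for c in product(range(k + 1), repeat=n + 1) if sum(c) == k)
```

For instance, 3 passages distributed over 3 columns can be arranged in C(5, 2) = 10 ways.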
We have considered the two-machine routing flow shop problem on an asymmetric network (). We have improved the result of Yu et al. [YuZhWaFa11] by showing that the property of the existence of an optimal permutation schedule also holds for a more general problem (the problem with an arbitrary asymmetric network). Next, we have presented a polynomial-time algorithm for the problem with a fixed number of nodes, which is the first positive result on the computational complexity of the general problem.
We now propose a few open questions for future investigation.
Question 1. What is the parameterized complexity of problem with respect to the parameter ?
Question 2. Are there any subcases of problem with unbounded (e.g., is a chain, or a cycle, or a tree of diameter 3, or a tree with a constant maximum degree, etc.) solvable in polynomial time?
Question 3. Are there any strongly NP-hard subcases of problem for which NP-hardness is not based on the underlying TSP? In other words, is it possible that for some graph structure the TSP on is easy, but problem is strongly NP-hard?
Proof of Theorem 3.1. For convenience, in the proof of this theorem only, we will assume that each job is located at a separate job node. This enables us to assume that each node is visited by each machine only once, and thus the route of each machine in this model is a Hamiltonian path in a directed network from node 0 to node , or, equivalently, a permutation of the indices from 0 to starting with and ending with index . (The set of such permutations will be denoted by .) Note that network may contain arcs of zero length.
Let and denote strict and non-strict precedence of node to node in a route (in particular, admits ). We will assume that job is located at the node with the same index . Given a permutation , and will denote the sets of jobs located in the initial segments of sequence including or excluding job , respectively.
In the course of the proof, we will construct dense schedules determined by three parameters: a time moment and permutations specifying the routes of machines and . The set of such schedules will be denoted as . Each schedule is constructed according to the following rules: machines and , starting from node 0 at moments 0 and , respectively, follow (without any idle times) the routes specified by the permutations and , spending all the time just for processing the jobs and moving between nodes.
It is clear that schedule may be infeasible for some values of the parameters . However, for any pair of permutations there exists a value for which the corresponding schedule is feasible and its length coincides with the length of the active schedule . It is also clear that is the minimum possible value of for which schedule is feasible. Such a value is uniquely defined for any given pair ; this function will be denoted by .
In fact, schedule is feasible, iff machine arrives at each node not earlier than machine completes its operation of job :
where denote the total length of operations of machines and over the jobs from set , is the length of path from the depot to node .
Given an instance of problem , let be such an optimal schedule in which machine follows the shortest route around network nodes (among all routes of machine in optimal schedules). Let be the routes of machines and in that schedule ; and . Then schedule is feasible and optimal.
Let us number the nodes of network (as well as the jobs) according to the order in which machine passes them: the nodes are numbered by indices from 0 to , and the jobs by indices from 1 to . Thus, . Let us define in sequence a subsequence of marked nodes by the recursion: , . In other words, we go along the route of machine and ‘‘mark’’ the nodes according to a simple rule: first, we mark node 0; next, we mark the first node met with a larger index, and so on, until we arrive at node (which we also mark). Then we have and . It can also be easily seen that is a subsequence of both sequences and .
Let denote the set of marked nodes. Other nodes will be called mobile ones. We denote by the sets of mobile nodes being passed by machines and between two consecutive marked nodes: and . (These sets will be referred to as segments of permutations and .) Clearly, .
Comparing permutations and , one can observe that each mobile node is located in in a segment with a smaller index than in . (For example, all elements of come there from .) Based on this property of the permutations , procedure Trans described below transforms the route of machine step by step, transferring exactly one mobile node to a new position in each step. In the course of this transformation, the current (variable) permutation specifying the route of machine will be denoted by . Since this transformation of the route of machine leaves the mutual order of the marked nodes intact, we can transfer the above definition of segments (of permutations and ) to permutation .
At the end of procedure Trans, the route of machine coincides with the route of machine (i.e., we will have ), and the corresponding dense schedule becomes a permutation one. After the description of procedure Trans is complete, we prove Lemma 1, which provides some important properties of the schedules obtained in the steps of the procedure.
The procedure is divided into stages, where in the th stage () we transfer all mobile nodes from the th segment of permutation to their ‘‘proper places’’, i.e., to the segments where they stand in permutation , in the ascending order of their numbers. (The first stage is, therefore, empty, since .) The th stage is divided into steps, where in each step the current mobile node standing at position of the current permutation is transferred. Clearly, each such transfer of a node to one of the preceding segments reduces by 1 the number of mobile nodes in the current (th) segment of permutation , and so, after a finite number of steps, the marked node appears at this position, which completes the stage.
We notice that in any step of stage , all nodes preceding in the marked node (inclusively) are sequenced in in the ascending order of their numbers.
A dense schedule obtained after each step of procedure Trans is feasible. Moreover, machine arrives at each marked node in schedule not earlier (and by a not shorter way) than in schedule .