1 Introduction
The job shop scheduling problem (JSP) is an important branch of production planning problems. The classical JSP consists of a set of independent jobs to be processed on multiple machines; each job contains a number of operations with a predetermined order, and each operation must be processed on a specific machine with a specified processing time. Solving the JSP means determining a schedule of jobs, i.e., sequencing the operations on the machines. The flexible job shop scheduling problem (FJSP) is an important extension of the classical JSP, motivated by the wide use of multi-purpose machines in real-world job shops. The FJSP extends the JSP by allowing each operation to be processed on a machine out of a set of alternatives, rather than on one specified machine. Therefore, solving the FJSP requires not only finding the best sequence of operations on each machine, but also assigning each operation to a machine out of its set of qualified machines. The JSP is well known to be strongly NP-hard [1]. Since the FJSP is an even more general version of the JSP, it is clearly also strongly NP-hard.
A typical objective of the FJSP is the makespan, defined as the maximum completion time of all jobs, in other words, the total length of the schedule. However, to achieve a practical schedule for the FJSP, several conflicting objectives should be considered. In this paper, evolutionary algorithms (EAs) are applied to a multi-objective flexible job shop scheduling problem (MO-FJSP) with three objectives, namely the makespan, the total workload and the critical workload. We propose and adopt multiple initialization approaches to enrich the initial population based on our chromosome representation; at the same time, diverse genetic operators are applied to guide the search towards offspring with a wide diversity. Moreover, we use an algorithm configurator to tune the parameter configuration, and two levels of local search are employed, leading to better solutions. Our proposed FJSP multi-objective evolutionary algorithm (FJSP-MOEA) can be combined with almost all MOEAs to solve the MO-FJSP; the experimental results show that FJSP-MOEA achieves state-of-the-art results with less computational effort when merged with NSGA-III [2].
The paper is organized as follows. The next section formulates the MO-FJSP, the problem we set out to solve. Section 3 gives the necessary background knowledge. Section 4 introduces the proposed algorithm and Section 5 reports the experimental results. Finally, Section 6 concludes the work and suggests directions for future work.
2 Problem Formulation
The MO-FJSP addressed in this paper is described as follows:

There are $n$ jobs $J_1, \dots, J_n$ and $m$ machines $M_1, \dots, M_m$; the full machine set is denoted by $M$.

Each job $J_i$ comprises $n_i$ operations; the $j$-th operation of job $J_i$ is represented by $O_{i,j}$, and the operation sequence of job $J_i$ runs from $O_{i,1}$ to $O_{i,n_i}$.

For each operation $O_{i,j}$, there is a set of machines capable of performing it, represented by $M_{i,j} \subseteq M$.

The processing time of operation $O_{i,j}$ on machine $M_k \in M_{i,j}$ is predefined and denoted by $p_{i,j,k}$.
At the same time, the following assumptions are made:

All machines are available at time $0$ and are assumed to be continuously available.

All jobs are released at time $0$ and are independent of each other.

Setup times of machines and transportation times between operations are negligible.

Environmental changes (such as machine breakdowns) are neglected.

A machine can only work on one operation at a time.

There are no precedence constraints among the operations of different jobs, and the order of operations for each job cannot be modified.

An operation, once started, must run to completion.

No operation for a job can be started until the previous operation for that job is completed.
The makespan, total workload and critical workload, which are commonly considered in the literature on the FJSP (e.g., [3], [4]), are used as the three objectives of our algorithm and are all minimized. Minimizing the makespan facilitates a rapid response to market demand. The total workload represents the total working time of all machines and the critical workload is the maximum workload among all machines. Minimizing the total workload reduces the use of the machines; minimizing the critical workload balances the workload between the machines. Let $C_i$ denote the completion time of job $J_i$, and $W_k$ the sum of the processing times of all operations that are processed on machine $M_k$ (the workload of $M_k$). The three objectives can be defined as follows:
(1)  $f_1 = C_{\max} = \max_{1 \le i \le n} C_i$
(2)  $f_2 = W_T = \sum_{k=1}^{m} W_k$
(3)  $f_3 = W_{\max} = \max_{1 \le k \le m} W_k$
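As a minimal illustration of the three objectives above, the sketch below (a hypothetical helper, not part of the proposed algorithm) computes them from a finished schedule represented as (job, machine, start, duration) tuples:

```python
# Hypothetical sketch: compute makespan, total workload and critical
# workload from a completed schedule. A schedule is a list of
# (job, machine, start, duration) tuples.
def objectives(schedule):
    completion = {}   # job -> completion time C_i
    workload = {}     # machine -> workload W_k
    for job, machine, start, dur in schedule:
        completion[job] = max(completion.get(job, 0), start + dur)
        workload[machine] = workload.get(machine, 0) + dur
    makespan = max(completion.values())           # C_max = max_i C_i
    total_workload = sum(workload.values())       # W_T = sum_k W_k
    critical_workload = max(workload.values())    # W_max = max_k W_k
    return makespan, total_workload, critical_workload
```

For instance, for a two-job schedule the makespan is the latest job completion time and the critical workload is the busiest machine's total processing time.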
An example of the MO-FJSP is shown in Table 1 as an illustration, where rows correspond to operations and columns correspond to machines. In this example, there are three machines: $M_1$, $M_2$ and $M_3$. Each entry of the table denotes the processing time of the operation on the corresponding machine, and the tag “—” means that a machine cannot execute the corresponding operation.
Job  Operation  $M_1$  $M_2$  $M_3$
$J_1$  $O_{1,1}$  3  —  2
  $O_{1,2}$  5  7  6
  $O_{1,3}$  —  —  2
$J_2$  $O_{2,1}$  2  4  3
  $O_{2,2}$  2  —  1
$J_3$  $O_{3,1}$  4  2  2
  $O_{3,2}$  3  5  —
3 Related Work
3.1 Algorithms for the MO-FJSP
The FJSP has been investigated extensively over the last three decades. According to [5], EAs are the most popular non-hybrid technique for solving the FJSP. Among all EAs for the FJSP, some were developed for the more challenging variant formulated in Section 2: the MO-FJSP. The algorithms of [6], [3] and [4] are very successful MO-FJSP algorithms that have obtained high-quality solutions. In [6], a multi-objective genetic algorithm (MOGA) based on immune and entropy principles was proposed; its fitness was determined by the Pareto dominance relation, and diversity was maintained by the immune and entropy principles. In [3], a simple EA (SEA) was proposed, which used domain heuristics to generate the initial population and balanced exploration and exploitation by refining duplicate individuals with mutation operators. A memetic algorithm (MA) was proposed in [4]; it incorporated a local search into NSGA-II [7]. A hierarchical strategy was adopted in the local search to handle the three objectives: makespan, total workload and maximum workload. In Section 5, these algorithms are compared with our algorithm on the MO-FJSP.

3.2 Parameter Tuning
EAs involve multiple parameters, such as the crossover probability, mutation probability, computational budget, and so on. The preset values of these parameters affect the performance of the algorithm in different situations. The parameters are usually set to values that are assumed to be good. For example, the mutation probability is normally kept very low, as a high value is expected to delay convergence unnecessarily. A better way to identify a suitable probability is a sensitivity analysis: carrying out multiple runs of the algorithm with different mutation probabilities and comparing the outcomes. Although there are self-tuning techniques that adjust these parameters on the fly, the hyperparameters of an EA can also be optimized with techniques from machine learning.
The optimization of hyperparameters and neural network architectures is an important topic in machine learning due to the large number of design choices for a network architecture and its parameters. Recently, algorithms have been developed to accomplish this automatically, since it is intractable to do it by hand. MIP-EGO [8] is one of these configurators: it can automatically configure convolutional neural network architectures, and the resulting optimized networks have proven competitive with state-of-the-art manually designed ones on some popular classification tasks. In particular, MIP-EGO allows multiple candidate points to be selected and evaluated in parallel, which can speed up the automatic tuning procedure. In this paper, we tune several parameters with MIP-EGO to find the best setting for them.
3.3 NSGA-III
NSGA-III is a decomposition-based MOEA. It is an extension of the well-known NSGA-II and eliminates drawbacks of NSGA-II such as the lack of uniform diversity among the set of non-dominated solutions. The basic framework of NSGA-III is similar to that of the original NSGA-II, but it replaces the crowding-distance operator with a clustering operator based on a set of reference points. A widely distributed set of reference points efficiently promotes population diversity during the search, and NSGA-III defines its reference points by Das and Dennis's method [9].
In each iteration $t$, an offspring population $Q_t$ of size $N$ is created from the parent population $P_t$ of size $N$ using the usual selection, crossover and mutation. Then a combined population $R_t = P_t \cup Q_t$ is formed and classified into different layers ($F_1$, $F_2$, and so on), where each layer consists of mutually non-dominated solutions. Thereafter, starting from the first layer, points are put into a new population $P_{t+1}$, layer by layer, until the first time the size of $P_{t+1}$ equals or exceeds $N$. Suppose the last layer included is the $l$-th layer; the members of layers $F_1$ to $F_{l-1}$ have already been chosen for $P_{t+1}$, and the next step is to choose the remaining points from $F_l$ to complete $P_{t+1}$. In general (when the size of $F_1 \cup \dots \cup F_l$ does not equal $N$), only some of the solutions from $F_l$ need to be selected for $P_{t+1}$. When selecting individuals from $F_l$, first, each member is associated with a reference point by searching for the shortest perpendicular distance from the member to the reference lines created by joining the ideal point with the reference points. Next, a niching strategy is employed to choose points from $F_l$ associated with the least-crowded reference points. The niche count of each reference point, defined as the number of members of $P_{t+1}$ associated with it, is computed. A member of $F_l$ associated with the reference point having the minimum niche count is included in $P_{t+1}$, the niche count of that reference point is increased by one, and the procedure is repeated to fill the remaining population slots of $P_{t+1}$.
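The association step above can be sketched as follows; the function names and the two-objective example are illustrative, and normalization of the objectives (which NSGA-III performs before association) is omitted for brevity:

```python
import math

def perpendicular_distance(point, ref):
    # Distance from `point` to the reference line through the origin
    # (taken here as the ideal point) and the reference point `ref`.
    norm = math.sqrt(sum(r * r for r in ref))
    proj = sum(p * r for p, r in zip(point, ref)) / norm
    sq = sum(p * p for p in point) - proj * proj
    return math.sqrt(max(sq, 0.0))

def associate(points, refs):
    # Map each point index to the reference point whose reference line
    # has the shortest perpendicular distance to the point.
    assoc = {}
    for i, p in enumerate(points):
        assoc[i] = min(range(len(refs)),
                       key=lambda j: perpendicular_distance(p, refs[j]))
    return assoc
```

Each reference point's niche count is then the number of associated members, and members tied to the least-crowded reference points are preferred.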
NSGA-III is a powerful method for handling problems with non-linear characteristics as well as many objectives. Therefore, we decided to enhance NSGA-III in our algorithm for the MO-FJSP.
4 Proposed Algorithm
The proposed algorithm, the Flexible Job Shop Problem Multi-objective Evolutionary Algorithm (FJSP-MOEA), can in principle be combined with any MOEA and enables it to solve the MO-FJSP, which standard MOEAs cannot solve on their own. The algorithm follows the flow of a typical EA and generates improved solutions using local search. Details of the following components are given in the next subsections.

Initialization: encode the individual and generate the initial population.

Genetic operators: generate offspring by crossover and mutation operators.

Local search: decode the individual and improve the solution with local search.
4.1 Initialization
4.1.1 Chromosome Encoding
The MO-FJSP is a combination of assigning each operation to a machine and ordering the operations on the machines. In the algorithm, each chromosome (individual) represents a solution in the search space and consists of two parts: the operation sequence vector and the machine assignment vector. Let $d$ denote the total number of operations of all jobs; the length of both vectors is equal to $d$. The operation sequence vector determines the sequence of the operations assigned to each machine: for any two operations processed by the same machine, the one located in front is processed earlier than the other. The machine assignment vector assigns the operations to machines; in other words, it determines which operation is processed by which machine, where the machine must be one capable of processing the operation. The format for representing an individual not only influences the implementation of the crossover and mutation operators; a proper representation can also avoid the production of infeasible schedules and reduce the computational time. In our algorithm, the chromosome representation proposed by Zhang et al. in [10] is adopted, and an example is given in Table 2.
Operation sequence  1  2  3  2  1  1  3
(operations)  $O_{1,1}$  $O_{2,1}$  $O_{3,1}$  $O_{2,2}$  $O_{1,2}$  $O_{1,3}$  $O_{3,2}$

Machine assignment  2  1  1  3  2  2  1
(fixed operation order)  $O_{1,1}$  $O_{1,2}$  $O_{1,3}$  $O_{2,1}$  $O_{2,2}$  $O_{3,1}$  $O_{3,2}$
(selected machines)  $M_3$  $M_1$  $M_3$  $M_3$  $M_3$  $M_2$  $M_1$
In Table 2, the first row shows the operation sequence vector, which consists only of job indexes. For each job, the first appearance of its index represents the first operation of that job, the second appearance of the same index represents the second operation, and so on. The number of occurrences of an index is equal to the number of operations of the corresponding job. The second row explains the first row by giving the actual operations. The third row is the machine assignment vector, which records the selected machines for all operations. The operation order of the machine assignment vector is fixed: from the first job to the last job, and from the first operation to the last operation of each job. The fourth row indicates this fixed operation order of the machine assignment vector and the fifth row shows the actual machines of the operations. Each integer in the machine assignment vector is the index of the machine in the set of alternative machines of that operation. In this example, $O_{1,3}$ is assigned to $M_3$ because $M_3$ is the first (and only) machine in the alternative machine set of $O_{1,3}$ (Table 1). The alternative machine set of $O_{3,1}$ is $\{M_1, M_2, M_3\}$; the second machine in this set is $M_2$, therefore $O_{3,1}$ is assigned to $M_2$.
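Under the representation above, decoding the two vectors into (operation, machine) pairs can be sketched as below; the data layout (dicts keyed by (job, operation) tuples) is an assumption of this sketch, not the paper's implementation:

```python
def decode(op_sequence, machine_assignment, machine_sets, n_ops):
    """Map the two chromosome vectors to (operation, machine) pairs.

    op_sequence        : job indexes, e.g. [1, 2, 3, 2, 1, 1, 3]
    machine_assignment : 1-based indexes into each operation's
                         alternative-machine set, in the fixed job order
    machine_sets       : {(job, op): [machine, ...]} alternative sets
    n_ops              : {job: number of operations of that job}
    """
    # Fixed order of the machine assignment vector: job by job, op by op.
    fixed = [(j, o) for j in sorted(n_ops) for o in range(1, n_ops[j] + 1)]
    chosen = {op: machine_sets[op][idx - 1]
              for op, idx in zip(fixed, machine_assignment)}
    # Walk the operation sequence: the k-th occurrence of a job index
    # denotes that job's k-th operation.
    seen, result = {}, []
    for job in op_sequence:
        seen[job] = seen.get(job, 0) + 1
        op = (job, seen[job])
        result.append((op, chosen[op]))
    return result
```

With data matching Tables 1 and 2, the first scheduled pair is $O_{1,1}$ on its second alternative machine.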
4.1.2 Initial population
Our algorithm starts by creating the initial population. The machine assignment and operation sequence vectors are generated separately for each individual. In the literature, a few approaches have been proposed for producing individuals, such as the global minimal workload in [11] and AssignmentRule1 and AssignmentRule2 in [12]. In our algorithm, several new methods are proposed, namely the Processing Time Roulette Wheel (PRW) and Workload Roulette Wheel (WRW) for initializing the machine assignment, and the Most Remaining Machine Operations (MRMO) and Most Remaining Machine Workload (MRMW) for initializing the operation sequence. These new approaches are used together with some commonly used dispatching rules to enrich the initial population. When generating a new individual, two initialization methods are randomly picked from the following two lists: one for the machine assignment vector and one for the operation sequence vector.
Initialization methods for machine assignment


Random assignment (Random): an operation is assigned to an eligible machine randomly.

Processing Time Roulette Wheel (PRW): for each operation, roulette wheel selection is adopted to select a machine from its machine set based on the processing times of the capable machines. A machine with a shorter processing time is more likely to be selected.

Workload Roulette Wheel (WRW): for each operation, roulette wheel selection is used to select a machine from its machine set based on the current workloads plus the processing times of the capable machines. A machine with a lower sum of workload and processing time is more likely to be selected.
We propose PRW and WRW to assign operations to machines with less processing time or accumulated workload, while maintaining the freedom to explore the entire search space.
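A sketch of PRW is given below. The paper only states that machines with shorter processing times are more likely to be selected; weighting each machine by the inverse of its processing time is this sketch's assumption, and `prw_select` is a hypothetical helper name:

```python
import random

def prw_select(machines, times, rng=random):
    """Processing Time Roulette Wheel (PRW) sketch: pick a machine with
    probability inversely proportional to its processing time, so
    faster machines are favoured but every eligible machine can win."""
    weights = [1.0 / times[m] for m in machines]
    total = sum(weights)
    r = rng.random() * total
    acc = 0.0
    for m, w in zip(machines, weights):
        acc += w
        if r <= acc:
            return m
    return machines[-1]   # guard against floating-point rounding
```

WRW is obtained by replacing `times[m]` with the machine's current workload plus the processing time.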
Initialization methods for operation sequence


Random permutation (Random): starting from a fixed sequence, i.e. all job indexes of $J_1$ (the number of copies equals the number of operations of $J_1$), followed by all job indexes of $J_2$, and so on, the array is randomly permuted to generate a random order.

Most Work Remaining (MWR): operations are placed one by one into the operation sequence vector. Before selecting an operation, the remaining processing times of all jobs are calculated, and the first optional operation of the job with the longest remaining processing time is placed into the chromosome.

Most Operations Remaining (MOR): operations are placed one by one into the operation sequence vector. Before selecting an operation, the number of succeeding operations of each job is counted, and the first optional operation of the job with the most remaining operations is placed into the chromosome.

Longest Processing Time (LPT) [13]: operations are placed one by one into the operation sequence vector; each time, the operation with the maximal processing time is selected without breaking the order of the jobs.

Most Remaining Machine Operations (MRMO): operations are placed into the operation sequence vector according to both the number of subsequent operations on the machines and the number of subsequent operations of the jobs. MRMO is a hierarchical method and takes the machine assignment into consideration. First, the machine with the most subsequent operations is selected. After that, the optional operations among the subsequent operations on that machine are determined based on the already placed operations. For example, if $O_{1,1}$, $O_{2,1}$ and $O_{3,1}$ are the already placed operations, the current optional operation can only be chosen from $O_{1,2}$, $O_{2,2}$ and $O_{3,2}$. Among these optional operations, those assigned to the selected machine are picked and the one belonging to the job with the most subsequent operations is placed into the chromosome. In this example, $O_{1,2}$ will be chosen if it is assigned to the selected machine, because there are two subsequent operations for $J_1$ and only one for $J_2$ and $J_3$. Note that it is possible that no operation is available on that machine; in that case, the machine with the second largest number of subsequent operations is selected, and so forth.

Most Remaining Machine Workload (MRMW): operations are placed into the operation sequence vector according to both the remaining processing times of the machines and the remaining processing times of the jobs. MRMW is a hierarchical method similar to MRMO. After finding the machine with the longest remaining processing time and the optional operations on that machine, the operation belonging to the job with the longest remaining processing time is placed into the chromosome. Again, if no operation is available on that machine, the machine with the second longest remaining processing time is selected, and so forth.
We propose MRMO and MRMW to give priority to both the machine and the job with the largest number of remaining operations (MRMO) or the longest remaining processing time (MRMW).
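As an illustration of these sequencing rules, the sketch below implements MWR under the simplifying assumption of a single processing time per operation (the paper's processing times are machine-dependent); `mwr_sequence` is a hypothetical helper name:

```python
def mwr_sequence(proc_times):
    """Most Work Remaining (MWR) sketch: repeatedly append the job index
    of the next operation of the job with the largest remaining
    processing time.

    proc_times: {job: [time of op 1, time of op 2, ...]}
    """
    remaining = {j: sum(ts) for j, ts in proc_times.items()}
    next_op = {j: 0 for j in proc_times}
    sequence = []
    while any(next_op[j] < len(proc_times[j]) for j in proc_times):
        # Among jobs with operations left, take the most work remaining.
        job = max((j for j in proc_times if next_op[j] < len(proc_times[j])),
                  key=lambda j: remaining[j])
        sequence.append(job)
        remaining[job] -= proc_times[job][next_op[job]]
        next_op[job] += 1
    return sequence
```

MOR is analogous with counts of remaining operations instead of remaining time, and MRMO/MRMW add the machine-level selection on top.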
4.2 Crossover
Crossover replaces some of the genes in one parent with the corresponding genes of the other (Glover and Kochenberger [14]). Since our chromosome representation has two parts, the crossover operators for the two parts are implemented separately as well. We propose two new crossover operators, the Precedence Preserving Two Points Crossover (PPTP) and the Uniform Preservative Crossover (UPX), and use them together with several commonly adopted crossover operators. When executing crossover in the proposed algorithm, one operator for the machine assignment and one for the operation sequence are randomly chosen from the following two lists to generate the offspring.
Crossover operators for machine assignment


No crossover

One point crossover: a cutting point is picked randomly and genes after the cutting point are swapped between two parents.

Two points crossover: two cutting points are picked randomly and genes between the two points are swapped between two parents.

Job-based crossover (JX): it generates two children from two parents by the following procedure:

A vector of the size of the number of jobs is generated, consisting of random values 0 and 1.

For each job corresponding to value 1, the assigned machines of its operations are preserved.

For each job corresponding to value 0, the machines of its operations are swapped between the two parents.


Multipoint preservative crossover (MPX) [15]: MPX generates two children from two parents by the following procedure:

A vector of the size of the number of operations is generated, consisting of random values 0 and 1.

For the operations corresponding to value 1, their machines (genes) are preserved.

For the operations corresponding to value 0, their machines (genes) are swapped between the two parents.

Crossover operators for operation sequence


No crossover

Precedence preserving one point crossover (PPOP) [17]: PPOP generates two children from two parents by the following procedure:

A cutting point is picked randomly, genes to the left are preserved and copied from parent1 to child1 and from parent2 to child2.

The remaining operations in parent1 are reallocated in the order they appear in parent2.

The remaining operations in parent2 are reallocated in the order they appear in parent1.
An example of PPOP is shown in Figure 1, where the cutting point is between the third and fourth operation. The red numbers in parent2 are the genes to the right of the cutting point in parent1; they are copied to child1 in the order they appear in parent2, appended after the genes to the left of the cutting point copied from parent1, and vice versa.


Precedence Preserving Two Points Crossover (PPTP): PPTP generates two children from two parents by the following procedure:

Two cutting points are picked randomly, genes except for those between the two points are preserved and copied from parent1 to child1 and from parent2 to child2.

Operations between the two cutting points in parent1 are reallocated in the order they appear in parent2.

Operations between the two cutting points in parent2 are reallocated in the order they appear in parent1.


Improved precedence operation crossover (IPOX) [16]: IPOX randomly divides the job set into two complementary, non-empty subsets. The operations of the jobs in one subset are preserved, while the operations of the jobs in the other subset are copied from the other parent.

Uniform Preservative Crossover (UPX): UPX generates two children from two parents by the following procedure:

A vector of the size of the number of operations is generated, consisting of random values 0 and 1.

For the positions corresponding to value 1, the genes are preserved and copied from parent1 to child1 and from parent2 to child2.

For the positions corresponding to value 0, the genes of parent1 are located in parent2 and copied to child1 in the order they appear in parent2, and vice versa.
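A sketch of the PPOP operator on the operation sequence vector (lists of job indexes) is given below; `ppop` is a hypothetical helper name, and PPTP and UPX can be built analogously:

```python
def ppop(parent1, parent2, cut):
    """Precedence Preserving One Point crossover (PPOP) sketch: each
    child keeps the genes before `cut` from one parent and reallocates
    the remaining operations in the order they appear in the other."""
    def make_child(keep, other):
        head = keep[:cut]
        # Count how many genes of each job the head already uses.
        used = {}
        for j in head:
            used[j] = used.get(j, 0) + 1
        tail = []
        for j in other:
            if used.get(j, 0) > 0:
                used[j] -= 1          # already placed in the head, skip
            else:
                tail.append(j)        # reallocate in `other`'s order
        return head + tail
    return make_child(parent1, parent2), make_child(parent2, parent1)
```

Because each child contains exactly the same multiset of job indexes as its parents, the precedence of operations within each job is preserved and the offspring are always feasible.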

4.3 Mutation
The mutation operator changes gene values at selected locations. By forcing the algorithm to search areas other than the current one, mutation maintains genetic diversity from one generation of a population to the next. In our algorithm, an insertion mutation and a swap mutation (with one or two swap points) are proposed and used.
Insertion Mutation Operator generates a new individual by the following procedure:

Two random positions $p$ and $q$ ($1 \le p, q \le d$, $p \neq q$) are selected.

For the operation sequence vector, the operation at position $q$ is inserted in front of the operation at position $p$.

For the machine assignment vector, a machine is randomly selected for each of the operations at positions $p$ and $q$. If the processing time on the newly selected machine is shorter than on the current machine, the current machine is replaced by the new machine. If the processing time on the new machine is longer than on the old machine, there is only a 20% probability that the new machine replaces the old one.
Swap Mutation Operator generates a new individual by the following procedure:

One random position $p$ ($1 \le p \le d$) is selected, or two random positions $p$ and $q$ ($1 \le p, q \le d$, $p \neq q$) are selected.

For the operation sequence vector, with only one swap point $p$, the operation at the swap point is swapped with its neighbour; with two swap points, the operations at positions $p$ and $q$ are swapped.

For the machine assignment vector, the machine at position $p$ (and $q$) is replaced with a new machine by the same rule used in the insertion mutation operator.
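The insertion mutation can be sketched as follows; `proc_time` and `alt_machines` are hypothetical helpers standing in for the instance data, and the 20% acceptance of a worse machine follows the rule stated above:

```python
import random

def insertion_mutation(op_seq, machines, proc_time, alt_machines, rng=random):
    """Insertion mutation sketch. `op_seq` and `machines` are the two
    chromosome vectors; proc_time(pos, machine) gives the processing
    time of the operation at position `pos` on `machine`;
    alt_machines[pos] lists that operation's eligible machines."""
    d = len(op_seq)
    p, q = rng.sample(range(d), 2)
    # Operation sequence: move the gene at q in front of position p.
    seq = op_seq[:]
    gene = seq.pop(q)
    seq.insert(p if p < q else p - 1, gene)
    # Machine assignment: try a random alternative for positions p and q.
    mach = machines[:]
    for pos in (p, q):
        new = rng.choice(alt_machines[pos])
        if (proc_time(pos, new) <= proc_time(pos, mach[pos])
                or rng.random() < 0.2):  # 20% chance to accept a worse machine
            mach[pos] = new
    return seq, mach
```

The swap mutation differs only in the sequence step: it exchanges the genes at the selected positions instead of reinserting one.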
4.4 Decoding and Local Search
Decoding a chromosome converts an individual into a feasible schedule so that the objective values, which represent the relative quality of a solution, can be calculated. In this process, the operations are taken one by one from the operation sequence vector and placed on the machines given by the machine assignment vector to form the schedule. When placing each operation on its machine, local search (in the sense of heuristic rules to improve the solution) is used to refine the individual in order to obtain an improved schedule. Two levels of local search are applied to allocate each operation to a time slot on its machine. Idle times may exist between operations on a machine due to the precedence constraints among the operations of each job, and the two levels of local search exploit these idle times to different degrees.
The first level of local search
Let $s_{i,j}$ be the starting time of operation $O_{i,j}$ and $c_{i,j}$ its completion time; an example of the first level of local search is shown in Figure 3. Because an operation $O_{i,j}$ must be processed after the completion of its job predecessor $O_{i,j-1}$, idle time intervals can appear on its machine between earlier operations. Without local search, the starting time of $O_{i,j}$ is the maximum of $c_{i,j-1}$ and the completion time of the last operation already scheduled on its machine; if this is later than the end of an earlier idle interval, there may be an opportunity to process $O_{i,j}$ earlier. When checking the idle times on the machine, an idle time interval is available for $O_{i,j}$ if its usable span is at least as long as the processing time of $O_{i,j}$.
Let $S^x_k$ be the starting time of the $x$-th idle time interval on machine $M_k$ and $E^x_k$ its completion time. $O_{i,j}$ can be transferred to the earliest possible idle time interval of its machine that satisfies the following equation:

(4)  $\max\{S^x_k,\; c_{i,j-1}\} + p_{i,j,k} \le E^x_k$
After using the idle time interval, the starting time of $O_{i,j}$ is $\max\{S^x_k, c_{i,j-1}\}$, and the idle intervals are updated based on the starting and completion times of $O_{i,j}$: (1) the idle time interval is removed; (2) the starting or completion time of the idle time interval is modified; or (3) the idle time interval is replaced by two new, shorter idle time intervals, as in the example of Figure 3.
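The feasibility check of Equation 4 and the choice of the earliest fitting idle interval can be sketched as follows (`earliest_idle_slot` is a hypothetical helper name):

```python
def earliest_idle_slot(idle_intervals, ready_time, duration):
    """First-level local-search sketch: return the start time of the
    earliest idle interval [S, E] on the assigned machine that can hold
    an operation which becomes ready at `ready_time` (the completion of
    its job predecessor) and needs `duration` time units.
    Equation 4: max(S, ready_time) + duration <= E."""
    for start, end in sorted(idle_intervals):
        begin = max(start, ready_time)
        if begin + duration <= end:
            return begin
    return None  # no idle interval fits; append at the machine's end
```

After a successful move, the used interval is removed, shrunk, or split into two shorter intervals, depending on where the operation lands inside it.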
After decoding a chromosome, the operation sequence vector of the chromosome is updated according to the new starting times of the operations, and the three objective values are calculated. The first level of local search only looks for an available idle time interval on the machine assigned to each operation. After generating the corresponding schedule with the first-level search, it is possible that some operations can still be moved to available idle time intervals to improve the fitness values. To achieve this, the chromosome updated by the first level of local search is decoded again with the second level of local search, and again operations are moved to available idle time intervals.
The second level of local search
The second level of local search checks not only the idle time intervals on the assigned machine, but also those on alternative machines. An example of using an idle time interval on another machine is shown in Figure 3. Let $s_{i,j}$ be the starting time and $c_{i,j}$ the completion time of $O_{i,j}$ on its initially assigned machine. If $O_{i,j}$ can also be performed by another machine, the idle time intervals on all alternative machines that can process $O_{i,j}$ are checked, and an idle interval on one of them may be chosen so that $O_{i,j}$ is reallocated to that machine. If the processing time of $O_{i,j}$ on the alternative machine is even shorter than on the initially assigned machine, the reallocation benefits at least the total workload.
In the second level of local search, all available idle time intervals of an operation are checked one by one until the first “really” available idle time interval is found; the operation is then moved to that interval. Any idle time interval on an alternative machine that satisfies Equation 4 is an available idle time interval, but it must meet at least one of the following conditions to become a “really” available one.

The processing time of the operation on the new machine is shorter than on the initially assigned machine if the available idle time interval is on a different machine;

The operation can be moved from the machine with the maximal makespan to another machine.

The operation can be moved from the machine with the maximal workload to another machine.
The first condition directly improves the total workload; the motive of the second condition is to decrease the makespan, and the third condition benefits the critical workload.
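These three conditions can be sketched as a single predicate; all argument names are illustrative, and the sketch assumes the move is to a different machine (moves within the same machine are handled by the first level):

```python
def really_available(op_machine, new_machine, proc_times,
                     makespan_machine, max_workload_machine):
    """Second-level local-search sketch: an idle interval satisfying
    Equation 4 on `new_machine` is only used if at least one of the
    three conditions holds. `proc_times` maps each eligible machine to
    the operation's processing time on it."""
    moves_away = new_machine != op_machine
    # 1) shorter processing time on the new machine
    shorter = moves_away and proc_times[new_machine] < proc_times[op_machine]
    # 2) leaves the machine that determines the makespan
    leaves_makespan = moves_away and op_machine == makespan_machine
    # 3) leaves the machine with the maximal workload
    leaves_critical = moves_away and op_machine == max_workload_machine
    return shorter or leaves_makespan or leaves_critical
```

The first condition improves the total workload, the second targets the makespan, and the third targets the critical workload.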
After reallocating the operations with the second level of local search, the corresponding schedule is obtained and the objective values are calculated. However, instead of updating the chromosome immediately, the new objective values are first compared with the old ones; the chromosome is updated only when at least one objective is better than its old value. This makes sure that the new schedule is not worse than the old schedule (the new solution is not dominated by the old solution). Another difference between the two levels is that the first level of local search is performed on every evaluation, while the second level is only performed with a 30% probability per chromosome. Although the two local searches could be applied repeatedly to improve a solution, they are employed only once per evaluation to avoid the algorithm getting stuck in a local optimum.
5 Experiments and results
The experiments are implemented on the MOEA Framework (version 2.12, available from http://www.moeaframework.org). The algorithms are tested on two sets of well-known FJSP benchmark instances: 4 Kacem instances (ka4x5, ka10x7, ka10x10, ka15x10) and 10 BRdata instances (Mk01–Mk10). Table 3 gives the scale of these instances. The first column is the name of each instance; the second column shows the size of the instance, where $n$ stands for the number of jobs and $m$ for the number of machines; the third column gives the number of operations; the fourth column lists the flexibility of each instance, i.e., the average number of alternative machines per operation.
Instance  n × m  #Opr  Flex.

ka4x5  4 5  12  5 
ka10x7  10 7  29  7 
ka10x10  10 10  30  10 
ka15x10  15 10  56  10 
Mk01  10 6  55  2 
Mk02  10 6  58  3.5 
Mk03  15 8  150  3 
Mk04  15 8  90  2 
Mk05  15 4  106  1.5 
Mk06  10 15  150  3 
Mk07  20 5  100  3 
Mk08  20 10  225  1.5 
Mk09  20 10  240  3 
Mk10  20 15  240  3 
All the experiments are performed with a population size of , and each run of the algorithm stops after a predefined number of evaluations, which is  for the Kacem instances and  for the BRdata instances. For each problem instance, the proposed algorithm is run independently  times. The resulting solution set of an instance is formed by merging all the non-dominated solutions from its runs.
The crossover probability is set to , and two crossover operators are randomly chosen each time (one for the operation sequence and one for the machine assignment). For the Kacem instances, the mutation probabilities are set to . For the BRdata instances, which include larger-scale and more complex problems, the MIP-EGO configurator [8] is adopted to tune both the insertion and swap mutation probabilities (one-point swap and two-point swap) to find the best parameter values for each problem. The hypervolume of the solution set is used in MIP-EGO as the objective value for tuning the three mutation probabilities. Although the true Pareto fronts (PF) of the test instances are unknown, [4] provides a reference set for the Kacem and BRdata instances, formed by gathering all non-dominated solutions found by the algorithms implemented in [4] as well as non-dominated solutions from other state-of-the-art MO-FJSP algorithms. We define the reference point for calculating the hypervolume based on the largest values in this reference set; to be specific, each objective value of the reference point is the largest value of the respective objective in the reference set. The origin is used as the ideal point. Other basic parameter settings of MIP-EGO are listed in Table 4. For each mutation probability, we only consider discretized values with one digit after the decimal point; therefore, the search space is an ordinal or integer space, which MIP-EGO handles in a uniform way.
Parameter                       Value
maximal number of evaluations   200
surrogate model                 random forest
optimizer for infill criterion  MIES
search space                    ordinal space
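The hypervolume used as MIPEGO's tuning objective can be estimated, for instance, by Monte Carlo sampling inside the box spanned by the ideal point (the origin) and the reference point. This is a hedged sketch of the indicator, not necessarily the exact computation used in the experiments:

```python
import random

def hypervolume_mc(front, ref_point, n_samples=100_000, seed=0):
    """Monte Carlo estimate of the hypervolume dominated by `front`
    (minimization) inside the box [origin, ref_point]."""
    rng = random.Random(seed)
    dim = len(ref_point)
    box_volume = 1.0
    for r in ref_point:
        box_volume *= r
    hits = 0
    for _ in range(n_samples):
        p = tuple(rng.uniform(0, r) for r in ref_point)
        # p is dominated by the front if some solution is <= p in every objective
        if any(all(s[i] <= p[i] for i in range(dim)) for s in front):
            hits += 1
    return box_volume * hits / n_samples
```

For example, a single solution at `(0.5, 0.5)` with reference point `(1.0, 1.0)` dominates a quarter of the unit box, so the estimate converges to 0.25. Exact algorithms are preferable for the small fronts that occur here, but the Monte Carlo view makes the role of the reference point explicit.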
Given the budget of evaluations, Table 5 shows the percentage of evaluations that achieved the largest hypervolume value (i.e., the best PF) in MIPEGO. For Mk05 and Mk08, all evaluations obtained the largest hypervolume value, meaning that every parameter setting of the mutation probabilities tried by MIPEGO achieves the best PF for these two problems; as Table 3 shows, both problems have a low flexibility value. On the contrary, Mk06, Mk09 and Mk10 have a large number of operations and high flexibility. They appear to be difficult to solve, since only one parameter setting of the mutation probabilities achieved the best value. This also suggests that it is highly likely that better solution sets can be found with a higher budget.
Mk01  Mk02  Mk03  Mk04  Mk05  Mk06  Mk07  Mk08  Mk09  Mk10 

With the best parameter setting of the mutation probabilities for the BRdata instances, we compared our experimental results with the reference set in [4]. Our algorithm achieves the same Pareto-optimal solutions as the reference set for all BRdata instances except Mk06, Mk09 and Mk10. Moreover, for Mk06 and Mk10, our algorithm finds new non-dominated solutions. Table 6 lists the new non-dominated solutions obtained by our algorithm; each row of an instance is a solution with three objectives: makespan, total workload, and critical workload.
Mk06            Mk10
61  427  53     218  1973  195
63  428  52     218  1991  194
63  435  51     219  1965  195
65  453  49     220  1984  191
66  451  49     225  1979  194
66  457  48     226  1954  196
                226  1974  194
                226  1979  192
                228  1973  194
                235  1938  199
                236  1978  193
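As a sanity check, one can verify programmatically that the new Mk06 solutions listed above are mutually non-dominated (a small sketch, assuming all three objectives are minimized):

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

# (makespan, total workload, critical workload) for Mk06, taken from Table 6
mk06 = [(61, 427, 53), (63, 428, 52), (63, 435, 51),
        (65, 453, 49), (66, 451, 49), (66, 457, 48)]

# no solution in the set dominates any other
assert not any(dominates(a, b) for a in mk06 for b in mk06 if a != b)
```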
Another comparison is between our algorithm (FJSPMOEA) and MOGA [6], SEA [3], and MA1 and MA2 [4]. In [4], there are several variants of the proposed algorithm with different local search strategies; we pick MA1 and MA2 as comparison algorithms because they perform as well as or better than the other variants on almost all problems. Table 7 displays the hypervolume value of the PF approximation of each algorithm, together with the new reference set formed by combining the solutions from the PFs of all algorithms. The highest hypervolume value among all algorithms on each problem is highlighted in bold. We observe that FJSPMOEA, MA1 and MA2 show the best and similar performance, while MOGA behaves best on three of the BRdata instances. The good performance of MOGA on these three problems is interesting: MOGA has an entropy-based mechanism to maintain decision-space diversity, which might be beneficial for solving these problem instances. When using the single best parameter setting, we also give the average hypervolume and standard deviation over 30 runs on each problem in Table 8; the standard deviation of each problem shows the stable behaviour across runs.

Problem  MOGA  SEA  MA1  MA2  FJSPMOEA  Ref
Mk01  0.00426  0.00508  0.00512  0.00512  0.00512  0.00512 
Mk02  0.01261  0.01206  0.01294  0.01294  0.01294  0.01294 
Mk03  0.02460  0.02165  0.02165  0.02165  0.02165  0.02809 
Mk04  0.06906  0.06820  0.06901  0.06901  0.06901  0.07274 
Mk05  0.00626  0.00635  0.00655  0.00655  0.00655  0.00655 
Mk06  0.05841  0.06173  0.06585  0.06692  0.06709  0.07065 
Mk07  0.02244  0.02132  0.02269  0.02269  0.02269  0.02288 
Mk08  0.00418  0.00356  0.00361  0.00361  0.00361  0.00428 
Mk09  0.01547  0.01755  0.01788  0.01789  0.01785  0.01789 
Mk10  0.01637  0.01778  0.02145  0.02196  0.02081  0.02249 
Problem     Mk01  Mk02  Mk03  Mk04  Mk05  Mk06  Mk07  Mk08  Mk09  Mk10
Average HV
Std
For the Kacem instances, with fixed mutation probabilities, the non-dominated solutions we obtain are identical to the PF in the reference set. MA1 and MA2 also achieve the best PF for all Kacem instances, but our algorithm uses far fewer computational resources: the proposed FJSPMOEA uses a much smaller population size and far fewer objective function evaluations than the MA algorithms. In terms of computational resources, the proposed FJSPMOEA can therefore be used on smaller computer systems, entailing broader applicability, and possibly also in real-time settings such as dynamic optimization.
6 Conclusions
A novel multi-objective evolutionary algorithm for the MOFJSP is proposed. It uses multiple initialization approaches to enrich the first generation, and various crossover operators to create better diversity among offspring. Moreover, the MIPEGO configurator is adopted to automatically determine proper mutation probabilities. In addition, a straightforward local search is employed at two levels to aid more accurate convergence to the PF. The proposed customization approach can in principle be combined with almost all MOEAs. In this paper, we incorporate it into one of the state-of-the-art MOEAs, namely NSGAIII, to solve the MOFJSP; the resulting algorithm finds all Pareto-optimal solutions known from the literature for most problems, and even new Pareto-optimal solutions for the large-scale instances.
In this paper, we show the ability of MIPEGO to find good mutation probabilities. However, there is more potential in the automated parameter configuration domain that can benefit EAs. For example, to understand the effects of the different initialization approaches and crossover operators, the initialization and crossover configuration could also be optimized. Furthermore, other parameters of the proposed algorithm, such as the population size and the evaluation budget, could be tuned automatically as well. So far, however, the efficiency of existing tuning frameworks is limited when it comes to larger numbers of parameters; finding more efficient implementations is therefore a good topic for future research. Finally, given the good performance of MOGA on some of the problems, it seems interesting for future research to integrate the entropy-based selection mechanism into MOEA schemes to achieve even better performance.
References
[1] Garey, M.R., Johnson, D.S. and Sethi, R., 1976. The complexity of flowshop and jobshop scheduling. Mathematics of Operations Research, 1(2), pp.117-129.
[2] Deb, K. and Jain, H., 2013. An evolutionary many-objective optimization algorithm using reference-point-based nondominated sorting approach, part I: solving problems with box constraints. IEEE Transactions on Evolutionary Computation, 18(4), pp.577-601.
[3] Chiang, T.C. and Lin, H.J., 2013. A simple and effective evolutionary algorithm for multiobjective flexible job shop scheduling. International Journal of Production Economics, 141(1), pp.87-98.
[4] Yuan, Y. and Xu, H., 2013. Multiobjective flexible job shop scheduling using memetic algorithms. IEEE Transactions on Automation Science and Engineering, 12(1), pp.336-353.
[5] Chaudhry, I.A. and Khan, A.A., 2016. A research survey: review of flexible job shop scheduling techniques. International Transactions in Operational Research, 23(3), pp.551-591.
[6] Wang, X., Gao, L., Zhang, C. and Shao, X., 2010. A multi-objective genetic algorithm based on immune and entropy principle for flexible job-shop scheduling problem. The International Journal of Advanced Manufacturing Technology, 51(5-8), pp.757-767.
[7] Deb, K., Pratap, A., Agarwal, S. and Meyarivan, T., 2002. A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation, 6(2), pp.182-197.
[8] van Stein, B., Wang, H. and Bäck, T., 2018. Automatic configuration of deep neural networks with EGO. arXiv preprint arXiv:1810.05526.
[9] Das, I. and Dennis, J.E., 1998. Normal-boundary intersection: A new method for generating the Pareto surface in nonlinear multicriteria optimization problems. SIAM Journal on Optimization, 8(3), pp.631-657.
[10] Zhang, G., Gao, L. and Shi, Y., 2011. An effective genetic algorithm for the flexible job-shop scheduling problem. Expert Systems with Applications, 38(4), pp.3563-3573.
[11] Kacem, I., Hammadi, S. and Borne, P., 2002. Approach by localization and multiobjective evolutionary optimization for flexible job-shop scheduling problems. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 32(1), pp.1-13.
[12] Pezzella, F., Morganti, G. and Ciaschetti, G., 2008. A genetic algorithm for the flexible job-shop scheduling problem. Computers & Operations Research, 35(10), pp.3202-3212.
[13] Xing, L.N., Chen, Y.W. and Yang, K.W., 2009. An efficient search method for multi-objective flexible job shop scheduling problems. Journal of Intelligent Manufacturing, 20(3), pp.283-293.
[14] Glover, F.W. and Kochenberger, G.A., eds., 2006. Handbook of Metaheuristics (Vol. 57). Springer Science & Business Media.
[15] Zhang, C.Y., Rao, Y.Q., Li, P.G. and Shao, X.Y., 2007. Bilevel genetic algorithm for the flexible job-shop scheduling problem. Chinese Journal of Mechanical Engineering, 43(4), pp.119-124.
[16] Zhang, C., Li, P., Rao, Y. and Li, S., 2005, March. A new hybrid GA/SA algorithm for the job shop scheduling problem. In European Conference on Evolutionary Computation in Combinatorial Optimization (pp. 246-259). Springer, Berlin, Heidelberg.
[17] Teekeng, W. and Thammano, A., 2012. Modified genetic algorithm for flexible job-shop scheduling problems. Procedia Computer Science, 12, pp.122-128.