I Introduction
In this paper, we study a trailer management process problem from a large Fast-Moving Consumer Goods (FMCG) company in Belgium. The company faces problems with managing its transportation trailer fleet among its different production sites. Complexities in the trailer management process include non-homogeneous stochastic demands that differ both during the day and during the week, different types of trailers used for shipments, the hybrid usage of external and owned trailers, and multiple dependent locations managing a shared resource pool with individual needs and preferences. As a result of these complexities, situations occur in which no trailers are available when needed, which negatively impacts the logistic processes. To improve the general trailer management process, a simulation model was developed that allows analysis of the potential impact of fleet sizing and other trailer management decisions on the overall process performance. However, finding an optimal configuration of the trailer management process is difficult, as an exhaustive search over the process parameters is infeasible due to the size of the solution space. In this paper, we propose a simulation optimisation (SO) based approach to optimise the configuration of the trailer management process.
Recent works on simulations with infinite, or finite but very large, parameter spaces have focused on the development of simulation optimisation algorithms that combine metaheuristics with fitness approximation models [1, 2, 3, 4, 5]. These fitness approximations, also called metamodels, are subsequently used to replace simulated fitness evaluations or to select which solutions should undergo simulation.
In this work, we propose an online simulation optimisation approach (see Figure 1), where a feed-forward neural network is built to approximate the trailer management simulation model and used as the metamodel to filter solutions. A genetic algorithm is developed to find improved solutions (i.e., configurations) over iterations. Compared to other simulation optimisation approaches (e.g. [1, 3]), our contributions are summarised as follows.

While simulation optimisation approaches have been shown to be effective in various studies, little research has been done on how the parameters of the different components of the optimisation model affect its effectiveness. We investigate this impact in this paper.

One main concern with the general hybrid simulation optimisation methodology is the lack of a convergence property and the relatively high chance that optimal solutions are rejected due to statistical noise in the input-output relations of the simulation model and limited metamodel accuracy [3]. In this paper, we consider an ensure probability that forces evaluation of some solutions even when their approximated fitness scores are high (i.e. poor), which leads to improved performance in our experimental results.

Few existing works have considered simulation optimisation techniques combined with a machine learning approximation model in supply chain applications using real-world data [6]. We evaluate the proposed methodology on a real-world trailer management process.
The rest of our paper is organised as follows. Section II describes the simulation model objective function. In Section III we present the proposed algorithm and show how the metaheuristic and neural network are combined in the simulation optimisation algorithm. Section IV evaluates the performance of the neural network in approximating simulation fitness values, and Section V evaluates the impact of the parameters of the simulation optimisation algorithm. Lastly, in Section VI we present the results of the simulation optimisation algorithm in our experiments using the trailer management simulation model.
II Simulation Model Objective Function
The objective function aims to minimise the trailer and parking related costs while ensuring trailer and parking availability levels. Trailer costs include the estimated trailer costs of external shipments, the rent cost of owned trailers and the costs of defects. Parking costs include the estimated cost of additional parking places.
Trailer availability and parking availability can be considered as constraints in the optimisation problem. High unavailability of either trailers or parking results in additional costs in the logistics process and delays other tasks. The simulation model measures the frequency of orders waiting for more than 12 hours to be linked to a trailer. As it can be assumed that linking trailers and orders takes no time, this effectively means that an order could not start processing due to the lack of available trailers.
Notation  Description
N_{v,l}  Number of vehicles of type v at location l
P_l  Number of parking places at location l
N_v  Number of vehicles of type v, i.e. N_v = \sum_l N_{v,l}
M_v  Number of missed calls of vehicle type v
F_v  Number of full-park occurrences for vehicle type v
E_v  Number of shipments on external vehicles of type v (dependent on x)
D  Number of defects
v  Vehicle type {1 = Trailers, 2 = Tankers}
l  Location {1 = Leuven, 2 = Jupille, 3 = Hoegaarden}
In the simulation model, constraint violations are modelled via a penalty function (soft constraints) that makes the optimisation problem unconstrained. A linear penalty function was implemented as a cost term in the original objective function [7]. This penalty function is described in Equation (1), where M_v denotes the number of missed calls (or delays) for vehicle type v, c_M is the estimated cost of a missed call, F_v is the number of occurrences of a full parking, and c_F is the cost of such an occurrence.

Pen = \sum_{v} (M_v c_M + F_v c_F)    (1)
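As an illustration of the penalty term, the following sketch computes it with the per-event costs from Table II (200 per delayed order and 200 per full-parking occurrence); the function and variable names are illustrative, not taken from the paper.

```python
# Hedged sketch of the penalty term of Equation (1); costs from Table II,
# vehicle types from Table I (1 = trailers, 2 = tankers).
C_MISSED = 200   # estimated cost of a missed call / delayed order
C_FULL = 200     # cost of a full-parking occurrence

def penalty(missed_calls, full_park):
    """missed_calls, full_park: dicts mapping vehicle type -> event count."""
    return sum(missed_calls[v] * C_MISSED + full_park[v] * C_FULL
               for v in missed_calls)

# e.g. 3 missed trailer calls and 1 full-park event for tankers:
penalty({1: 3, 2: 0}, {1: 0, 2: 1})  # 3*200 + 1*200 = 800
```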
Equation (2) shows the complete cost function C(x) for a vector of simulation input parameters x. The vector x consists of the number of owned vehicles of each type at each plant, the multiplication factor for external vehicles at each plant, and the parking capacity at each plant. Since these are the variables considered as input to the proposed model (the individuals of the genetic algorithm), we suppress the explicit dependence on x in the notation below to avoid clutter.

Notation  Description  Value
c_1  Cost of owned trailer  2050
c_2  Cost of owned tanker  1500
c_1^E  Cost of shipment on external trailer  40
c_2^E  Cost of shipment on external tanker  40
c_P  Cost of parking place  1500
c_D  Cost of defect  500
c_M  Cost of delayed order  200
c_F  Cost of parking full  200
Our objective is to minimise C(x). In this cost function, N_v is the total number of vehicle resources (trailers or tankers) of type v, and c_v is the cost of an owned vehicle of type v. E_v is the total number of shipments on external vehicles of type v, and c_v^E is the cost of a single shipment on an external vehicle of type v. P_l is the number of parking spots at site l, and c_P is the cost per parking spot. The total cost of defects is determined by the number of defects D and the estimated cost per defect c_D. Finally, the penalty cost Pen, as derived in Equation (1), is added to the cost function. The notation is summarised in Table I.

C(x) = \sum_{v} N_v c_v + \sum_{v} E_v c_v^E + \sum_{l} P_l c_P + D c_D + Pen    (2)
We note that the cost function incorporates both inputs and outputs of the simulation model. The number of vehicles and parking places are direct inputs in the simulation model while the number of defects, the number of shipments performed on external vehicles and the penalty costs are dependent on the outcomes of the simulation.
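Putting the cost terms together with the values from Table II, the full objective can be sketched as follows; the function signature and variable names are assumptions for illustration, not the paper's implementation.

```python
# Hedged sketch of the cost function of Equation (2), using Table II costs.
C_OWNED = {1: 2050, 2: 1500}   # cost of an owned trailer / tanker
C_EXTERNAL = {1: 40, 2: 40}    # cost per shipment on an external vehicle
C_PARKING = 1500               # cost per parking place
C_DEFECT = 500                 # cost per defect

def total_cost(n_owned, ext_shipments, parking, n_defects, penalty_cost):
    """n_owned, ext_shipments: dicts vehicle type -> count;
    parking: dict location -> parking places;
    penalty_cost: output of the penalty function of Equation (1)."""
    cost = sum(n_owned[v] * C_OWNED[v] for v in n_owned)
    cost += sum(ext_shipments[v] * C_EXTERNAL[v] for v in ext_shipments)
    cost += sum(parking.values()) * C_PARKING
    cost += n_defects * C_DEFECT
    return cost + penalty_cost

# e.g. 10 trailers + 5 tankers, 150 external shipments, 30 parking places,
# 2 defects, and a penalty of 800:
total_cost({1: 10, 2: 5}, {1: 100, 2: 50}, {1: 20, 2: 10}, 2, 800)  # 80800
```

Note that the owned-vehicle and parking terms depend only on the inputs, while the external shipments, defects and penalty come from the simulation outputs.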
The defect and trailer costs were derived from the average historical costs per year. Tanker costs, on the other hand, were approximated due to the lack of observed historical data; their estimated values were set slightly lower than trailer costs to reduce their priority in the optimisation process. Delay costs were estimated using expert knowledge, and full parking costs were set to the same level as delay costs to give them equal importance. The complete list of costs used in the objective function can be found in Table II.
III Simulation Optimisation Algorithm
In this section, our proposed approach is presented, detailing its three main components: (i) a simulation model; (ii) an approximation model; and (iii) a metaheuristic optimisation model. The procedure is outlined in Figure 1.
The simulation model determines the quality (or fitness) of a given input via the objective function described in Section II. We use the simulation model to create an initial training dataset that contains the fitness for different input values (i.e. different choices of x). This training dataset is used to train a Feed-Forward Neural Network (FNN) model to approximate the simulation model. Finally, we use a Genetic Algorithm (GA) to encode the simulation optimisation problem of choosing the best simulation input values. Our approach can be considered an online simulation optimisation algorithm that continuously uses the outputs of the simulation evaluations. The grey box in Figure 1 outlines the part of the algorithm that runs continuously until one of the stopping criteria is reached (see Section III-A). We now elaborate on components (ii) and (iii) of our approach.
III-A Genetic Algorithm
A GA is used as the metaheuristic to search for the best configuration of the input values x. Each individual in the population of the GA represents a configuration of the simulation model, where each gene is one of the decision variables (e.g. the number of vehicles of a given type at a location, or the number of parking places at a location).
Initialisation
The initial population is generated as follows. First, a set of simulations is performed on randomly chosen input values x. The inputs and outputs of these simulations are used to train a global approximation model over the solution space. Subsequently, the initial population of the genetic algorithm is created by selecting the simulated input values with the lowest fitness values.
Iteration
An iteration of the GA proceeds as follows. Let f* denote the best fitness value found so far based on the output of the simulation model, and let f(i) denote the actual simulated fitness of an individual i. During an iteration, the fitness of each individual is approximated using the FNN model, giving an approximation \hat{f}(i) for every individual in the population. These approximations are checked against the best simulated fitness found so far: when the approximation is smaller than or equal to f* plus a predefined threshold value T, i.e. \hat{f}(i) \le f* + T, the individual is simulated to obtain the real output f(i). Additionally, there is an ensure probability that an individual will be simulated regardless of its approximated fitness value. This probability is introduced to ensure better convergence of the algorithm, and it is further discussed in Section V-C. Whenever a solution is simulated, its inputs and outputs are stored so that they can be used in future retraining of the machine learning model. Subsequently, crossover (point-wise), elitism and mutation are applied in order to generate new individuals in the population.
Based on our experiments, we decided to stop the GA using a two-rule stopping criterion. The optimisation stops when either 10 generations of the GA have been performed or 300 simulations have been completed. These criteria bound the total time available for optimisation, as each generation may result in a number of simulation evaluations up to the total population size.
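The per-individual decision described above can be sketched as a small function; the parameter names (`threshold`, `ensure_prob`) are our labels for the threshold T and the ensure probability, not identifiers from the paper.

```python
import random

def should_simulate(approx_fitness, best_fitness, threshold, ensure_prob,
                    rng=random):
    """Decide whether an individual is sent to the (expensive) simulator:
    simulate when the metamodel predicts it may be within `threshold` of
    the best simulated fitness so far, and otherwise force a simulation
    with probability `ensure_prob` regardless of the approximation."""
    if approx_fitness <= best_fitness + threshold:
        return True
    return rng.random() < ensure_prob

# With threshold 50,000, a promising approximation is always simulated;
# a poor one only passes via the ensure probability.
should_simulate(1_240_000, 1_221_940, 50_000, 0.01)  # True (within threshold)
```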
III-B Neural Network Approximation Model
Similar to other works in simulation optimisation [1, 2, 3, 4], we select a Feed-Forward Neural Network (FNN) due to its capabilities as a universal function approximator [8]. The FNN approximation model uses the input values of the simulation model (an individual of the genetic algorithm) to predict the value of the cost function defined in Section II; for an individual i, it outputs the approximation \hat{f}(i).
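A hedged sketch of how such a metamodel could be fitted, using scikit-learn as a stand-in for the paper's implementation and toy data in place of the simulation outputs (the actual training and hyperparameter search are described in this section):

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Toy stand-in data: simulation input vectors x -> scalar fitness C(x).
rng = np.random.default_rng(0)
X = rng.uniform(0, 200, size=(200, 14))
y = X.sum(axis=1) + rng.normal(0, 5, size=200)

# Normalise inputs, then fit a ReLU FNN trained with Adam; grid-search the
# hidden-layer sizes and regularisation strength (subset of Table III).
pipe = Pipeline([("scale", StandardScaler()),
                 ("fnn", MLPRegressor(activation="relu", solver="adam",
                                      max_iter=1000, random_state=0))])
grid = {"fnn__hidden_layer_sizes": [(15,), (25,)],
        "fnn__alpha": [0.01, 0.0001]}
search = GridSearchCV(pipe, grid, cv=3)  # the paper uses 10-fold CV
search.fit(X, y)
approx = search.best_estimator_.predict  # plays the role of the metamodel
```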
During training of the FNN, we normalise the input parameters to allow for better convergence. We perform a grid search using 10-fold cross-validation to select the best performing FNN hyperparameters; the number and size of the hidden layers and the regularisation strength on the trainable weights are varied during the grid search. Each neural network uses the rectified linear unit (ReLU) as activation function and has its weights optimised by the Adam algorithm for stochastic gradient updates [9]. After the grid search, a final network is trained on the full training set.

III-C Simulation Optimisation Procedure
After defining the GA and the FNN models we can finally describe the Simulation Optimisation (SO) algorithm procedure:

1. Create initial random simulations to gather training data for the FNN model, as described in Section III-B.
2. Create an initial population consisting of the individuals with the lowest fitness among the random simulations.
3. Train an FNN model using the simulation outputs (Section III-B).
4. Apply genetic operators to create a new population.
5. Check for each new individual whether its approximated fitness is lower than the best score found so far plus a threshold value. If so, simulate the individual to get the simulation evaluation; otherwise, let the ensure probability decide whether or not the solution is simulated.
6. At the end of each generation, add the simulated individuals with their fitness to the training set.
7. Repeat steps 3–6 until one of the stopping criteria has been reached (see Section III-A).
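The steps above can be sketched end-to-end as follows. This is a simplified illustration, not the paper's implementation: the metamodel is replaced by a nearest-neighbour lookup instead of a retrained FNN, crossover is omitted, and the toy simulator, `sample` and `mutate` helpers are our own stand-ins.

```python
import random

def so_optimise(simulate, sample, mutate, init_random=100, pop_size=20,
                generations=10, sim_budget=300, threshold=50_000,
                ensure_prob=0.01, mut_prob=0.2, n_elites=2, seed=0):
    rng = random.Random(seed)
    # Step 1: random simulations provide the metamodel's training data.
    data = {}
    while len(data) < init_random:
        x = sample(rng)
        data[x] = simulate(x)

    # Stand-in metamodel: fitness of the nearest simulated point
    # (the paper retrains an FNN on `data` instead).
    def approx(x):
        return min(data.items(),
                   key=lambda kv: sum((a - b) ** 2
                                      for a, b in zip(kv[0], x)))[1]

    # Step 2: initial population = simulated inputs with the lowest fitness.
    pop = sorted(data, key=data.get)[:pop_size]
    sims = 0
    for _ in range(generations):                       # steps 3-7
        best = min(data.values())
        fitness = lambda x: data[x] if x in data else approx(x)
        elites = sorted(pop, key=fitness)[:n_elites]   # elitism
        children = [mutate(rng.choice(pop), rng, mut_prob)
                    for _ in range(pop_size - n_elites)]
        pop = elites + children
        for x in children:
            if x in data:
                continue
            # Step 5: metamodel filter plus ensure probability.
            if approx(x) <= best + threshold or rng.random() < ensure_prob:
                data[x] = simulate(x)                  # step 6: extend data
                sims += 1
                if sims >= sim_budget:
                    return min(data, key=data.get)
    return min(data, key=data.get)

# Toy usage: minimise a separable quadratic over 3 integer variables,
# scaled so that the default threshold is meaningful.
simulate = lambda x: 1000 * sum((xi - 5) ** 2 for xi in x)
sample = lambda rng: tuple(rng.randint(0, 10) for _ in range(3))
def mutate(x, rng, p):
    return tuple(xi + rng.choice((-1, 1)) if rng.random() < p else xi
                 for xi in x)
best = so_optimise(simulate, sample, mutate)
```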
IV Neural Network Approximation Model Analysis
The performance of the simulation optimisation method heavily depends on the performance of the approximation model. We use two criteria to evaluate the performance of the approximation model: (i) the accuracy of the approximations and (ii) the required amount of training data.
The accuracy is a necessary measure as inaccurate approximations may lead to inaccurate decisions of the GA. On the other hand, the required amount of training data is essential as this impacts the effectiveness of the proposed approach. That is, to generate more training data one would need to simulate more input values. In the case of the FNN requiring a large amount of training data to achieve reasonable accuracy, an approximation model would become more costly than running the original simulation model.
To test the FNN performance, we randomly select configurations for the input values x and run the simulation model to obtain the corresponding fitness values. Next, we create training datasets ranging in size from 50 to 2000 examples, in steps of 50. For each dataset, a validation set was separated, and the remaining examples were used in the grid search setup (with the parameters from Table III) using 10-fold cross-validation to select the best performing hyperparameters.
Parameter  Values
Regularisation strength  0.05, 0.01, 0.005, 0.0001, 0.00001
Hidden layers  (15), (25), (50), (50, 5), (50, 25, 15), (25, 15)
The approximation performance is quantified using the coefficient of determination (R²) and the Mean Absolute Error (MAE); summary statistics are reported in Table IV. We note that, on average, the approximation model achieves close approximations of the simulation model. Furthermore, when the training dataset is large enough, most approximation models reach a high R² and an MAE that is small relative to the range of the fitness scores. We argue that the MAE could form a basis for the threshold of the optimisation algorithm. Moreover, increasing the dataset size above 500 examples did not, in general, lead to a substantially higher goodness-of-fit.
           R²     MAE
Mean       0.934  64,405.1
Std. Dev.  0.034  14,717.0
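For reference, the two metrics of Table IV can be computed as follows (a standard-library sketch; the paper presumably uses an off-the-shelf implementation):

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

def mae(y_true, y_pred):
    """Mean absolute error between simulated and approximated fitness."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
```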
V Simulation Optimisation Parameter Analysis
In this section we describe the experiments undertaken to define the best performing parameters of the SO method.
V-A Experimental Setup
The impact of changing different parameters on the optimisation performance is tested using the same set of base parameters for each experiment. The base parameters are designed to exercise both the threshold and the ensure probability mechanisms. To ensure comparability, the same initial random training simulations are used for all experiments. Table V shows the base case scenario for the experiments performed.
Parameter  Value
Population size  100
Ensure probability  0.01
Threshold (T)  100,000
Mutation probability  0.2
Crossover probability  0.5
Number of elites  2
V-B Impact of population size
In general, a large population allows for more diversity in the gene pool. However, it also results in a higher number of simulation evaluations, and therefore in slower running times due to the significant time spent in the simulation model. This effect can potentially cause wasted evaluations on solutions of poor quality.
To check the impact of the population size, we performed experiments for population sizes of 100, 500 and 1,000 individuals. We start the SO procedure with the individuals with the best (lowest) fitness and proceed as described in Section III-C. Table VI presents the total number of simulation evaluations alongside the best-found fitness. In our experiments, the population size of 100 showed the best fitness performance at 1,221,940 and yielded the lowest number of simulation evaluations (256). It is important to consider that diversity does not always imply better population fitness. Instead, our algorithm favours smaller population sizes, as these lead to fewer simulation evaluations while still exploring sufficient potential solutions.
V-C Impact of threshold and ensure probability
In optimisation using genetic algorithms, exploration of the solution space can be achieved through mutation and crossover operations. However, for our proposed SO method, these operators alone do not necessarily ensure proper exploration. Evaluations are only performed on individuals that reach a certain approximated fitness level; what counts as a good enough approximated score depends on the best fitness found so far. This procedure biases the search towards solutions with low approximated fitness. Therefore, allowing the model to (re)visit every possible solution is crucial in ensuring a proper search of the solution space.
The threshold parameter is a mechanism to account for the difference between approximated and real simulation output. This parameter influences which solutions undergo simulation evaluation, and it is crucial in achieving convergence. Ideally, the threshold should reduce the required number of simulations while minimally impacting the convergence of the algorithm. As the threshold is used in combination with the best fitness score found thus far, we may have a scenario where solutions with high approximated fitness are never visited. To avoid only visiting solutions that are similar in fitness quality, an ensure probability was added to the SO method. During an iteration, mutation operators change the chromosomes of an individual, so there is a probability for each feasible solution to be visited. The ensure probability then forces the evaluation of an individual with a poor approximated fitness, which can introduce more population diversity and help escape local optima.
The threshold levels were determined based on the accuracy (MAE) of the FNN. Experiments were performed for thresholds of 0 (no threshold), 50,000 (slightly below the MAE), 100,000 (higher than the MAE) and 150,000 (much higher than the MAE). Table VI presents these thresholds alongside the best-found fitness score and the number of simulation evaluations. We observe that a lower threshold is beneficial in finding better solutions. The best performance is achieved at threshold 50,000, but this is only marginally better than the performance at thresholds 0 and 100,000; a more significant difference exists for the 150,000 scenario. Therefore, we conclude that a more restrictive threshold yields lower-fitness solutions in less time. Furthermore, using no threshold results in the lowest number of evaluations but also shows slightly worse performance. This suggests that using the threshold can indeed improve solution quality.
We compare the impact of the ensure probability by executing experiments using the configuration defined in Table V, varying only the ensure probability. We test probabilities of 0.0, 0.01 and 0.1. Table VI shows the results for each probability. It can be seen that a small ensure probability will, in fact, improve performance over a purely threshold-based approach. We also note that the actual performance decreases when the ensure probability is higher. This is due to too many unnecessary simulation evaluations, which heavily consume the limited simulation budget imposed by the stopping criteria (300 evaluations).
V-D Impact of mutation probability
Mutation probability impacts the optimisation model by allowing additional exploration of the solution space outside of the existing gene pool. In our experiments, three different probabilities were tested: 0.1 (low), 0.2 (medium) and 0.3 (high). Table VI shows the results for each probability. It can be seen that a mutation probability of 0.2 yields the lowest fitness value while performing fewer than the maximum number of simulation evaluations.
Parameter  Value  Best fitness  Simulations
Population size  100  1,221,940  256
  500  1,271,201  300
  1,000  1,250,732  300
Threshold (T)  0  1,225,749  140
  50,000  1,223,542  161
  100,000  1,224,132  263
  150,000  1,235,804  292
Ensure probability  0.0  1,232,647  145
  0.01  1,225,654  257
  0.1  1,232,839  300
Mutation probability  0.1  1,235,547  125
  0.2  1,222,327  262
  0.3  1,252,049  300
V-E Comparison against a single global approximation model
In [4], the proposed method uses only the machine learning approximations as fitness evaluations in the SO procedure. That is, the algorithm starts by running several simulations and calculating their simulated fitness. This creates a sample of inputoutput pairs that can subsequently be used for training. Unlike our proposed method, machine learning is not used to filter which solutions to reject. Instead, it completely replaces the original fitness evaluation method. The optimisation is subsequently done only by using the approximated fitness. This effectively eliminates the use of simulation evaluations while running SO.
We point out that this modification results in no simulation evaluations during the optimisation steps. Thus, it is unknown to the SO whether the population fitness is actually improving. This modification also requires a more accurate approximation model. Therefore, it is often desired that more training data is used during FNN training.
We compare our SO method to a modified version based on [4]. To achieve the same effect, we prevent any simulation evaluations and FNN retraining during a run. We note that this modification saves considerable running time, as simulation evaluations and FNN training are only performed once. In our tests, we train the approximation FNN using 1,000 and 2,000 initial training examples, following the same procedure as in Section III-C.
At the end of the modified SO run, only a limited number of simulation evaluations is performed. In our experiments, the best fitness values found with 1,000 and 2,000 initial training examples were comparable to those of the original SO method. However, the total running time of this modified SO is much longer, as generating the larger initial training set alone takes many hours of simulation. Even though the modified SO procedure only takes a couple of minutes in the GA step, the original SO method resulted in similar solution quality while requiring less total running time.
V-F Comparison of the optimisation model against random simulation evaluations
As a baseline, we compare the SO to randomly chosen simulation individuals over several runs with increasing numbers of evaluations. The best fitness found over all random experiments was worse than the solutions found by the SO, and the best random fitness score stopped improving after a limited number of evaluations. Compared to the proposed SO model, regardless of its configuration, it is possible to find multiple better performing individuals using fewer simulation evaluations. Therefore, we argue that the optimisation model offers a significant improvement over a random selection of parameters, both in solution quality and in running time.
VI Results
VI-A Fitness and running times
Based on the experiments performed in Section V, we report the best performing parameters for the proposed SO algorithm in Table VII. We note that the number of elites (i.e. the number of best performing individuals passed from one generation to the next) was not explicitly tested in our experiments. We argue that any small number of elites should improve GA convergence [10]; we therefore select the number of elites based on the desired running times. The training set for the initial FNN model contains 1,000 training examples.
Parameter  Value
Population size  100
Ensure probability  0.01
Threshold (T)  50,000
Mutation probability  0.2
Number of elites  2
Figure 2 depicts the predicted and real simulation values for the parameters proposed in Table VII. The best performing solution selected 110 trailers and 28 tankers in Leuven, and 135 trailers and two tankers in Jupille, each with corresponding multiplication factors for external trailers and tankers. Furthermore, trailers and nine tankers were selected for Hoegaarden, with a multiplication factor for external trailers, and parking spaces were set for each of the three locations. We note that the best solution suggested an increased amount of parking capacity at the different sites compared to the current situation. This effect is due to the relatively low cost of adding parking places in comparison with the high cost of parking conflicts.
The running time required to find this solution can be estimated from the number of simulation evaluations and the retraining of the FNN model. Initially, the training set contains 1,000 simulated examples. After initialisation, further simulation evaluations were performed in the optimisation cycle, and the FNN was retrained ten times. The large majority of the total running time is attributed to the generation of the initial training set, with the remainder split between the subsequent simulation evaluations and the FNN training.
VI-B Sensitivity Analysis
We perform a sensitivity analysis on the bestfound solution. We increment and decrement each variable in a solution by one and two units. This procedure yields 55 different solutions. We calculate the fitness values for each solution and report statistics of the results in Table VIII.
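Generating this neighbourhood can be sketched as follows (an illustrative helper, not the paper's code): shifting each variable by ±1 and ±2 yields four candidates per variable, which for the full solution vector gives the roughly 55 solutions reported, presumably after discarding any duplicate or infeasible candidates.

```python
def neighbours(solution, deltas=(-2, -1, 1, 2)):
    """All solutions obtained by shifting exactly one variable by +-1 or +-2."""
    return [solution[:i] + (v + d,) + solution[i + 1:]
            for i, v in enumerate(solution) for d in deltas]

# A 3-variable example yields 3 * 4 = 12 perturbed solutions; each is then
# re-simulated to obtain the fitness statistics of Table VIII.
len(neighbours((110, 28, 7)))  # 12
```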
Min        Max        Mean       Std. Dev.
1,216,892  1,225,190  1,220,241  1,547.2
We note in Table VIII that the minimum fitness is lower than the fitness of the solution found by the optimisation algorithm, which shows that the solution found by the SO is a local optimum. In this specific neighbourhood, adding one more tanker in Hoegaarden decreases the objective value. This suggests that a local search procedure could further improve the SO algorithm. Moreover, we point out that the minimum and maximum fitness differ only slightly from the fitness found by the SO method, and that the standard deviation is very low. In other words, the solution found by the simulation optimisation algorithm is not sensitive to small changes in the input parameters.
VII Conclusions
This paper studies how machine learning can be applied to optimise the input parameters of a simulation model for trailer fleet configuration at an FMCG company in Belgium. We train a feed-forward neural network to approximate the outputs of the simulator, which is then used to approximate the fitness values within the genetic algorithm, and we demonstrate the effectiveness of the proposed simulation optimisation method. In the optimised configuration, nearly no trailer unavailability problems persist, and parking capacity problems are solved as well. The solution found by the SO method reduces the number of owned tankers and increases the number of owned trailers. Furthermore, it increases the number of both external trailers and tankers. The parking capacity is increased at each of the locations to accommodate the extra resources deployed. The higher capacity significantly reduces the trailer availability problems, and the larger parking nearly completely solves the parking problem.
References
 [1] R. D. Hurrion, “A sequential method for the development of visual interactive metasimulation models using neural networks,” Journal of the Operational Research Society, vol. 51, no. 6, pp. 712–719, 2000.
 [2] M. Laguna and R. Marti, “Neural network prediction in a system for optimizing simulations,” IIE Transactions, vol. 34, no. 3, pp. 273–282, 2002.
 [3] J. April, F. Glover, J. P. Kelly, and M. Laguna, “Simulation-based optimization: practical introduction to simulation optimization,” in Proceedings of the 35th Conference on Winter Simulation: Driving Innovation. Winter Simulation Conference, 2003, pp. 71–78.
 [4] R. R. Barton and M. Meckesheimer, “Metamodel-based simulation optimization,” Handbooks in Operations Research and Management Science, vol. 13, pp. 535–574, 2006.
 [5] S. Amaran, N. V. Sahinidis, B. Sharda, and S. J. Bury, “Simulation optimization: a review of algorithms and applications,” Annals of Operations Research, vol. 240, no. 1, pp. 351–380, 2016.
 [6] W. Abo-Hamad and A. Arisha, “Simulation-optimisation methods in supply chain applications: a review,” Irish Journal of Management, vol. 30, no. 2, p. 95, 2011.
 [7] C. A. C. Coello, “Theoretical and numerical constraint-handling techniques used with evolutionary algorithms: a survey of the state of the art,” Computer Methods in Applied Mechanics and Engineering, vol. 191, no. 11, pp. 1245–1287, 2002.
 [8] K. Hornik, M. Stinchcombe, and H. White, “Multilayer feedforward networks are universal approximators,” Neural Networks, vol. 2, no. 5, pp. 359–366, 1989.
 [9] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980, 2014.
 [10] G. Rudolph, “Convergence analysis of canonical genetic algorithms,” IEEE transactions on neural networks, vol. 5, no. 1, pp. 96–101, 1994.