Many real-world optimization problems involve multiple objective functions that conflict with each other and change over time. Such problems are called Dynamic Multi-objective Optimization Problems (DMOPs). For example, the design of a job scheduling system involves a number of decision variables, such as procedures, components, and operation time, which determine objective functions such as energy consumption, production, and stability. These conflicting objective functions change with time. Hence, an efficient dynamic multi-objective optimization algorithm (DMOA) should rapidly rearrange scheduling schemes as the environment changes, and this ability is critical to robust scheduling systems.
In recent years, a variety of DMOAs have been proposed to solve DMOPs. These methods can be roughly grouped into three categories. The first category is based on maintaining diversity. Gong et al. proposed a general framework that decomposes decision variables into two subpopulations according to the interval similarity between each decision variable and interval parameters, and adopts a strategy based on change intensity to track the Pareto-optimal front (POF). Jiang et al. developed a framework based on domain adaptation and non-parametric estimation to maintain the exploration-exploitation balance of DMOPs from temporal and spatial views.
The second category is memory-based. Chen et al. implemented a dynamic two-archive strategy that simultaneously maintains two co-evolving populations, one concerned with convergence and the other with diversity. Branke proposed a memory scheme to enhance the evolutionary process, in which some excellent solutions are saved and later used to guide the search toward optimal solutions.
The third category is based on prediction. Muruganantham et al. presented a population prediction strategy based on the Kalman filter. The Kalman filter guides the search for new Pareto-optimal solutions by generating a large number of high-quality initial individuals, after which a decomposition-based differential evolution algorithm finds the optimum for the current environment. Rong et al. presented a prediction model that tracks the moving Pareto-optimal set (POS) by clustering the whole population into several subpopulations, where the number of clusters depends on the intensity of the environmental change. Zhou et al. proposed a population prediction method that predicts a whole population instead of isolated points: previous center points are used to predict the next center point, and previous manifolds are used to estimate the next manifold. The optimal population at this moment is then determined by a decomposition-based differential evolution algorithm. Hu et al. designed an approach based on an Incremental Support Vector Machine (ISVM) classifier: the ISVM is trained on past Pareto-optimal sets, and high-quality initial individuals are then filtered through the trained ISVM. Jiang et al. presented a framework based on transfer learning to predict an effective initial population for solving DMOPs, in which transfer component analysis (TCA) is used for domain adaptation.
Traditional machine learning approaches usually assume that samples are independent and identically distributed (IID). This assumption breaks down when dealing with DMOPs, since the solution distribution changes with the environment and therefore fails to satisfy the IID hypothesis. A DMOA based on transfer learning does exist; however, it leads to poor diversity when samples cluster in the high-dimensional latent space created by TCA.
In this paper, a regression transfer learning prediction based DMOA (RTLP-DMOA) is proposed. The algorithm aims to generate an excellent initial population to enhance the ability of existing multi-objective optimization algorithms to solve DMOPs. When the environment changes, a regression transfer learning prediction model, which can predict objective values in the new environment, is constructed from historical population information. Then, with the assistance of this regression prediction model, high-quality solutions with better predicted objective values are identified and selected as the initial population, which can significantly improve the performance of the evolutionary process.
The contributions of this work are as follows: 1) The proposed algorithm makes full use of historical information and predicts a high-quality initial population to improve the evolutionary performance of existing static multi-objective optimization algorithms (SMOAs) in solving DMOPs. 2) The proposed algorithm overcomes the difficulty that solution distributions fail to meet the IID hypothesis. Compared with other prediction methods, the RTLP-DMOA is promising.
The rest of the paper is organized as follows: In Section II, we describe the basic concepts of DMOPs and present the related transfer learning method used in the RTLP-DMOA. Section III gives the designed RTLP-DMOA in detail. In Section IV, experimental results and analysis are shown. Conclusions are drawn in Section V.
II Preliminary Studies
II-A Dynamic Multi-objective Optimization
The mathematical form of DMOPs is as follows:
$$\min_{x} F(x,t) = \left(f_1(x,t), f_2(x,t), \dots, f_M(x,t)\right)^T \quad \text{s.t. } x \in \Omega,$$
where $x = (x_1, x_2, \dots, x_n) \in \Omega$ is the $n$-dimensional decision vector, and $t$ is the environment variable. $F(x,t)$ is the $M$-dimensional objective vector. The goal of DMOAs is to find solutions at environment $t$ so that all objectives are as small as possible. Nevertheless, no single solution can minimize all conflicting objectives simultaneously. Hence, a trade-off relation called Pareto dominance is introduced to compare solutions. The set of optimal trade-off solutions is called the Pareto-optimal set (POS) in the decision space and the Pareto-optimal front (POF) in the objective space.
(Dynamic Decision Vector Domination) At environment $t$, a decision vector $x_1$ Pareto-dominates another vector $x_2$, denoted by $x_1 \succ_t x_2$, if and only if
$$\forall i \in \{1, \dots, M\}: f_i(x_1,t) \le f_i(x_2,t) \quad \text{and} \quad \exists j \in \{1, \dots, M\}: f_j(x_1,t) < f_j(x_2,t).$$
(Dynamic Pareto-Optimal Set, DPOS) If a decision vector $x^*$ at environment $t$ satisfies
$$\nexists\, x \in \Omega : x \succ_t x^*,$$
then all such $x^*$ are called dynamic Pareto-optimal solutions, and the set of dynamic Pareto-optimal solutions is called the dynamic POS (DPOS).
(Dynamic Pareto-Optimal Front, DPOF) The DPOF is the Pareto-optimal front of the DPOS for the DMOP at environment $t$.
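As an illustration of the dominance relation and the DPOS defined above, the following minimal Python sketch filters the non-dominated members of a finite candidate set at a given environment. The function names (`dominates`, `nondominated`) and the toy two-objective problem are hypothetical and serve only as an example; they are not part of the proposed method.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]


def dominates(f_a: Vector, f_b: Vector) -> bool:
    """Return True if objective vector f_a Pareto-dominates f_b (minimization)."""
    no_worse = all(a <= b for a, b in zip(f_a, f_b))
    strictly_better = any(a < b for a, b in zip(f_a, f_b))
    return no_worse and strictly_better


def nondominated(population: List[Vector],
                 objectives: Callable[[Vector, float], Vector],
                 t: float) -> List[Vector]:
    """Return the members of `population` that no other member dominates at time t."""
    values = [objectives(x, t) for x in population]
    return [
        x for x, fx in zip(population, values)
        if not any(dominates(fy, fx) for fy in values)
    ]


def toy_objectives(x: Vector, t: float) -> Vector:
    """Hypothetical two-objective problem whose optimal region shifts with t."""
    return (abs(x[0] - t), abs(x[0] - t - 1.0))


candidates = [(0.0,), (0.5,), (1.0,), (2.0,)]
front_t0 = nondominated(candidates, toy_objectives, t=0.0)
front_t1 = nondominated(candidates, toy_objectives, t=1.0)
print(front_t0)  # [(0.0,), (0.5,), (1.0,)]
print(front_t1)  # [(1.0,), (2.0,)]
```

The same candidate set yields different non-dominated subsets at the two environments, which is the situation a DMOA has to track.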
TrAdaBoost is a classification algorithm based on boosting. Its aim is to filter out samples in the source domain that are dissimilar to those in the target domain, which improves the classification accuracy on the target domain. The source data set is combined with the target data set to form a single training set. At each boosting step, TrAdaBoost increases the relative weights of target instances that are misclassified. When a source instance is misclassified, however, its weight is decreased. In this way, TrAdaBoost makes use of those source instances that are most similar to the target data while ignoring those that are dissimilar. Pardoe and Stone introduced TrAdaBoost-based algorithms for transfer regression tasks, including TrAdaBoost.R2.
TrAdaBoost.R2 is an ensemble method in which each weak regression hypothesis $h_k$ ($k = 1, \dots, K$) maps instances of the source domain data set and the target domain data set to real-valued predictions. A strong regression hypothesis is determined by combining these weak hypotheses. In each training round, TrAdaBoost.R2 increases the relative weights of instances from the target domain and decreases the weights of instances from the source domain. When the regression error of an instance under $h_k$ is large, the error has a substantial influence on the change of that instance's weight. In this way, TrAdaBoost.R2 reuses source instances that are most similar to the target data and ignores those that are dissimilar. In the next round, the modified weights are fed into the next regression hypothesis $h_{k+1}$: instances that are dissimilar to the target domain have less impact on the learning process, and instances with large weights help the learning algorithm train better regressors.
III Proposed Algorithm
The framework of RTLP-DMOA is illustrated in Algorithm 1. In brief, RTLP-DMOA randomly initializes a population of size $N$ and then executes a SMOA to optimize the population at environment $t$. If an environmental change is detected, the environment variable is updated as $t = t + 1$. Then, the last population $P_{t-1}$ is passed to the regression transfer procedure, in which a regression hypothesis $H_t$ that can predict the objective vectors of individuals in the new environment is determined from historical information. Next, in the initial population prediction procedure, $H_t$ is employed to predict objective vectors, and high-quality individuals are selected according to their predicted objective vectors. These individuals are regarded as an excellent initial population and are passed to the SMOA to accelerate the evolutionary process. The details of RTLP-DMOA are presented in the following subsections.
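The control flow described above can be sketched roughly as follows. This is only a skeleton under stated assumptions: the callables `optimize`, `changed`, `regression_transfer`, and `predict_initial_population` are hypothetical placeholders for the SMOA, the change detector, and the two procedures of RTLP-DMOA, and the random initialization in the unit hypercube is likewise an assumption.

```python
import random
from typing import Callable, List

Individual = List[float]
Population = List[Individual]
Hypothesis = Callable[[Individual], List[float]]


def rtlp_dmoa_skeleton(
    optimize: Callable[[Population, int], Population],
    changed: Callable[[int], bool],
    regression_transfer: Callable[[Population, int], Hypothesis],
    predict_initial_population: Callable[[Hypothesis, int], Population],
    pop_size: int,
    dim: int,
    generations: int,
    seed: int = 0,
) -> List[Population]:
    """Return the final population of every environment seen during the run."""
    rng = random.Random(seed)
    # Assumption: decision variables are initialized uniformly in [0, 1].
    population = [[rng.random() for _ in range(dim)] for _ in range(pop_size)]
    t = 0
    history: List[Population] = []
    for generation in range(generations):
        if changed(generation):
            history.append(population)      # last population of the old environment
            t += 1                          # environment variable is updated
            hypothesis = regression_transfer(population, t)
            population = predict_initial_population(hypothesis, t)
        population = optimize(population, t)  # one SMOA step at environment t
    history.append(population)
    return history


# Stub components (all hypothetical) that only exercise the control flow.
result = rtlp_dmoa_skeleton(
    optimize=lambda pop, t: pop,
    changed=lambda g: g > 0 and g % 5 == 0,
    regression_transfer=lambda pop, t: (lambda x: [sum(x)]),
    predict_initial_population=lambda h, t: [[float(t)] * 2 for _ in range(4)],
    pop_size=4, dim=2, generations=12,
)
print(len(result))  # 3 environments: the initial one plus two changes
```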
III-A Regression Transfer
The regression transfer process returns a strong regression hypothesis $H_t$ for environment $t$. $H_t$ adapts to the solution distribution at the current environment. Given an individual $x$, $H_t$ outputs a predicted objective vector of $x$. Therefore, in the subsequent process of RTLP-DMOA, individuals with better predicted objective vectors can be selected as members of the initial population.
The strong regression hypothesis $H_t$ is assembled from several weak regression hypotheses $h_k$ ($k = 1, \dots, K$, where $K$ is the maximum number of training iterations). These weak regression hypotheses are trained with past population information. The last population $P_{t-1}$ combined with its objective values is regarded as the source domain set $D_{so}$. The target domain set $D_{ta}$ comprises $X_{ta}$, which is sampled from $[L_t, U_t]$ in the current decision space, and its objective values $F(X_{ta}, t)$, where $L_t$ and $U_t$ are the lower bound and upper bound of the decision variables at environment $t$. $D_{so}$ and $D_{ta}$ are combined into a set $D$ as the training data.
The process for training the weak regression hypotheses is as follows. First, the weight vector is initialized as $w^1 = (w_1^1, \dots, w_{|D|}^1)$, where $w_i^k$ denotes the weight of the training individual $x_i$ for training $h_k$ at environment $t$. In the main training loop, for training $h_k$, a Support Vector Regression (SVR) is used as the base learner to obtain the weak regression hypothesis $h_k$ from the training data $D$ and the weights $w^k$. Then, the adjusted error of each individual $x_i$ for training $h_k$ is calculated as
$$e_i^k = \frac{\lVert h_k(x_i) - y_i \rVert}{E^k},$$
where $E^k$ is the maximum error, described as
$$E^k = \max_{x_j \in D} \lVert h_k(x_j) - y_j \rVert.$$
The error $e_i^k$ becomes bigger as the difference between the predicted objective vector $h_k(x_i)$ and the true objective vector $y_i$ becomes bigger, and the adjusted error for $h_k$ is calculated as
$$\epsilon_k = \sum_{i=1}^{|D|} e_i^k w_i^k.$$
When $\epsilon_k$ is small, $\beta_k$ becomes smaller. Next, the weight vector is updated according to $e_i^k$ and $\beta_k$. If a training individual from the source domain set $D_{so}$ has a bigger $e_i^k$, the individual is likely to be more dissimilar to the distribution of the target domain, so its training weight must be reduced more. However, if a training individual from the target domain set $D_{ta}$ has a bigger $e_i^k$, then its training weight should be increased more so that $h_{k+1}$ adapts to the target domain. So, the weights are updated as
$$w_i^{k+1} = \begin{cases} \dfrac{w_i^k \, \beta^{e_i^k}}{Z_k}, & x_i \in D_{so}, \\ \dfrac{w_i^k \, \beta_k^{-e_i^k}}{Z_k}, & x_i \in D_{ta}, \end{cases}$$
where $Z_k$ is a normalizing constant, $\beta_k = \epsilon_k / (1 - \epsilon_k)$, and $\beta = 1 / \left(1 + \sqrt{2 \ln |D_{so}| / K}\right)$. In this way, individuals adapted to the solution distribution of the target domain have large weights; otherwise, they have small weights. Then, the modified weights $w^{k+1}$ are passed to the next SVR to learn $h_{k+1}$. Thus, in the next round, individuals with low weights that are dissimilar to the target domain have less impact on the learning process, and those with large weights help the learning algorithm train better regression hypotheses. These weak regression hypotheses ($h_k$, $k = 1, \dots, K$) may gradually adapt to the target domain. After $K$ iterations, we obtain the final weak regression hypotheses and combine them to acquire a strong regression hypothesis $H_t$.
The details of regression transfer are shown in Procedure Regression Transfer.
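To make the reweighting step concrete, the following sketch performs one boosting round of weight updates in plain Python. It is a minimal sketch that assumes the update follows the TrAdaBoost.R2 scheme of Pardoe and Stone; the Euclidean norm for vector-valued errors, the clamping of the hypothesis error below 0.5, and the function names are assumptions made for illustration, and the base learner (an SVR in the paper) is replaced by fixed predictions.

```python
import math
from typing import List, Sequence


def adjusted_errors(predicted: Sequence[Sequence[float]],
                    actual: Sequence[Sequence[float]]) -> List[float]:
    """Per-instance error scaled into [0, 1] by the largest error of the round."""
    raw = [math.dist(p, a) for p, a in zip(predicted, actual)]
    largest = max(raw)
    if largest == 0.0:
        return [0.0 for _ in raw]
    return [r / largest for r in raw]


def update_weights(weights: Sequence[float], errors: Sequence[float],
                   n_source: int, n_rounds: int) -> List[float]:
    """One reweighting step; the first n_source entries are source instances."""
    total = sum(weights)
    # Hypothesis-level adjusted error: weighted mean of the instance errors.
    eps = sum(w * e for w, e in zip(weights, errors)) / total
    if eps == 0.0:
        return [w / total for w in weights]  # perfect fit: nothing to reweight
    eps = min(eps, 0.499)                    # assumption: clamp instead of stopping
    beta_k = eps / (1.0 - eps)
    beta = 1.0 / (1.0 + math.sqrt(2.0 * math.log(n_source) / n_rounds))
    updated = [
        w * beta ** e if i < n_source else w * beta_k ** (-e)
        for i, (w, e) in enumerate(zip(weights, errors))
    ]
    norm = sum(updated)
    return [w / norm for w in updated]


# Two source instances followed by two target instances (toy values).
actual = [(0.0, 0.0)] * 4
predicted = [(1.0, 0.0), (0.1, 0.0), (0.5, 0.0), (0.1, 0.0)]
errors = adjusted_errors(predicted, actual)
new_weights = update_weights([0.25] * 4, errors, n_source=2, n_rounds=10)
print(new_weights)
```

In this toy round the badly predicted source instance loses weight relative to the well predicted one, while the badly predicted target instance gains weight, which matches the behaviour described in the text.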
III-B Initial Population Prediction
In this section, the initial population prediction is used to identify excellent solutions as the initial population with the assistance of the strong regression hypothesis $H_t$.
To begin with, a test population $P_{test}$ is sampled from $[L_t, U_t]$, where $L_t$ and $U_t$ are the lower bound and upper bound of the decision variables at environment $t$. Then, the objective values of $P_{test}$ are predicted by $H_t$, and the non-dominated fronts are determined by fast non-dominated sorting according to the predicted objective values. We then select the first non-dominated fronts as $P_{init}$, ensuring that the size of $P_{init}$ does not exceed the population size $N$. Next, Gaussian noise is added to individuals of $P_{init}$ to generate further individuals until the population size is $N$. The initial population $P_{init}$ is used to accelerate the evolutionary process and improve the evolutionary performance for the current environment.
The details of initial population prediction are presented in Procedure Initial Population Prediction.
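The selection step above can be sketched as follows. This is an illustrative sketch, not the authors' procedure: it keeps only the first non-dominated front (rather than several fronts), uses a simple pairwise dominance filter instead of fast non-dominated sorting, clips noisy individuals to the bounds, and takes a hypothetical `surrogate` callable in place of the trained regression hypothesis. The parameter names and the noise scale `sigma` are assumptions.

```python
import random
from typing import Callable, List, Sequence

Vector = List[float]


def dominates(f_a: Sequence[float], f_b: Sequence[float]) -> bool:
    """Pareto dominance for minimization."""
    return (all(a <= b for a, b in zip(f_a, f_b))
            and any(a < b for a, b in zip(f_a, f_b)))


def predict_initial_population(surrogate: Callable[[Vector], Sequence[float]],
                               lower: Sequence[float], upper: Sequence[float],
                               pop_size: int, n_test: int,
                               sigma: float = 0.05, seed: int = 0) -> List[Vector]:
    rng = random.Random(seed)
    # 1) Sample a test population uniformly inside the current bounds.
    test = [[rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
            for _ in range(n_test)]
    # 2) Predict objective vectors with the surrogate instead of evaluating them.
    predicted = [surrogate(x) for x in test]
    # 3) Keep the predicted non-dominated individuals, capped at pop_size.
    front = [x for x, fx in zip(test, predicted)
             if not any(dominates(fy, fx) for fy in predicted)]
    selected = front[:pop_size]
    # 4) Fill the remaining slots with Gaussian-perturbed copies (clipped to bounds).
    population = [list(x) for x in selected]
    while len(population) < pop_size:
        parent = rng.choice(selected)
        child = [min(hi, max(lo, v + rng.gauss(0.0, sigma)))
                 for v, lo, hi in zip(parent, lower, upper)]
        population.append(child)
    return population


# Hypothetical surrogate standing in for the trained regression hypothesis.
toy_surrogate = lambda x: (x[0], 1.0 - x[0] + x[1])
initial = predict_initial_population(toy_surrogate, lower=[0.0, 0.0],
                                     upper=[1.0, 1.0], pop_size=10, n_test=50)
print(len(initial))  # 10
```

Because only predicted objective values are used for the selection, the true objective functions are evaluated just for the target-domain sample used in training, not for the whole test population.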
IV-A Compared Algorithms
IV-B Test Problems
All compared algorithms are evaluated on 8 benchmark DMOPs selected from the FDA and dMOP test suites. The FDA benchmark comprises FDA1, FDA2, FDA3, FDA4, and FDA5. The dMOP benchmark contains dMOP1, dMOP2, and dMOP3.
DMOPs are divided into three types. In Type I problems, the POS changes but the POF does not. In Type II problems, both the POS and the POF change. In Type III problems, the POF changes but the POS does not.
FDA1, FDA4, and dMOP3 are Type I problems. FDA3, FDA5, and dMOP2 are Type II problems. FDA2 and dMOP1 are Type III problems.
The dynamics of a DMOP are controlled by
$$t = \frac{1}{n_t} \left\lfloor \frac{\tau}{\tau_t} \right\rfloor,$$
where $\tau$, $n_t$, and $\tau_t$ refer to the generation counter, severity of change, and frequency of change, respectively.
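As a small worked example of how the generation counter, the severity of change, and the frequency of change drive the environment, the sketch below computes the time variable and evaluates FDA1 as it is commonly stated in the benchmark literature. The function names are hypothetical, and the FDA1 form used here ($f_1 = x_1$, $g = 1 + \sum_{i \ge 2} (x_i - G(t))^2$, $G(t) = \sin(0.5\pi t)$, $f_2 = g(1 - \sqrt{f_1/g})$) is quoted from the standard definition rather than from this paper.

```python
import math
from typing import Sequence, Tuple


def environment_time(generation: int, severity: int, frequency: int) -> float:
    """t = (1 / n_t) * floor(tau / tau_t)."""
    return (generation // frequency) / severity


def fda1(x: Sequence[float], t: float) -> Tuple[float, float]:
    """FDA1 objectives; x[0] lies in [0, 1], the remaining variables in [-1, 1]."""
    shift = math.sin(0.5 * math.pi * t)           # G(t): moves the optimal x[1:]
    f1 = x[0]
    g = 1.0 + sum((xi - shift) ** 2 for xi in x[1:])
    f2 = g * (1.0 - math.sqrt(f1 / g))
    return f1, f2


# With n_t = 10 and tau_t = 5 the environment changes every 5 generations.
times = [environment_time(g, severity=10, frequency=5) for g in (0, 4, 5, 12)]
print(times)  # [0.0, 0.0, 0.1, 0.2]

# The optimal values of x[1:] move from 0 (t = 0) to 1 (t = 1),
# while the objective vector on the front stays the same (Type I behaviour).
print(fda1([0.25, 0.0, 0.0], t=0.0))
print(fda1([0.25, 1.0, 1.0], t=1.0))
```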
IV-C Performance Indicators
1) The Inverted Generational Distance (IGD) metric can measure the convergence of the obtained solutions. A smaller IGD value indicates better convergence. IGD is defined as
$$IGD(P^*, P) = \frac{1}{|P^*|} \sum_{p^* \in P^*} \min_{p \in P} \lVert p^* - p \rVert,$$
where $P^*$ is the true POF of a multi-objective optimization problem, $P$ is an approximation set of the POF obtained by a multi-objective optimization algorithm, and $|P^*|$ is the number of individuals in $P^*$.
The MIGD metric is a variant of IGD. The MIGD can be described as the average of the IGD values over all environments during a run:
$$MIGD = \frac{1}{|T|} \sum_{t \in T} IGD(P_t^*, P_t),$$
where $T$ is a set of discrete time points during a run and $|T|$ is the cardinality of $T$.
2) The Maximum Spread (MS) quantifies the extent to which the obtained solutions cover the true POF. A larger MS value indicates better coverage of the true POF by the solutions obtained by the algorithm. MS is calculated as follows:
$$MS = \sqrt{\frac{1}{M} \sum_{i=1}^{M} \left[ \frac{\min\left(\overline{f_i^*}, \overline{f_i}\right) - \max\left(\underline{f_i^*}, \underline{f_i}\right)}{\overline{f_i^*} - \underline{f_i^*}} \right]^2},$$
where $\overline{f_i^*}$ and $\underline{f_i^*}$ represent the maximum and minimum of the $i$-th objective in the true POF, respectively, and $\overline{f_i}$ and $\underline{f_i}$ represent the maximum and minimum of the $i$-th objective in the obtained POF, respectively. This metric is also modified for evaluating DMOAs.
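The indicators above can be computed for finite fronts with a few lines of Python. This is a minimal sketch assuming the standard definitions of IGD, MIGD, and MS with Euclidean distance; the function names and the toy fronts are hypothetical, and the true front is assumed to have a non-zero range in every objective.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def igd(true_front: List[Point], approx_front: List[Point]) -> float:
    """Mean distance from each true-front point to its nearest obtained point."""
    return sum(min(math.dist(p, q) for q in approx_front)
               for p in true_front) / len(true_front)


def migd(true_fronts: List[List[Point]], approx_fronts: List[List[Point]]) -> float:
    """Average IGD over the environments of one run."""
    values = [igd(tf, af) for tf, af in zip(true_fronts, approx_fronts)]
    return sum(values) / len(values)


def maximum_spread(true_front: List[Point], approx_front: List[Point]) -> float:
    """Extent to which the obtained front covers the range of the true front."""
    n_obj = len(true_front[0])
    total = 0.0
    for i in range(n_obj):
        t_max = max(p[i] for p in true_front)
        t_min = min(p[i] for p in true_front)
        a_max = max(p[i] for p in approx_front)
        a_min = min(p[i] for p in approx_front)
        overlap = min(t_max, a_max) - max(t_min, a_min)
        total += (overlap / (t_max - t_min)) ** 2
    return math.sqrt(total / n_obj)


true_front = [(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]
partial = [(0.0, 1.0), (0.5, 0.5)]            # misses one end of the front

print(igd(true_front, partial))                # about 0.2357
print(migd([true_front, true_front], [partial, true_front]))
print(maximum_spread(true_front, partial))     # 0.5
print(maximum_spread(true_front, true_front))  # 1.0
```

A front that matches the true POF gives an IGD of 0 and an MS of 1, while the partial front is penalized by both indicators.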
IV-D Parameter Settings
Parameter settings in RTLP-DMOA are as follows: We set the population size $N$ to 100 and the number of training iterations $K$ to 10. The sizes of the target-domain sample $X_{ta}$ and the test population $P_{test}$ are set to 50 and 500, respectively. We choose RM-MEDA as the SMOA optimizer for RTLP-DMOA, and the number of clusters in RM-MEDA is 4. The SVR parameters are left at their default values.
Consistent with the experimental configuration of a previous study, we fix the severity of change $n_t$ to 10. The frequency of change $\tau_t$ takes the values 5 and 10. The number of iterations of compared algorithms is , of which 50 are the number of iterations at the initial time. Hence, in each population of configurations, the problem is changed by times.
IV-E Experimental Results
RTLP-DMOA is compared with three other state-of-the-art DMOAs. MIGD values and MS values are presented in Table I and Table II, respectively. The best metric values are highlighted in bold.
MEAN AND STANDARD DEVIATION VALUES OF MIGD METRIC FOR DIFFERENT DYNAMIC TEST SETTINGS
As Table I shows, the proposed RTLP-DMOA performs better than the other three algorithms in 9 out of 16 test instances in terms of MIGD. In particular, RTLP-DMOA outperforms the compared algorithms on FDA1, FDA4, FDA5, and dMOP3 under all configurations. RTLP-DMOA also achieves good MIGD values on the tri-objective problems, because the prediction method based on transfer learning has a strong ability to explore complicated and differing solution distributions. However, it performs worse than SGEA on FDA3, dMOP1, and dMOP2 under all dynamic test settings. Overall, the MIGD results indicate that the proposed RTLP-DMOA maintains better convergence than the other three state-of-the-art DMOAs on most test functions.
Table II shows that the proposed RTLP-DMOA obtains the best MS values in 13 out of 16 instances. Apart from FDA3 and dMOP1, RTLP-DMOA performs better than the compared algorithms under all configurations. It is worth noting that RTLP-DMOA achieves the maximum MS value on the tri-objective problems FDA4 and FDA5. Nevertheless, RTLP-DMOA is slightly worse than SGEA on FDA3. Overall, the diversity of the solutions obtained by RTLP-DMOA is considerably better than that of the other three algorithms in most cases.
In this subsection, we perform a comparative experiment to verify whether combining an SMOA with the regression transfer learning prediction improves performance. We compare RTLP-RM-MEDA with RM-MEDA. RM-MEDA was originally designed for static multi-objective problems and is not directly applicable to DMOPs. Table III indicates that RTLP-RM-MEDA performs better than RM-MEDA in all test functions at and configuration for MIGD values. RTLP-RM-MEDA improves on RM-MEDA in MIGD by 22.66%–96.39%. Table IV indicates that RTLP-RM-MEDA performs better than RM-MEDA in all test instances for MS values. RTLP-RM-MEDA improves on RM-MEDA in MS by 0.08%–39.88%. The ablation study reveals that the designed regression transfer learning prediction can significantly improve the performance of SMOAs.
This paper has proposed the RTLP-DMOA for solving DMOPs. When the environment changes, a regression hypothesis that adapts to the new solution distribution is derived to predict objective values. Then, excellent individuals are identified according to their predicted objective values and selected as an initial population, which can improve the performance of the evolutionary process.
The experimental comparison shows that the proposed RTLP-DMOA is very competitive on most test functions. In future work, we will integrate advanced machine learning methods into evolutionary computation to enhance the evolutionary performance of existing static multi-objective optimization algorithms and to solve real-world problems.
This work was supported by the National Natural Science Foundation of China (Grant No. 61673328) and the Shenzhen Scientific Research and Development Funding Program (Grant No. JCYJ20180307123637294).
-  M. Jiang, Y. Yu, X. Liu, F. Zhang, and Q. Hong, “Fuzzy neural network based dynamic path planning,” in 2012 International Conference on Machine Learning and Cybernetics, vol. 1, July 2012, pp. 326–330.
-  C. Raquel and X. Yao, “Dynamic multi-objective optimization: a survey of the state-of-the-art,” in Studies in Computational Intelligence. Springer Science+Business Media, 2013, pp. 85–106.
-  Y. Yang, Y. Sun, and Z. Zhu, “Multi-objective memetic algorithm based on request prediction for dynamic pickup-and-delivery problems,” in Evolutionary Computation, 2017.
-  W. Du, W. Zhong, Y. Tang, W. Du, and Y. Jin, “High-dimensional robust multi-objective optimization for order scheduling: A decision variable classification approach,” IEEE Transactions on Industrial Informatics, vol. 15, no. 1, pp. 293–304, Jan 2019.
-  D. Gong, B. Xu, Y. Zhang, Y. Guo, and S. Yang, “A similarity-based cooperative co-evolutionary algorithm for dynamic interval multi-objective optimization problems,” IEEE Transactions on Evolutionary Computation, pp. 1–1, 2019.
-  M. Jiang, L. Qiu, Z. Huang, and G. G. Yen, “Dynamic multi-objective estimation of distribution algorithm based on domain adaptation and nonparametric estimation,” Information Sciences, vol. 435, pp. 203 – 223, 2018.
-  R. Chen, K. Li, and X. Yao, “Dynamic multiobjectives optimization with a changing number of objectives,” IEEE Transactions on Evolutionary Computation, vol. 22, no. 1, pp. 157–171, Feb 2018.
-  J. Branke, “Memory enhanced evolutionary algorithms for changing optimization problems,” in Proceedings of the 1999 Congress on Evolutionary Computation. Institute of Electrical and Electronics Engineers.
-  A. Muruganantham, K. C. Tan, and P. Vadakkepat, “Evolutionary dynamic multiobjective optimization via kalman filter prediction,” IEEE Trans. Cybernetics, vol. 46, no. 12, pp. 2862–2873, 2016.
-  G. Welch, Kalman Filter. Boston, MA: Springer US, 2014, pp. 435–437.
-  M. Rong, D. Gong, Y. Zhang, Y. Jin, and W. Pedrycz, “Multidirectional prediction approach for dynamic multiobjective optimization problems,” IEEE Transactions on Cybernetics, pp. 1–13, 2018.
-  A. Zhou, Y. Jin, and Q. Zhang, “A population prediction strategy for evolutionary dynamic multiobjective optimization,” IEEE Transactions on Cybernetics, vol. 44, no. 1, pp. 40–53, 2014.
-  W. Hu, M. Jiang, X. Gao, K. C. Tan, and Y. Cheung, “Solving dynamic multi-objective optimization problems using incremental support vector machine,” in 2019 IEEE Congress on Evolutionary Computation (CEC), June 2019, pp. 2794–2799.
-  B. Gu, V. S. Sheng, K. Y. Tay, W. Romano, and S. Li, “Incremental support vector learning for ordinal regression,” IEEE Transactions on Neural Networks and Learning Systems, vol. 26, no. 7, pp. 1403–1416, July 2015.
-  M. Jiang, Z. Huang, L. Qiu, W. Huang, and G. Yen, “Transfer learning based dynamic multiobjective optimization algorithms,” IEEE Transactions on Evolutionary Computation, vol. PP, no. 99, pp. 1–1, 2017.
-  M. Jiang, W. Huang, Z. Huang, and G. G. Yen, “Integration of global and local metrics for domain adaptation learning via dimensionality reduction,” IEEE Transactions on Cybernetics, vol. 47, no. 1, pp. 38–51, Jan 2017.
-  J. Lu, V. Behbood, P. Hao, H. Zuo, S. Xue, and G. Zhang, “Transfer learning using computational intelligence: A survey,” Knowledge-Based Systems, vol. 80, pp. 14 – 23, 2015, 25th anniversary of Knowledge-Based Systems.
-  K. Deb, Multi-objective optimization using evolutionary algorithms. John Wiley & Sons, 2001, vol. 16.
-  W. Dai, Q. Yang, G. R. Xue, and Y. Yu, “Boosting for transfer learning,” in International Conference on Machine Learning, 2007.
-  D. Pardoe and P. Stone, “Boosting for regression transfer,” in International Conference on Machine Learning, 2010.
-  D. Basak, S. Pal, and D. C. Patranabis, “Support vector regression,” Neural Information Processing-Letters and Reviews, vol. 11, no. 10, pp. 203–224, 2007.
-  K. Deb, A. Pratap, S. Agarwal, and T. Meyarivan, “A fast and elitist multiobjective genetic algorithm: NSGA-II,” IEEE Transactions on Evolutionary Computation, vol. 6, no. 2, pp. 182–197, April 2002.
-  C.-K. Goh and K. C. Tan, “A competitive-cooperative coevolutionary paradigm for dynamic multiobjective optimization,” IEEE Transactions on Evolutionary Computation, vol. 13, no. 1, pp. 103–127, 2009.
-  A. Zhou, Y. Jin, and Q. Zhang, “A population prediction strategy for evolutionary dynamic multiobjective optimization,” IEEE Transactions on Cybernetics, vol. 44, no. 1, pp. 40–53, jan 2014.
-  S. Jiang and S. Yang, “A steady-state and generational evolutionary algorithm for dynamic multiobjective optimization,” IEEE Transactions on Evolutionary Computation, vol. 21, no. 1, pp. 65–82, 2017.
-  M. Farina, K. Deb, and P. Amato, “Dynamic multiobjective optimization problems: test cases, approximations, and applications,” IEEE Transactions on evolutionary computation, vol. 8, no. 5, pp. 425–442, 2004.
-  Q. Zhang, A. Zhou, and Y. Jin, “Rm-meda: A regularity model-based multiobjective estimation of distribution algorithm,” IEEE Transactions on Evolutionary Computation, vol. 12, no. 1, pp. 41–63, 2008.
-  C.-C. Chang and C.-J. Lin, “Libsvm: A library for support vector machines,” ACM Trans. Intell. Syst. Technol., vol. 2, no. 3, pp. 27:1–27:27, May 2011.
-  J. Min, C. Zhou, and S. Chen, “Embodied concept formation and reasoning via neural-symbolic integration,” Neurocomputing, vol. 74, no. 1, pp. 113–120, 2010.
W. Yin, J. Min, Z. Huang, C. Fei, and C. Zhou, “An np-complete fragment of
Annals of Mathematics and Artificial Intelligence, vol. 75, no. 3-4, pp. 391–417, 2015.