Abstract

In recent years, a number of recombination operators have been proposed for multiobjective evolutionary algorithms (MOEAs). One kind of recombination operator is designed based on the Gaussian process model. However, this approach uses only one standard Gaussian process model with fixed variance, which may not work well for solving various multiobjective optimization problems (MOPs). To alleviate this problem, this paper introduces a decomposition-based multiobjective evolutionary algorithm with adaptive multiple Gaussian process models, aiming to provide a more effective heuristic search for various MOPs. To select a suitable Gaussian process model, an adaptive selection strategy is designed based on the performance enhancements on a number of decomposed subproblems. In this way, the proposed algorithm offers more search patterns and is able to produce more diversified solutions. Its performance is validated on the well-known F, UF, and WFG test instances, and the experiments confirm that the proposed algorithm outperforms six competitive MOEAs.

1. Introduction

Multiobjective optimization problems (MOPs) widely exist in the fields of scientific research and engineering applications. Since the first multiobjective evolutionary algorithm (MOEA) was reported in 1985 [1], MOEAs have been extensively studied and have become one of the hottest research directions in the field of evolutionary computation [2–4]. MOEAs represented by NSGA-II [5], SPEA2 [6], PAES [7], HypE [8], MOEA/D [9], etc., have been widely used in many application fields. According to the selection strategy used to handle convergence enhancement and diversity maintenance, most existing MOEAs can be roughly divided into the following three categories.

1.1. Pareto-Based MOEAs

The basic idea of such algorithms, represented by NSGA-II and SPEA2, is to use a Pareto-based ranking scheme to sort the population into different convergence layers and then estimate the density of the individuals in the last layer. In this way, the population can be sorted according to the dominance relationship and density estimation, and the relatively superior individuals are selected for the next generation. Crowding distance [5], the K-nearest neighbor method [6], ε-domination [10, 11], grading [12], and other methods [13–15] are often used to estimate the density of the individuals. As Pareto-based MOEAs have advantages such as a simple principle, easy understanding, and few parameters, they have attracted much research and extensive application. However, their ability to guarantee convergence degrades dramatically when the number of objectives is larger than three, mainly due to the loss of selection pressure.

1.2. Decomposition-Based MOEAs

Decomposition-based MOEAs transform an MOP into a set of subproblems and then solve them simultaneously using a collaborative evolutionary search; examples include MOEA/D [9], RVEA [16], NSGA-III [17], RPEA [18], SPEA/R [19], and RdEA [20]. Note that most of these algorithms adopt additional reference information (reference vectors, reference points, or weight vectors) during environmental selection, which helps to maintain the diversity of the population. Due to their clearer mathematical explanation, decomposition-based MOEAs have become very popular in recent years [21–24].

1.3. Indicator-Based MOEAs

Indicator-based MOEAs directly employ a performance indicator, such as hypervolume (HV), generational distance (GD), or R2, to guide the selection of promising solutions for the next generation. IBEA [25], MOMBI-II [26], HypE [8], GD-MOEA [27], R2-IBEA [28], and DDE [29] are representatives of the indicator-based MOEAs. In these MOEAs, performance indicators are used as the selection criterion to rank nondominated solutions that cannot be distinguished by traditional Pareto dominance. However, this kind of MOEA usually requires high computational cost for calculating the performance indicator, which becomes especially challenging when the number of objectives is large.

These MOEAs usually include two main components, i.e., variation and selection [6]. Selection plays an important role in MOEAs to maintain the promising solutions, as introduced above when classifying MOEAs, while variation is the key factor determining the quality of generated solutions. In [30], the effects of several variation operators are studied on test problems with variable linkages, showing that variable linkages may cause difficulties for MOEAs. A number of real-valued variation approaches have been proposed during recent decades [31], which can be classified into three main kinds, i.e., traditional recombination operators, estimation of distribution algorithms, and inverse model methods for variation.

Traditional recombination operators are used in most existing MOEAs due to their simplicity. This kind of variation simulates the binary crossover method to produce real-valued offspring; examples include simulated binary crossover (SBX) [32], Laplace crossover [33], parent-centric crossover [34], blend crossover [35], unimodal normal distribution crossover [36], and simple crossover [37]. Moreover, differential evolution (DE) [38] is also used as the recombination operator in many MOEAs, which samples offspring based on the difference vectors of parents. Evolution paths are used in DE [39] to depict the population movement and predict its tendency, which can produce potential solutions to speed up convergence toward the PS. Recently, a number of hybridized recombination operators have been proposed, trying to combine the advantages of different recombination operators. In [40], a cooperative DE framework is designed for constrained MOPs, in which multiple DE operators are run in different subpopulations to optimize their own constrained subproblems. In [41], four DE operators are combined, and a sliding window is adopted in [42] to provide a reward for each DE operator according to its enhancement on the subproblems. Similarly, four DE operator pools are presented in [43], each including two DE operators with complementary search patterns, to provide an improved search ability. In ACGDE [44], an adaptive cross-generation DE operator is designed by exploiting information from individuals across generations to adapt the parameters.
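As one concrete illustration of a difference-vector-based operator, the classic DE/rand/1/bin scheme can be sketched as follows (a simplified, illustrative sketch, not the exact operator of any algorithm cited above; the function name, the unit box bounds, and the F and CR defaults are our assumptions):

```python
import random

def de_rand_1(pop, F=0.5, CR=0.9, lower=0.0, upper=1.0):
    """Sample one offspring with the classic DE/rand/1/bin scheme.

    `pop` is a list of real-valued parent vectors; F scales the
    difference vector and CR controls per-dimension crossover.
    """
    r1, r2, r3 = random.sample(range(len(pop)), 3)
    a, b, c = pop[r1], pop[r2], pop[r3]
    n = len(a)
    jrand = random.randrange(n)           # ensure at least one mutated dimension
    child = []
    for j in range(n):
        if random.random() < CR or j == jrand:
            v = a[j] + F * (b[j] - c[j])  # difference-vector perturbation
        else:
            v = a[j]                      # inherit from the base parent
        child.append(min(max(v, lower), upper))  # clip to the box bounds
    return child
```

The difference vector b − c adapts the step size to the current population spread, which is the property the hybridized operators above try to combine in different ways.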

Estimation of distribution algorithms (EDAs) [45] exploit probabilistic models extracted from the population's distribution to perform variation [46, 47]. Unlike the traditional recombination operators above, no crossover or mutation procedure is run in EDAs. Instead, the global statistical information of the selected parents is used to build a posterior probability distribution model, and offspring are then sampled from this model. Several EDAs are studied for solving continuous MOPs in [48, 49], while the Gaussian distribution model, the mixture Gaussian distribution model, and a mixed Gaussian model with principal component analysis (PCA) are introduced in [50] for offspring variation. Different from the individual-based information used in traditional recombination operators, a local PCA operator is employed in [45, 47] to generate new offspring. Different from [47], the PCA model is replaced by the locally linear embedding (LLE) [51] model in [52]. In this way, such models only consider the decision space, without considering too much about the target MOP itself.

The inverse model methods for variation utilize machine learning approaches to capture the mapping from the objective space to the decision space by exploiting the characteristics of the target MOP. Representative algorithms such as IM-MOEA [53] and E-IM-MOEA [54] use a Gaussian process-based inverse model to perform recombination. In another work [55], inspired by the LLE manifold learning idea, a new LLE modeling approach is introduced that uses the mapping known in the MOP, where the decision space is treated as the high-dimensional space and the objective space is regarded as the low-dimensional space. Thus, this modeling method no longer builds an overall low-dimensional embedding of the samples that is then mapped back to the high-dimensional space; instead, it directly constructs new samples in the high-dimensional space. In this way, the model mapping from the objective space to the decision space is built based on the approximated Pareto set obtained during the evolution. In other words, such methods use a probability model or surrogate model to build a bridge from the objective space to the decision space.

Among the above three kinds of variation operators, traditional recombination operators are the most often used in MOEAs. However, as these operators are executed based on individuals, they provide only a finite number of search patterns and are also criticized for their lack of mathematical explanation. For the inverse model methods, mapping from the objective space to the decision space is very challenging when tackling some complicated MOPs. EDAs usually adopt one standard Gaussian process model with fixed variance, which may not work well for various kinds of MOPs. To enhance the search capability of EDAs, this paper introduces a decomposition-based MOEA with adaptive multiple Gaussian process models, called MOEA/D-AMG, which is effective in generating superior offspring. Based on the performance enhancements on a number of decomposed subproblems, a suitable Gaussian process model is selected accordingly. In this way, our method is more adaptive and can provide more search patterns to produce diversified solutions. Evaluated on the well-known F, UF, and WFG test instances, the experimental results validate the superiority of our algorithm over six competitive MOEAs (MOEA/D-SBX [9], MOEA/D-DE [56], MOEA/D-GM [57], RM-MEDA [47], IM-MOEA [53], and AMEDA [50]).

The rest of this paper is organized as follows. Section 2 presents some background information of MOPs, Gaussian process model, and some Gaussian process model-based MOEAs. Section 3 introduces the details of MOEA/D-AMG. At last, Section 4 provides and discusses the simulation results, while Section 5 gives the conclusions and some future work.

2. Preliminaries

2.1. Multiobjective Optimization Problems

In real life, there exist a number of engineering problems with multiple complicated optimization objectives, which are called multiobjective optimization problems (MOPs). This paper considers continuous MOPs, formulated as

$$\min_{x \in \Omega} F(x) = (f_1(x), f_2(x), \ldots, f_m(x))^T, \tag{1}$$

where $n$ and $m$ are, respectively, the numbers of decision variables and objectives, $a_i$ and $b_i$ are, respectively, the lower and upper bounds of the $i$th dimensional variable of $x$ in the decision space, $x = (x_1, \ldots, x_n)^T$ is a decision variable vector, $\Omega = \prod_{i=1}^{n} [a_i, b_i] \subseteq \mathbb{R}^n$ is the feasible search space, and $F: \Omega \rightarrow \mathbb{R}^m$ is a continuous mapping consisting of $m$ continuous objective functions.

In MOPs, conflicts often exist among the different objectives, i.e., improvement of one objective results in deterioration of another. Generally, there does not exist a single optimal solution that minimizes all the objectives in (1) at the same time. Instead, a set of trade-off solutions among the objectives can be found, which are equally optimal when all the objectives are considered. Suppose that there are two objective vectors $p = (p_1, \ldots, p_m)$ and $q = (q_1, \ldots, q_m)$, where $m$ is the number of objectives. $p$ is said to dominate $q$ in equation (1), denoted by $p \preceq q$, if $p_i \le q_i$ for all $i = 1, \ldots, m$ and $p_j < q_j$ for at least one index $j \in \{1, \ldots, m\}$. A solution $x^* \in \Omega$, where $\Omega$ is the feasible search space in equation (1), is called a Pareto optimal solution if and only if there is no other $x \in \Omega$ such that $F(x)$ dominates $F(x^*)$. The collection of all the Pareto optimal solutions is called the Pareto optimal set (PS), and the projection of the PS in the objective space is called the Pareto optimal front (PF).
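The dominance relation defined above can be expressed as a small predicate for minimization problems (a minimal sketch; the function name is ours):

```python
def dominates(p, q):
    """Return True iff objective vector p Pareto-dominates q under
    minimization: p is no worse in every objective and strictly
    better in at least one."""
    return (all(pi <= qi for pi, qi in zip(p, q)) and
            any(pi < qi for pi, qi in zip(p, q)))
```

For instance, (1, 2) dominates (2, 2), while (1, 2) and (2, 1) are mutually nondominated trade-off points.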

2.2. Gaussian Process Model

The recombination operator based on the Gaussian process model belongs to the EDA family. Different from classical recombination operators, recombination using the Gaussian process model exploits the distribution information of the whole population to generate offspring, which helps to ensure the diversity of search patterns.

The Gaussian model is one of the most widely used probability models in scientific research and practical applications [58–61]. In general, a random variable $x$ with a Gaussian distribution can be expressed as

$$x \sim N(\mu, C), \tag{2}$$

where $\mu$ is an $n$-dimensional mean vector and $C$ is the $n \times n$ covariance matrix. The probability density function of the random variable is expressed as

$$p(x) = \frac{1}{(2\pi)^{n/2} |C|^{1/2}} \exp\left(-\frac{1}{2}(x - \mu)^T C^{-1} (x - \mu)\right). \tag{3}$$

For a given set of data $S = \{x^1, x^2, \ldots, x^{|S|}\}$, the mean vector and covariance matrix are, respectively, estimated by

$$\mu = \frac{1}{|S|} \sum_{x \in S} x, \qquad C = \frac{1}{|S| - 1} \sum_{x \in S} (x - \mu)(x - \mu)^T. \tag{4}$$

Hence, a new solution can be generated using the Gaussian model in the three steps of Algorithm 1. At first, the covariance matrix $C$ is decomposed into a lower triangular matrix $A$ by the Cholesky decomposition method [62] in line 1, where $C = AA^T$. Then, a vector $y = (y_1, \ldots, y_n)^T$ is generated in line 2, of which each element $y_i$, $i = 1, \ldots, n$, is sampled from a standard Gaussian distribution $N(0, 1)$. After that, a new trial solution is yielded by $x = \mu + Ay$ in line 3. Generally, to improve the search capability of the Gaussian model, a small variation is added by mutating the new solution with the polynomial mutation operator [2].

(1) Obtain a lower triangular matrix $A$ by the Cholesky decomposition of the covariance matrix $C$, such that $C = AA^T$
(2) Generate a vector $y = (y_1, \ldots, y_n)^T$, where $y_i \sim N(0, 1)$, $i = 1, \ldots, n$
(3) $x = \mu + Ay$
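The estimates in equation (4) and the three sampling steps of Algorithm 1 can be sketched as follows (an illustrative, dependency-free implementation; function names are ours):

```python
import math
import random

def estimate_gaussian(samples):
    """Sample mean and covariance of a list of real-valued vectors
    (the estimates of equation (4))."""
    n, d = len(samples), len(samples[0])
    mu = [sum(x[j] for x in samples) / n for j in range(d)]
    cov = [[sum((x[i] - mu[i]) * (x[j] - mu[j]) for x in samples) / (n - 1)
            for j in range(d)] for i in range(d)]
    return mu, cov

def cholesky(C):
    """Lower triangular A with C = A A^T (C must be positive definite)."""
    d = len(C)
    A = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = sum(A[i][k] * A[j][k] for k in range(j))
            if i == j:
                A[i][j] = math.sqrt(C[i][i] - s)
            else:
                A[i][j] = (C[i][j] - s) / A[j][j]
    return A

def sample(mu, A, sigma=1.0):
    """Algorithm 1: x = mu + A y with y_i ~ N(0, sigma^2); sigma = 1
    gives the standard model, other values the variants of Section 3."""
    d = len(mu)
    y = [random.gauss(0.0, sigma) for _ in range(d)]
    return [mu[i] + sum(A[i][k] * y[k] for k in range(d)) for i in range(d)]
```

A polynomial mutation step, as mentioned above, would typically follow the `sample` call; it is omitted here for brevity.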
2.3. Gaussian Process Model-Based MOEAs

In recent years, several Gaussian process model-based MOEAs have been proposed. Their details are, respectively, introduced in the following paragraphs. At last, the motivation of this paper is clarified.

In [63], a mixture-based multiobjective iterated density estimation evolutionary algorithm (MIDEA) with both discrete and continuous representations is proposed. This approach employs clustering analysis to discover the nonlinear distribution structure, based on the observation that a simple model is feasible to describe a cluster and that the population can be adequately represented by a mixture of simple models. On this basis, a mixture of Gaussian models is used in MIDEA to produce new solutions for continuous real-valued MOPs.

Although MIDEA is very effective for solving certain MOPs, it does not consider the regularity property of MOPs and thus may perform poorly on some problems. Therefore, a multiobjective evolutionary algorithm based on decomposition and probability model (MOEA/D-MG) is designed in [57]. In this approach, multivariate Gaussian models are embedded into MOEA/D [9] for continuous multiobjective optimization. Either a local Gaussian distribution model is built around a subproblem based on its neighborhood, or a global model is constructed based on the whole population. Thus, the population distribution can be captured by all the probability models working together. However, MOEA/D-MG has to reconstruct a Gaussian model for each subproblem, resulting in a large computational cost for model building. Moreover, when building the Gaussian models for similar subproblems, some individuals may be used repeatedly, which further wastes computational resources.

To reduce the computational cost of the modeling process, an improved MOEA/D-MG with high modeling efficiency (MOEA/D-MG2) is reported in [64], where neighboring subproblems can share the same covariance matrix to build the Gaussian model for sampling solutions. At first, the population is sorted to adjust the sampling order of the subproblems, which tries to avoid the insufficient diversity caused by reusing the covariance matrix and helps obtain a uniformly distributed offspring set. Then, in the global search, only some subproblems are selected randomly to construct the Gaussian model. Although MOEA/D-MG2 performs well on some MOPs, it builds the Gaussian model for solution sampling only in the neighborhoods defined in MOEA/D, so it is applicable only within the MOEA/D paradigm.

Moreover, an adaptive multiobjective estimation of distribution algorithm with a novel Gaussian sampling strategy (AMEDA) is presented in [50]. In this work, a clustering analysis approach is adopted to reveal the structure of the population distribution. Based on these clusters, a local multivariate Gaussian model or a global model is built for each solution to sample a new solution, which can enhance the accuracy of modeling and the searching ability of AMEDA. Moreover, an adaptive update strategy of the probability is developed to control the contributions for two types of Gaussian model.

From the above studies, it can be observed that their modeling differences mainly focus on the methods of selecting sampling solutions. However, the above Gaussian models in [50, 63, 64] only adopt the standard one with fixed variance, which cannot adaptively adjust the search step sizes according to the different characteristics of MOPs. To alleviate the above problem, a decomposition-based MOEA with adaptive multiple Gaussian process models (called MOEA/D-AMG) is proposed in this paper. Multiple Gaussian models are used in our algorithm by using a set of different variances. Then, based on the performance enhancements of decomposed subproblems, a suitable Gaussian model will be adaptively selected, which helps to enhance the search capability of MOEA/D-AMG and can well handle various kinds of MOPs as validated in the experimental section.

3. The Proposed Algorithm

In this section, our proposed algorithm MOEA/D-AMG is introduced in detail. At first, the adaptive multiple Gaussian process models are described. Then, the details of MOEA/D-AMG are demonstrated.

3.1. Adaptive Multiple Gaussian Process Models

Many approaches have been designed to estimate the population distribution by taking advantage of the regularity in the distributions of Pareto optimal solutions in both the decision and objective spaces. Under mild conditions, it can be induced from the Karush–Kuhn–Tucker (KKT) conditions that the PF of an m-objective optimization problem is an (m−1)-dimensional piecewise continuous manifold [65]. That is to say, the PF of a continuous biobjective MOP is a piecewise continuous curve, while the PF of a continuous three-objective MOP is a piecewise continuous surface. Thus, the Gaussian process model has been widely studied for both single-objective and multiobjective optimization [66–72]. However, a single Gaussian process model is not so effective for modeling the population distribution when tackling some complicated MOPs, as studied in [57]. In consequence, multiple Gaussian models are used in this paper, which can explicitly exploit the abilities of different Gaussian models with various distributions. By using multiple Gaussian process models with good diversity, a more suitable one can be adaptively selected to capture the population structure and sample new individuals more accurately. Thus, five types of Gaussian models are used in this paper, which have the same mean value 0 with different standard deviations, i.e., 0.6, 0.8, 1.0, 1.2, and 1.4. The distributions of the five Gaussian models (respectively represented by g1, g2, g3, g4, and g5) are plotted in Figure 1.

To select a suitable one from the multiple Gaussian process models, an adaptive strategy is proposed in this paper to improve the comprehensive search capability. The probabilities of selecting the different models depend on their performance in optimizing the subproblems. Once a Gaussian process model is selected, it is used to generate the Gaussian distribution variable y in Algorithm 1. Before the adaptive strategy comes into play, the predefined Gaussian models are selected with equal probability. In our approach, a set of offspring solutions is produced using the different Gaussian distribution variables y. After analyzing the quality of these new offspring solutions, the contribution rate of each Gaussian process model can be calculated. It should be noted that the quality of a new offspring is determined by its fitness value, which can be computed by many available methods. In our work, the Tchebycheff approach in [9] is adopted, as follows:

$$g^{te}(x \mid \lambda^i, z^*) = \max_{1 \le j \le m} \lambda_j^i \left| f_j(x) - z_j^* \right|, \tag{5}$$

where $z^* = (z_1^*, \ldots, z_m^*)$ is a reference point, $\lambda^i = (\lambda_1^i, \ldots, \lambda_m^i)$ is the $i$th weight vector, and $m$ is the number of objectives. A smaller fitness value of a new offspring generated by a Gaussian distribution variable indicates a greater contribution of the corresponding Gaussian model. Hence, for each Gaussian model, the improvement of fitness (IoF) value can be obtained as follows:

$$IoF_{k,G} = Fe_{k,G-1} - Fe_{k,G}, \tag{6}$$

where $Fe_{k,G}$ is the fitness of the new offspring generated with the $k$th Gaussian distribution in generation $G$. In order to maximize the effectiveness of the proposed adaptive strategy, this strategy is executed every LP generations during the whole run of our algorithm. Afterwards, the contribution rates (Cr) of the different Gaussian distributions can be calculated as follows:

$$Cr_{k,G} = IoF_{k,G} + \varepsilon, \tag{7}$$

where $\varepsilon$ is a very small value, which takes effect when IoF is zero. Then, the probabilities (Pr) of the different distributions being selected are updated by the following formula:

$$Pr_{k,G} = \frac{Cr_{k,G}}{\sum_{j=1}^{K} Cr_{j,G}}, \tag{8}$$

where $K$ is the total number of Gaussian models. As described above, we can adaptively adjust the probability that the $k$th Gaussian distribution is selected by updating the value of $Pr_{k,G}$.
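A plausible sketch of this adaptive selection mechanism is given below (illustrative only; the nonnegativity clamp on the IoF values is our safeguard assumption, and the function names are ours):

```python
import random

def update_probabilities(iof, eps=1e-6):
    """Turn per-model fitness improvements (IoF) into selection
    probabilities: Cr_k = max(0, IoF_k) + eps, Pr_k = Cr_k / sum(Cr).
    eps keeps every model selectable even when its IoF is zero."""
    cr = [max(0.0, v) + eps for v in iof]
    total = sum(cr)
    return [c / total for c in cr]

def select_model(pr):
    """Roulette-wheel choice of a Gaussian model index according to Pr."""
    r, acc = random.random(), 0.0
    for k, p in enumerate(pr):
        acc += p
        if r <= acc:
            return k
    return len(pr) - 1  # numerical safety fallback
```

Models that recently improved subproblem fitness receive proportionally larger selection probabilities, while the eps floor preserves a small amount of exploration.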

3.2. The Details of MOEA/D-AMG

The above adaptive multiple Gaussian process models are just used as the recombination operator, which can be embedded into a state-of-the-art MOEA based on decomposition (MOEA/D [9]), giving the proposed algorithm MOEA/D-AMG. In this section, the details of the proposed MOEA/D-AMG are introduced.

To clarify the running of MOEA/D-AMG, its pseudocode is given in Algorithm 2 and the parameters used are introduced below:
(1) N indicates the number of subproblems and also the size of the population.
(2) K is the number of Gaussian process models.
(3) G is the current generation.
(4) δ is the parameter applied to control the balance between exploitation and exploration.
(5) gss is the number of randomly selected subproblems for constructing the global Gaussian model.
(6) LP is the parameter to control the frequency of using the proposed adaptive strategy.
(7) nr is the maximal number of old solutions that are allowed to be replaced by a new one.
(8) perm(·) randomly shuffles the input values, and rand() produces a random real number in [0, 1].

(1) Initialize N subproblems, including the weight vectors λ^1, …, λ^N, N individuals, and the neighborhood size T
(2) Initialize the reference point z* = (z*_1, …, z*_m), z*_j, j = 1, 2, …, m
(3) Predefine the K Gaussian distributions and set their initial selection probabilities Pr_{k,0} = 1/K, k = 1, …, K
(4) while not terminate do
(5)  for i ∈ perm({1, 2, …, N}) do
 Model construction
(6)   if rand() < δ then
(7)    Define the neighborhood individuals B = B_i
(8)   else
(9)    Select gss individuals randomly from the population to construct B
(10)  end
 Model update
(11)  if mod(G, LP) = 0 then
(12)   Update Pr_{k,G} using equations (5)–(8)
(13)  end
(14)  Select a Gaussian model according to Pr_{k,G}
(15)  Generate y based on the selected Gaussian model
(16)  Add B and y into Algorithm 1 to generate a new offspring x
 Population update
(17)  Set the counter c = 0
(18)  for each j ∈ B do
(19)   if g^{te}(x | λ^j, z*) ≤ g^{te}(x^j | λ^j, z*) and c < nr then
(20)    x^j is replaced by x
(21)    Set c = c + 1
(22)   end
(23)  end
(24) end
(25) G = G + 1
(26) Update the reference point z*
(27) end
(28) Output the final population

Lines 1-2 of Algorithm 2 describe the initialization process: the population size N is set, and N weight vectors are generated using the approach in [56] to define the N subproblems in (5). Then, an initial population of N individuals is randomly generated, and the neighborhood size is set to select T neighboring subproblems for constructing the Gaussian model. The reference point z* is obtained by collecting the minimal value of each objective. In line 3, the multiple Gaussian distributions are defined, and the corresponding initial probabilities are set to an equal value. A maximum number of generations is used as the termination condition in line 4, and all the subproblems are randomly traversed once at each generation in line 5. After that, the main evolutionary loop of the proposed algorithm runs in lines 5–26. The subproblem selected at each iteration goes through three components, namely model construction, model update, and population update, as observed from Algorithm 2.

Lines 6–10 run the process of model construction. As controlled by the parameter δ, the set of subproblems B is either the neighborhood B_i of the current subproblem in line 7 or gss subproblems selected randomly in line 9. If the neighborhood of the current subproblem is used, local multiple Gaussian models are constructed for exploitation; otherwise, global models are constructed for exploration.

Lines 11–16 implement the process of model update. The adaptive strategy described in Section 3.1 is used to update the probabilities of the five different Gaussian models every LP generations, as shown in line 12. Then, a Gaussian model is selected in line 14, and the selected model is used to generate y in line 15. At last, Algorithm 1 is performed in line 16 to generate a new offspring from the selected subproblems B used for model construction and the generated Gaussian variable y.

Lines 17–23 give the process of population update. For each individual indexed by B in line 18, if the aggregated function value of the corresponding subproblem computed by equation (5) is improved by the new offspring x in line 19, the original individual is replaced by x in line 20. The counter c is increased by 1 in line 21 to prevent premature convergence, such that at most nr individuals in B can be replaced.
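The replacement step of lines 17–23, together with the Tchebycheff aggregation of equation (5), can be sketched as follows (an illustrative sketch; the function names and data layout are our assumptions):

```python
def tchebycheff(fx, lam, z):
    """Equation (5): g_te(x | lambda, z*) = max_j lam_j * |f_j(x) - z_j*|."""
    return max(l * abs(f - zz) for l, f, zz in zip(lam, fx, z))

def update_population(B, pop_f, lams, z, fx_child, nr=2):
    """Replace at most nr subproblems in B whose aggregated value is
    improved by the child's objective vector fx_child.
    pop_f[j] holds the current objective vector of subproblem j."""
    c = 0
    for j in B:
        if c >= nr:
            break  # cap replacements to preserve diversity
        if tchebycheff(fx_child, lams[j], z) < tchebycheff(pop_f[j], lams[j], z):
            pop_f[j] = fx_child
            c += 1
    return c
```

Capping the number of replacements at nr keeps a single strong offspring from taking over a whole neighborhood in one generation.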

At last, the generation counter G is incremented by 1, and the reference point z* is updated with the new minimal value of each objective. If the termination condition in line 4 is not satisfied, the evolutionary loop in lines 5–26 runs again; otherwise, the final population is output in line 28.

4. Experimental Studies

4.1. Test Instances

Twenty-eight unconstrained MOP test instances are employed here as benchmark problems for the empirical studies. To be specific, UF1-UF10 are the benchmark problems from the CEC2009 MOEA competition, and F1–F9 are proposed in [48]. These test instances have complicated PS shapes. We also consider the WFG test suite [49] with different problem characteristics, including nonseparable, deceptive, and degenerate problems, mixed PF shapes, and variable dependencies. The number of decision variables is set to 30 for UF1-UF10 and F1–F9; for WFG1-WFG9, the numbers of position- and distance-related decision variables are, respectively, set to 2 and 4, while the number of objectives is set to 2.

4.2. Performance Metrics
4.2.1. The Inverted Generational Distance (IGD) [73]

Here, assume that $P^*$ is a set of solutions uniformly sampled from the true PF and $S$ represents a solution set obtained by a MOEA [73]. The IGD value from $P^*$ to $S$ calculates the average distance from each point of $P^*$ to the nearest solution of $S$ in the objective space, as follows:

$$IGD(P^*, S) = \frac{\sum_{x \in P^*} dist(x, S)}{|P^*|},$$

where $dist(x, S)$ returns the minimal distance from a solution $x$ in $P^*$ to the solutions in $S$, and $|P^*|$ returns the size of $P^*$.
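The IGD computation follows directly from its definition (a minimal sketch; the function name is ours):

```python
import math

def igd(ref_set, approx_set):
    """Average, over the reference points sampled from the true PF, of
    the Euclidean distance to the nearest obtained solution."""
    def dist(x, S):
        return min(math.dist(x, s) for s in S)
    return sum(dist(x, approx_set) for x in ref_set) / len(ref_set)
```

Because every reference point contributes a distance term, a low IGD requires the obtained set to be both close to and well spread along the true PF.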

4.2.2. Hypervolume (HV) [74]

Here, assume that $Z^r = (z_1^r, \ldots, z_m^r)$ is a reference point in the objective space that is dominated by all Pareto-optimal objective vectors [74]. The HV metric measures the size of the objective space dominated by the solutions in $S$ and bounded by $Z^r$, as follows:

$$HV(S) = VOL\left(\bigcup_{x \in S} \left[f_1(x), z_1^r\right] \times \cdots \times \left[f_m(x), z_m^r\right]\right),$$

where $VOL(\cdot)$ indicates the Lebesgue measure. In our empirical studies, fixed reference points are, respectively, set for the biobjective and three-objective cases of the UF and F test problems; due to the differently scaled values in different objectives, a separate reference point is set for the WFG test suite.
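For the biobjective case, the HV value can be computed exactly by sweeping the nondominated points in increasing order of the first objective and accumulating rectangular slices (a minimal sketch for minimization; the function name is ours):

```python
def hv_2d(points, ref):
    """Exact hypervolume of a biobjective minimization set: sort the
    points by f1, then sum the rectangular slice each nondominated
    point adds between its f2 value and the previous one."""
    pts = sorted(p for p in points if p[0] < ref[0] and p[1] < ref[1])
    vol, prev_f2 = 0.0, ref[1]
    for f1, f2 in pts:
        if f2 < prev_f2:                       # skip dominated points
            vol += (ref[0] - f1) * (prev_f2 - f2)
            prev_f2 = f2
    return vol
```

For example, with points (1, 2) and (2, 1) and reference point (3, 3), the two dominated rectangles overlap in a unit square, giving a hypervolume of 3. For three or more objectives, exact HV computation requires more elaborate algorithms.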

Both the IGD and HV metrics reflect the convergence and diversity of the solution set S simultaneously: a lower IGD value (or a larger HV value) indicates a better approximation of the entire true PF. In this paper, six competitive MOEAs, including MOEA/D-SBX [9], MOEA/D-DE [56], MOEA/D-GM [57], RM-MEDA [47], IM-MOEA [53], and AMEDA [50], are included to validate the performance of the proposed MOEA/D-AMG. All the comparison results obtained by these MOEAs regarding IGD and HV are presented in the corresponding tables, where the best mean metric values are highlighted in bold and italics. In order to draw statistically sound conclusions, Wilcoxon's rank sum test at a 5% significance level is conducted to assess the significance of the differences between MOEA/D-AMG and each compared algorithm.

4.3. The Parameter Settings

The parameters of MOEA/D-SBX, MOEA/D-DE, MOEA/D-GM, RM-MEDA, IM-MOEA, and AMEDA are, respectively, set according to references [9, 47, 53, 56, 57, 50]. All these MOEAs are implemented in MATLAB. The detailed parameter settings of all the algorithms are summarized as follows.

4.3.1. Public Parameters

For the population size N, we set N = 300 for biobjective problems and N = 600 for three-objective problems. Regarding the number of runs and the termination condition, each algorithm is independently run 20 times on each test instance, and the termination condition of an algorithm is a predefined maximal number of function evaluations, which is set to 300000 for the UF instances, 150000 for the F instances, and 200000 for the WFG instances. We set the neighborhood size T = 20 and the mutation probability p_m = 1/n, where n is the number of decision variables of each test problem, with the standard distribution index for polynomial mutation.

4.3.2. Parameters in MOEA/D-SBX

The crossover probability and distribution index are, respectively, set to 0.9 and 30.

4.3.3. Parameters in MOEA/D-DE

The crossover rate CR = 1 and the scaling factor F= 0.5 in DE as recommended in [9], the maximal number of solution replacement nr = 2, and the probability of selecting the neighboring subproblems .

4.3.4. Parameters in MOEA/D-GM

The neighborhood size for each subproblem is K = 15 for biobjective problems and K = 30 for three-objective problems (this neighborhood size K is crucial for generating offspring and updating parents in the evolutionary process); the parameter to balance exploitation and exploration is pn = 0.8; and the maximal number of old solutions that are allowed to be replaced by a new one is C = 2.

4.3.5. Parameters in RM-MEDA

The number of clusters K is set to 5 (this value indicates the number of disjoint clusters obtained by using local principal component analysis on population) and the maximum number of iterations in local PCA algorithm is set to 50.

4.3.6. Parameters in IM-MOEA

The number of reference vectors K is set to 10 and the model group size L is set to 3.

4.3.7. Parameters in AMEDA

The initial control probability β0 = 0.9, the history length H = 10, and the maximum number of clusters K = 5 (this value indicates the maximum number of local clusters obtained by using a hierarchical clustering analysis approach on the population).

4.3.8. Parameters in MOEA/D-AMG

The number of Gaussian process models is K = 5, with initial Gaussian distributions N(0, 0.6²), N(0, 0.8²), N(0, 1.0²), N(0, 1.2²), and N(0, 1.4²); LP = 10; nr = 2; and gss = 40.

4.4. Comparison Results and Discussion

In this section, the proposed MOEA/D-AMG is compared with six competitive MOEAs: MOEA/D-SBX, MOEA/D-DE, MOEA/D-GM, RM-MEDA, IM-MOEA, and AMEDA. The results obtained by these algorithms on the 28 test problems are given in Tables 1 and 2 for the IGD and HV metrics, respectively. Compared with the other algorithms, the performance of MOEA/D-AMG is significantly improved when the multiple Gaussian process models are used adaptively, and the proposed algorithm improves the performance of MOEA/D on most of the test problems. Table 1 summarizes the statistical results in terms of the IGD values obtained by the compared algorithms, where the best result on each test instance is highlighted. The Wilcoxon rank sum test is also adopted at a significance level of 0.05, where the symbols "+," "−," and "∼" indicate that the result obtained by a compared algorithm is significantly better than, significantly worse than, or statistically similar to that obtained by our algorithm MOEA/D-AMG, respectively. To be specific, MOEA/D-AMG shows the best results on 12 out of the 28 test instances, while the other compared algorithms achieve the best results on 3, 3, 5, 2, 2, and 1 out of the 28 problems in Table 1. When compared with MOEA/D-SBX, MOEA/D-DE, MOEA/D-GM, RM-MEDA, IM-MOEA, and AMEDA, MOEA/D-AMG yields 23, 21, 18, 22, 24, and 25 significantly better mean IGD values, respectively. Compared with RM-MEDA, IM-MOEA, and AMEDA, MOEA/D-AMG shows an absolute advantage, as it outperforms them on 22, 24, and 25 test problems regarding the IGD values, respectively, whereas RM-MEDA outperforms MOEA/D-AMG only on F8, UF3, WFG2, and WFG6, IM-MOEA only on F1, UF10, WFG4, and WFG6, and AMEDA only on WFG2, WFG4, and WFG6. Regarding the comparisons with MOEA/D-SBX and MOEA/D-DE, MOEA/D-AMG also shows clear advantages, as it outperforms them on at least 21 test problems.
Except for UF4-UF5, UF10, WFG1, and WFG4-WFG5, MOEA/D-AMG performs better in most cases regarding the IGD values. Since MOEA/D-GM uses a multivariate Gaussian model to capture the population distribution from a global view, it gives the best IGD values on several test instances; nevertheless, MOEA/D-AMG still performs better than MOEA/D-GM on 18 test problems. These statistical results indicate that MOEA/D-AMG has better optimization performance than the compared algorithms on most of the test instances.

For the HV metric shown in Table 2, the experimental results also demonstrate the superiority of MOEA/D-AMG on these test problems. It is clear from Table 2 that MOEA/D-AMG obtains the best results in 13 out of 28 cases, while MOEA/D-SBX, MOEA/D-DE, MOEA/D-GM, RM-MEDA, IM-MOEA, and AMEDA perform best in 2, 3, 5, 1, 2, and 2 cases, respectively. On the F instances, apart from MOEA/D-GM, MOEA/D-AMG is beaten only by IM-MOEA on F1, while MOEA/D-GM is slightly better than MOEA/D-AMG on F6–F8. On the UF problems, MOEA/D-AMG obtains significantly better HV results than MOEA/D-SBX and RM-MEDA in all cases; MOEA/D-DE performs better on UF4 and UF5, and MOEA/D-GM performs better on UF3–UF5 and UF7. IM-MOEA and AMEDA perform better than MOEA/D-AMG only on UF10 and UF6, respectively. For the WFG instances, MOEA/D-DE and RM-MEDA obtain statistically similar results to MOEA/D-AMG only on WFG5 and WFG9, respectively, and MOEA/D-AMG significantly outperforms MOEA/D-GM on WFG1–WFG9. Except for WFG1 and WFG4, MOEA/D-AMG shows superior performance on most WFG problems, and AMEDA has a better HV result only on WFG6. Therefore, it is reasonable to conclude that the proposed MOEA/D-AMG presents superior performance over each compared algorithm when all the test problems are considered.
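For bi-objective instances, the HV metric of a nondominated front can be computed by sweeping rectangles against a reference point. A minimal sketch for the two-objective minimization case (the reference point and front in the usage below are illustrative, not the paper's actual settings):

```python
def hypervolume_2d(front, ref):
    """Hypervolume of a bi-objective (minimization) front w.r.t. a
    reference point `ref`: sort by the first objective, then sum the
    rectangles swept between consecutive nondominated points.
    Points outside the reference box contribute nothing."""
    pts = [p for p in front if p[0] < ref[0] and p[1] < ref[1]]
    pts.sort()  # ascending f1
    hv, prev_f2 = 0.0, ref[1]
    for f1, f2 in pts:
        if f2 < prev_f2:  # skip points dominated by ones already swept
            hv += (ref[0] - f1) * (prev_f2 - f2)
            prev_f2 = f2
    return hv

# e.g. hypervolume_2d([(1, 3), (2, 2), (3, 1)], ref=(4, 4)) covers
# three non-overlapping rectangles of areas 3, 2, and 1
```

Dominated points are filtered implicitly by the `f2 < prev_f2` check, so the function can be fed a raw archive.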

Moreover, we also use the R2 indicator to further confirm the superior performance of MOEA/D-AMG, and similar conclusions can be drawn from Table 3.

To examine the convergence speed of the seven algorithms, the mean IGD values versus the number of fitness evaluations over 20 independent runs are plotted in Figures 2–4 for some representative problems from the F, UF, and WFG test suites. It can be observed from these figures that the mean IGD curves obtained by MOEA/D-AMG reach the lowest positions with the fastest search speed in most cases, including F2–F5, F9, UF1-UF2, UF8-UF9, WFG3, and WFG7-WFG8. Even for F1, F6–F7, UF6-UF7, and WFG9, MOEA/D-AMG achieves the second lowest mean IGD values in our experiments. The promising convergence speed of the proposed MOEA/D-AMG can be attributed to the adaptive strategy used with the multiple Gaussian process models.
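The IGD metric tracked in these convergence plots averages, over a reference set sampled from the true PF, the distance from each reference point to its nearest obtained solution. A minimal sketch:

```python
import math

def igd(reference_set, approx_front):
    """Inverted generational distance: mean Euclidean distance from each
    point of the true-PF reference set to the closest point of the
    obtained approximation (lower is better)."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return sum(min(dist(r, a) for a in approx_front)
               for r in reference_set) / len(reference_set)
```

Because the reference set covers the whole PF, IGD penalizes both poor convergence and gaps in diversity, which is why it is used as the main indicator here.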

To observe the final PFs, Figures 5 and 6 present the final nondominated fronts with the median IGD values found by each algorithm over 20 independent runs on F4-F5, F9, UF2, and UF8-UF9. Figure 5 shows that the final solutions of F4 yielded by RM-MEDA, IM-MOEA, and AMEDA do not reach the PF, while MOEA/D-GM and MOEA/D-AMG approximate the true PF well. Nevertheless, the solutions achieved by MOEA/D-AMG on the right end of the nondominated front converge better than those achieved by MOEA/D-GM. Figure 5 also suggests that F5 is a hard instance for all compared algorithms. This might be because the optimal solutions of two neighboring subproblems are not very close to each other, so that mating among the solutions of these neighboring subproblems brings little benefit. As a result, the final nondominated fronts of F5 are not uniformly distributed over the PF, especially at the right end; however, the proposed MOEA/D-AMG still outperforms the other algorithms on F5 in terms of both convergence and diversity. For F9, also plotted in Figure 5, all algorithms except MOEA/D-GM and MOEA/D-AMG perform poorly in approximating the true PF. With respect to the UF problems in Figure 6, the final solutions with median IGD obtained by MOEA/D-AMG converge better and spread uniformly over the whole PF when compared with the solutions obtained by the other algorithms. These visual comparisons reveal that MOEA/D-AMG generates satisfactory final solutions with better convergence and diversity far more stably on the test instances.

Based on the above analysis, it can be concluded that MOEA/D-AMG performs better than MOEA/D-SBX, MOEA/D-DE, MOEA/D-GM, RM-MEDA, IM-MOEA, and AMEDA when all the F, UF, and WFG instances are considered, which is attributed to the adaptive strategy for selecting a more suitable Gaussian model for recombination. Different causes may explain why the compared algorithms do not perform so well on these test suites. RM-MEDA and AMEDA select the individuals used for model construction with a clustering method, which cannot maintain diversity as well as the MOEA/D framework. Both of these EDA-based algorithms first classify the individuals and then use all individuals of the same class as training samples for the model; implicitly, an individual of the same class is treated as a neighboring individual. However, on the boundary of a class there may exist individuals that are far away from each other, while individuals of other classes are closer to such a boundary individual, so the constructed model may not be accurate. For IM-MOEA, which partitions the objective space with reference vectors, the training data become too small when dealing with problems with a low-dimensional decision space, like some test instances used in our empirical studies.

4.5. Effectiveness of the Proposed Adaptive Strategy

According to the experimental results in the previous section, MOEA/D-AMG with the adaptive strategy shows great advantages when compared with other algorithms. In order to illustrate the effectiveness of the proposed adaptive strategy, we design some experiments for in-depth analysis.
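The adaptive strategy itself is specified in the earlier sections of the paper; purely as an illustration of the kind of credit-assignment rule involved, the sketch below selects among the K Gaussian models with probability proportional to their recent improvement rewards. The probability-matching form, the floor `pmin`, and the function names are assumptions for illustration, not the paper's exact rule:

```python
import random

def choose_model(rewards, pmin=0.05):
    """Illustrative probability-matching selection: each Gaussian model k
    carries a reward (e.g., the sum of subproblem fitness improvements it
    produced over the last LP generations); selection probability is
    proportional to reward, with a floor pmin so that every model keeps
    being sampled and can recover later in the run."""
    k = len(rewards)
    total = sum(rewards)
    if total == 0:                       # no signal yet: uniform choice
        probs = [1.0 / k] * k
    else:
        probs = [pmin + (1 - k * pmin) * r / total for r in rewards]
    r, acc = random.random(), 0.0        # roulette-wheel draw
    for i, p in enumerate(probs):
        acc += p
        if r <= acc:
            return i
    return k - 1
```

With K = 5 and one dominant reward, the corresponding model is drawn roughly 80% of the time under these assumed settings, while the floor keeps the other models alive for exploration.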

In this section, an algorithm called MOEA/D-FMG is used for this comparison experiment. The difference between MOEA/D-FMG and MOEA/D-AMG is that MOEA/D-FMG adopts one fixed Gaussian distribution to control the generation of new offspring, without the adaptive strategy, during the whole evolution. Here, five MOEA/D-FMG variants with the distributions (0, 0.6²), (0, 0.8²), (0, 1.0²), (0, 1.2²), and (0, 1.4²) are run on all the F, UF, and WFG instances. For a fair comparison, the other components of MOEA/D-FMG are kept the same as in MOEA/D-AMG. The statistical results of the IGD metric values obtained by MOEA/D-AMG and the MOEA/D-FMG variants with different distributions over 20 independent runs are presented in Table 4 for the used test problems.
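The reproduction step of the MOEA/D-FMG variants can be pictured as sampling offspring with a fixed variance. The sketch below is a simplified stand-in (the paper's operator also involves mating among neighboring subproblems); only the fixed N(0, σ²) perturbation with boundary repair is shown, and the scaling by the variable range is an assumption for illustration:

```python
import random

def gaussian_offspring(parent, sigma, bounds):
    """Hypothetical sketch of fixed-distribution reproduction: perturb
    every decision variable of a parent with noise drawn from
    N(0, sigma^2), scaled by the variable range, then clip the result
    back into the box bounds."""
    child = []
    for x, (lo, hi) in zip(parent, bounds):
        y = x + random.gauss(0.0, sigma) * (hi - lo)
        child.append(min(max(y, lo), hi))   # boundary repair by clipping
    return child
```

Under this reading, a larger fixed σ produces wider, less regular jumps for the whole run, which matches the degradation observed below for the large-variance variants.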

As observed from Table 4, MOEA/D-AMG obtains the best mean IGD values on 17 out of 28 cases, while the MOEA/D-FMG variants together achieve the best mean IGD values on the remaining 11 cases. More specifically, the five MOEA/D-FMG variants with the distributions (0, 0.6²), (0, 0.8²), (0, 1.0²), (0, 1.2²), and (0, 1.4²) obtain the best IGD results on 4, 1, 3, 2, and 1 out of the 28 comparisons, respectively. As observed from the one-to-one comparisons in the last row of Table 4, MOEA/D-AMG performs better than these five variants on 21, 21, 21, 22, and 25 out of 28 comparisons, respectively, whereas it is beaten by them on only 4, 4, 6, 6, and 3 comparisons, respectively. Therefore, it can be concluded from Table 4 that the proposed adaptive strategy for selecting a suitable Gaussian model contributes significantly to the superior performance of MOEA/D-AMG in solving MOPs.

The above analysis only considers five distributions with variances from 0.6² to 1.4². The performance of Gaussian distributions with larger variances is further studied here. Thus, three additional MOEA/D-FMG variants with the fixed distributions (0, 1.6²), (0, 1.8²), and (0, 2.0²) are used for comparison. Table 5 lists the comparative IGD results on all the adopted instances. MOEA/D-AMG again shows an obvious advantage, as it performs best in 24 out of 28 cases, while the three competitors fail to obtain good performance. In addition, we observe that the performance of the competitors worsens as the variance increases. The likely reason is that, with a variance that is too large throughout the whole evolutionary process, the offspring become irregular, leading to poor convergence; it is therefore not surprising that performance degrades with larger variances. The results in Table 5 also explain why we only choose Gaussian distributions from (0, 0.6²) to (0, 1.4²) for MOEA/D-AMG.

4.6. Parameter Analysis

In the above comparisons, our algorithm MOEA/D-AMG is initialized with K = 5, i.e., five Gaussian models (0, 0.6²), (0, 0.8²), (0, 1.0²), (0, 1.2²), and (0, 1.4²). In this section, we test the proposed algorithm with different values of K (3, 5, 7, and 10), each using K kinds of Gaussian distributions. To clarify the experimental setting, the standard deviations of the Gaussian models adopted for each K value are listed in Table 6. As shown by the IGD results in Table 7, K = 5 is a preferable number of Gaussian models for MOEA/D-AMG, since it achieves the best IGD results in 16 of 28 cases. It can also be found that the average IGD values increase with K for most test problems. This is because the population size in MOEA/D-AMG is relatively small, which makes little sense of a large K value; thus, a large K has little effect on improving the performance of the algorithm. On the other hand, K = 3 seems to be too small to take full advantage of the proposed adaptive multiple Gaussian process models. According to the above analysis, we recommend K = 5.

5. Conclusions and Future Work

In this paper, a decomposition-based multiobjective evolutionary algorithm with adaptive multiple Gaussian process models (called MOEA/D-AMG) has been proposed for solving MOPs. The multiple Gaussian process models used in MOEA/D-AMG help to solve various kinds of MOPs. To enhance the search capability, an adaptive strategy is developed to select a more suitable Gaussian process model, based on the contributions to the optimization performance over all the decomposed subproblems. To investigate the performance of MOEA/D-AMG, twenty-eight test MOPs with complicated PF shapes are adopted, and the experiments show that MOEA/D-AMG has clear advantages in most cases over six competitive MOEAs. In addition, further experiments have verified that the adopted adaptive strategy significantly improves the performance of MOEA/D-AMG.

Compared with generic recombination operators and other Gaussian process-based recombination operators, the proposed method based on multiple Gaussian process models is effective on most of the test problems. However, in MOEA/D-AMG, the number of Gaussian models built in each generation equals the number of subproblems, which still incurs a considerable computational cost. In future work, we will try to enhance the computational efficiency of the proposed MOEA/D-AMG; it is also interesting to study its potential application to many-objective optimization problems and engineering problems.

Data Availability

The source code and data are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This study was supported by the National Natural Science Foundation of China under grant nos. 61876110, 61836005, and 61672358, Joint Funds of the National Natural Science Foundation of China under Key Program Grant U1713212, Natural Science Foundation of Guangdong Province under grant no. 2017A030313338, and Shenzhen Technology Plan under grant no. JCYJ20170817102218122. This study was also supported by the National Engineering Laboratory for Big Data System Computing Technology and the Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ), Shenzhen University.