Alternative Restart Strategies for CMA-ES

by Ilya Loshchilov, et al.

This paper focuses on the restart strategy of CMA-ES on multi-modal functions. A first alternative strategy proceeds by decreasing the initial step-size of the mutation while doubling the population size at each restart. A second strategy adaptively allocates the computational budget among the restart settings in the BIPOP scheme. Both restart strategies are validated on the BBOB benchmark; their generality is also demonstrated on an independent real-world problem suite related to spacecraft trajectory optimization.






1 Introduction

The long track record of the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) on real-world problems (with over 100 published applications [6]) is due, among other things, to its good behavior on multi-modal functions. Two versions of CMA-ES with restarts have been proposed to handle multi-modal functions: IPOP-CMA-ES [2] was ranked first on the continuous optimization benchmark at CEC 2005 [4, 3], and BIPOP-CMA-ES [5] showed the best results, together with IPOP-CMA-ES, on the black-box optimization benchmark (BBOB) in 2009 and 2010.

This paper focuses on analyzing and improving the restart strategy of CMA-ES, viewed as a noisy hyper-parameter optimization problem in a 2D space (population size, initial step-size). Two restart strategies are defined. The first one, NIPOP-aCMA-ES (New IPOP-aCMA-ES), differs from IPOP-CMA-ES as it simultaneously increases the population size and decreases the step size. The second one, NBIPOP-aCMA-ES, allocates computational power to different restart settings depending on their current results. While these strategies have been designed with the BBOB benchmarks in mind [8], their generality is shown on a suite of real-world problems [16].

The paper is organized as follows. After describing the weighted active $(\mu/\mu_w, \lambda)$-CMA-ES and its current restart strategies (Section 2), the proposed restart schemes are described in Section 3. Section 4 reports on their experimental validation. The paper concludes with a discussion and some perspectives for further research.

2 The Weighted Active $(\mu/\mu_w, \lambda)$-CMA-ES

The CMA-ES algorithm is a stochastic optimizer, searching the continuous space $\mathbb{R}^D$ by sampling $\lambda$ candidate solutions from a multivariate normal distribution [10, 9]. It exploits the best $\mu$ solutions out of the $\lambda$ ones to adaptively estimate the local covariance matrix of the objective function, in order to increase the probability of successful samples in the next iteration. The information about the remaining (worst $\lambda - \mu$) solutions is used only implicitly during the selection process.

In active $(\mu/\mu_I, \lambda)$-CMA-ES, however, it has been shown that the worst solutions can be exploited to reduce the variance of the mutation distribution in unpromising directions [12], yielding a performance gain of a factor 2 for the active $(\mu/\mu_I, \lambda)$-CMA-ES with no loss of performance on any of the tested functions. A recent extension of the $(\mu/\mu_w, \lambda)$-CMA-ES, weighted active CMA-ES [11] (referred to as aCMA-ES for brevity), shows comparable improvements on a set of noiseless and noisy functions from the BBOB benchmark suite [7]. On the other hand, aCMA-ES no longer guarantees that the covariance matrix is positive definite, possibly resulting in algorithmic instability. The instability issues can however be numerically controlled during the search; as a matter of fact, they are never observed on the BBOB benchmark suite.

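At iteration $t$, $(\mu/\mu_w, \lambda)$-CMA-ES samples $\lambda$ individuals according to

$$x_k^{(t+1)} \sim \mathcal{N}\left(m^{(t)}, {\sigma^{(t)}}^2 C^{(t)}\right), \quad k = 1, \ldots, \lambda,$$

where $\mathcal{N}(m, C)$ denotes a normally distributed random vector with mean $m$ and covariance matrix $C$.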

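These $\lambda$ individuals are evaluated and ranked, where index $i{:}\lambda$ denotes the $i$-th best individual according to the objective function. The mean of the distribution is updated and set to the weighted sum of the best $\mu$ individuals ($m^{(t+1)} = \sum_{i=1}^{\mu} w_i \, x_{i:\lambda}^{(t+1)}$, with $w_i > 0$ for $i = 1, \ldots, \mu$ and $\sum_{i=1}^{\mu} w_i = 1$).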
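The sampling and recombination steps described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the Cholesky-factor argument `chol` and the log-rank form of the weights are assumptions of this sketch (the text only requires positive weights that sum to one).

```python
import math
import random

def sample_population(mean, sigma, chol, lam, rng):
    """Draw lam points x = m + sigma * A z with z ~ N(0, I), where A A^T = C.

    chol is assumed to be a lower-triangular Cholesky factor A of C.
    """
    dim = len(mean)
    population = []
    for _ in range(lam):
        z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        step = [sum(chol[r][c] * z[c] for c in range(r + 1)) for r in range(dim)]
        population.append([mean[r] + sigma * step[r] for r in range(dim)])
    return population

def recombination_weights(mu):
    """Positive, decreasing weights normalised to sum to 1 (assumed log-rank form)."""
    raw = [math.log(mu + 0.5) - math.log(i) for i in range(1, mu + 1)]
    total = sum(raw)
    return [w / total for w in raw]

def update_mean(population, fitness, mu):
    """New mean: weighted sum of the mu best individuals (minimisation)."""
    ranked = sorted(population, key=fitness)[:mu]
    weights = recombination_weights(mu)
    dim = len(ranked[0])
    return [sum(w * x[d] for w, x in zip(weights, ranked)) for d in range(dim)]

def sphere(x):
    """Toy objective used only for this illustration."""
    return sum(v * v for v in x)

rng = random.Random(1)
identity = [[1.0, 0.0], [0.0, 1.0]]  # C = I, so A = I
pop = sample_population([3.0, -2.0], 0.5, identity, lam=6, rng=rng)
new_mean = update_mean(pop, sphere, mu=3)
```

Only the mean update is shown; the step-size and covariance adaptation that make up the rest of CMA-ES are deliberately left out.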

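The active CMA-ES only differs from the original CMA-ES in the adaptation of the covariance matrix $C^{(t)}$. As in CMA-ES, a covariance matrix is computed from the best $\mu$ solutions, $C_\mu^{+} = \sum_{i=1}^{\mu} w_i \, \frac{x_{i:\lambda} - m^{(t)}}{\sigma^{(t)}} \times \frac{(x_{i:\lambda} - m^{(t)})^{T}}{\sigma^{(t)}}$. The main novelty is to exploit the worst solutions to compute $C_\mu^{-} = \sum_{i=0}^{\mu-1} w_{i+1} \, y_{\lambda-i:\lambda} \, y_{\lambda-i:\lambda}^{T}$, where $y_{\lambda-i:\lambda} = \frac{\| {C^{(t)}}^{-1/2} (x_{\lambda-\mu+1+i:\lambda} - m^{(t)}) \|}{\| {C^{(t)}}^{-1/2} (x_{\lambda-i:\lambda} - m^{(t)}) \|} \times \frac{x_{\lambda-i:\lambda} - m^{(t)}}{\sigma^{(t)}}$. The covariance matrix estimation of these worst solutions is used to decrease the variance of the mutation distribution along these directions:

$$C^{(t+1)} = \left(1 - c_1 - c_\mu + c^{-}\alpha^{-}_{old}\right) C^{(t)} + c_1 \, p_c^{(t+1)} {p_c^{(t+1)}}^{T} + \left(c_\mu + c^{-}(1 - \alpha^{-}_{old})\right) C_\mu^{+} - c^{-} C_\mu^{-},$$

where $p_c^{(t+1)}$ is adapted along the evolution path and coefficients $c_1$, $c_\mu$, $c^{-}$ and $\alpha^{-}_{old}$ are defined such that $c_1 + c_\mu - c^{-}\alpha^{-}_{old} \le 1$. The interested reader is referred to [10, 11] for a more detailed description of these algorithms.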

As mentioned, CMA-ES has been extended with restart strategies to accommodate multi-modal fitness landscapes, and specifically to handle objective functions with many local optima. As observed in [9], the probability of reaching the optimum (and the overall number of function evaluations needed to do so) is very sensitive to the population size. The default population size has been tuned for uni-modal functions; it is hardly large enough for multi-modal functions. Accordingly, [2] proposed a “doubling trick” restart strategy to enforce global search: the restart $(\mu/\mu_w, \lambda)$-CMA-ES with increasing population, called IPOP-CMA-ES, is a multi-restart strategy where the population size is doubled at each restart until a stopping criterion is met.
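As a small illustration of the doubling schedule, the sketch below lists the population sizes used by successive IPOP restarts. The default population size $4 + \lfloor 3 \ln D \rfloor$ is the commonly used CMA-ES default and is not stated in the text above, so it should be read as an assumption of this example, as should the function names.

```python
import math

def default_population_size(dim):
    """Commonly used CMA-ES default: 4 + floor(3 ln D) (assumed here)."""
    return 4 + math.floor(3 * math.log(dim))

def ipop_population_sizes(dim, n_restarts):
    """Population sizes of the initial run and of each IPOP restart (doubling)."""
    lam0 = default_population_size(dim)
    return [lam0 * 2 ** i for i in range(n_restarts + 1)]

print(ipop_population_sizes(20, 3))  # [12, 24, 48, 96]
```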

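The BIPOP-CMA-ES instead considers two restart regimes. The first one, which corresponds to IPOP-CMA-ES, doubles the population size $\lambda_{large}$ in each restart and uses a fixed initial step-size $\sigma^0_{large} = \sigma^0_{default}$.
The second regime uses a small population size $\lambda_s$ and initial step-size $\sigma^0_s$, which are randomly drawn in each restart as:

$$\lambda_s = \left\lfloor \lambda_{default} \left( \frac{1}{2} \frac{\lambda_{large}}{\lambda_{default}} \right)^{\mathcal{U}[0,1]^2} \right\rfloor, \qquad \sigma^0_s = \sigma^0_{default} \times 10^{-2\,\mathcal{U}[0,1]},$$

where $\mathcal{U}[0,1]$ stands for the uniform distribution in $[0, 1]$. Population size $\lambda_s$ thus varies in $[\lambda_{default}, \lambda_{large}/2]$. BIPOP-CMA-ES launches the first run with default population size and initial step-size. In each restart, it selects the restart regime that has consumed fewer function evaluations so far. Clearly, a run of the second regime consumes fewer function evaluations than a run of the doubling regime; it is therefore launched more often.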
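The random draw of the second BIPOP regime can be sketched as below. This is only an illustration under the assumption that the small population size is drawn as the default size times $(\frac{1}{2}\lambda_{large}/\lambda_{default})^{u^2}$ and the step-size as the default times $10^{-2u}$, with independent uniform draws $u$; the function and argument names are hypothetical.

```python
import math
import random

def sample_small_regime(lam_default, lam_large, sigma_default, rng):
    """Draw (population size, initial step-size) for the small-population regime."""
    u_lam = rng.random()    # uniform in [0, 1)
    u_sigma = rng.random()  # independent uniform draw
    ratio = 0.5 * lam_large / lam_default
    lam_small = math.floor(lam_default * ratio ** (u_lam ** 2))
    sigma_small = sigma_default * 10.0 ** (-2.0 * u_sigma)
    return lam_small, sigma_small

rng = random.Random(0)
lam_s, sigma_s = sample_small_regime(lam_default=12, lam_large=96,
                                     sigma_default=1.0, rng=rng)
```

Squaring the uniform draw biases the population size towards the default value, while the step-size is spread log-uniformly over two orders of magnitude.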

3 Alternative Restart Strategies

3.1 Preliminary Analysis

The restart strategies of IPOP- and BIPOP-CMA-ES are viewed as a search in the hyper-parameter space.

IPOP-CMA-ES only aims at adjusting the population size $\lambda$. It is motivated by the results observed on multi-modal problems [9], suggesting that the population size must be sufficiently large to handle problems with global structure. In such cases, a large population size is needed to uncover this global structure and to lead the algorithm to the global optimum. IPOP-CMA-ES thus increases the population size in each restart, irrespective of the results observed so far; at each restart, it launches a new CMA-ES with population size $\lambda = \rho_{inc}^{i_{restart}} \lambda_{default}$ (see Fig. 1). Factor $\rho_{inc}$ must not be too large, to avoid “overjumping” some possibly optimal population size $\lambda^*$; it must also not be too small, in order to reach $\lambda^*$ in a reasonable number of restarts. The use of the doubling trick ($\rho_{inc} = 2$) guarantees that the loss in terms of function evaluations (compared to the “oracle” restart strategy, which would directly set the population size to the optimal value $\lambda^*$) is about a factor of 2.
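The factor-of-2 claim can be checked with a short computation. The sketch below assumes, for illustration only, that the cost of one run is proportional to its population size and that the optimal size is reached exactly at restart `k_star`; neither assumption is stated in the text.

```python
def doubling_overhead(k_star, rho_inc=2):
    """Cost of runs 0..k_star relative to one run at the optimal population size.

    Assumes run cost is proportional to population size (illustrative only).
    """
    total = sum(rho_inc ** i for i in range(k_star + 1))
    return total / rho_inc ** k_star

for k in (1, 5, 10):
    print(k, doubling_overhead(k))  # ratios stay below 2
```

Under these assumptions the total cost is a geometric sum, $(2^{k+1} - 1)/2^k$, which approaches 2 from below as the number of restarts grows.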

Figure 1: Restart performances in the 2D hyper-parameter space (population size and initial mutation step-size, in logarithmic coordinates). For each objective function (20-dimensional Rastrigin, top-left; Gallagher 21 peaks, top-right; Katsuura, bottom-left; Lunacek bi-Rastrigin, bottom-right), the median best function value out of 15 runs is indicated. The legend markers indicate whether the optimum, up to the target precision, is found always, sometimes or never. Black regions are better than white ones.

On the 20-D Rastrigin function, IPOP-CMA-ES performs well and always finds the optimum after about 5 restarts (Fig. 1, top-left). The Rastrigin function indeed displays a global structure whose minimizer is the optimum. For such functions, IPOP-CMA-ES certainly is the method of choice. For some other functions, such as the Gallagher function, there is no such global structure; increasing the population size does not improve the results. On the Katsuura and Lunacek bi-Rastrigin functions, the optimum can only be found with a small initial step-size (smaller than the default one); this explains why these functions can be solved by BIPOP-CMA-ES, which samples the two-dimensional $(\lambda, \sigma^0)$ space.

Actually, the optimization of a multi-modal function by CMA-ES with restarts can be viewed as the optimization of the function $h(\theta)$, which returns the optimum found by CMA-ES defined by the hyper-parameters $\theta = (\lambda, \sigma^0)$. Function $h(\theta)$, graphically depicted in Fig. 1, can be viewed as a black-box, computationally expensive and stochastic function (reflecting the stochasticity of CMA-ES). Both IPOP-CMA-ES and BIPOP-CMA-ES are based on implicit assumptions about $h(\theta)$: IPOP-CMA-ES follows a deterministic one-dimensional trajectory, while BIPOP-CMA-ES randomly samples the two-dimensional search space.

Function $h(\theta)$ can also be viewed as a multi-objective fitness, since in addition to the solution found by CMA-ES, $h(\theta)$ could return the number of function evaluations needed to find that solution. $h(\theta)$ could also return the computational effort SP1 (i.e. the average number of function evaluations of all successful runs, divided by the proportion of successful runs). However, SP1 can only be known for benchmark problems where the optimum is known; when the empirical optimum is used in lieu of the true optimum, SP1 can only be computed a posteriori.
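The SP1 measure defined above is straightforward to compute from per-run records. The following sketch uses hypothetical argument names and returns infinity when no run succeeds, which is a convention chosen for this example.

```python
def sp1(evaluations, successes):
    """SP1: mean evaluations of successful runs divided by the success rate.

    evaluations[i] is the number of function evaluations used by run i,
    successes[i] tells whether run i reached the optimum.
    """
    successful = [e for e, ok in zip(evaluations, successes) if ok]
    if not successful:
        return float("inf")  # convention of this sketch: no success, infinite effort
    mean_evals = sum(successful) / len(successful)
    success_rate = len(successful) / len(evaluations)
    return mean_evals / success_rate

print(sp1([100, 200, 300, 400], [True, True, False, False]))  # 300.0
```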

3.2 Algorithm

Figure 2: An illustration of the distribution of the $\lambda$ and $\sigma^0$ hyper-parameters for 9 restarts of IPOP-aCMA-ES, BIPOP-aCMA-ES (for 10 runs), NIPOP-aCMA-ES and NBIPOP-aCMA-ES. The first run of all algorithms corresponds to the point with default population size and default initial step-size.

Two new restart strategies for CMA-ES, respectively referred to as NIPOP-aCMA-ES and NBIPOP-aCMA-ES, are presented in this paper.

If the restart strategy is restricted to increasing the population size (IPOP), we propose to use NIPOP-aCMA-ES, where we additionally decrease the initial step-size by some factor $\rho_{\sigma dec}$. The rationale behind this approach is that CMA-ES with a relatively small initial step-size is able to explore small basins of attraction (see the Katsuura and Lunacek bi-Rastrigin functions in Fig. 1), while with an initially large step-size and population size it will neglect the local structure of the function and converge to the minimizer of the global structure. Moreover, an initially small step-size will quickly increase when this is beneficial, which allows the algorithm to recover the same global search properties as with an initially large step-size (see the Rastrigin function in Fig. 1).

NIPOP-aCMA-ES thus explores the two-dimensional hyper-parameter space in a deterministic way (see Fig. 2). For $\rho_{\sigma dec} = 1.6$ used in this study, NIPOP-aCMA-ES reaches the lower bound ($\sigma^0 = 10^{-2} \sigma^0_{default}$) used by BIPOP-CMA-ES after about 9 restarts, and is expected to reach the same performance as BIPOP-CMA-ES although it only uses large populations.
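The deterministic NIPOP trajectory in the hyper-parameter space can be tabulated with a few lines. In this sketch the step-size decrease factor of 1.6 and the doubling factor of 2 are taken as assumptions, the step-size is expressed relative to its default value, and the function name is hypothetical.

```python
def nipop_schedule(lam_default, n_restarts, rho_inc=2.0, rho_sigma_dec=1.6):
    """(population size, relative initial step-size) for run 0 and each restart."""
    return [
        (int(lam_default * rho_inc ** i), 1.0 / rho_sigma_dec ** i)
        for i in range(n_restarts + 1)
    ]

schedule = nipop_schedule(lam_default=12, n_restarts=10)
for i, (lam, rel_sigma) in enumerate(schedule):
    print(i, lam, round(rel_sigma, 4))
```

With these values the relative step-size is about 0.0146 after 9 restarts and drops below 0.01 at the 10th, which is consistent with the lower bound being reached after roughly 9 restarts.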

The second restart strategy, NBIPOP-aCMA-ES, addresses the case where the probability of finding the global optimum does not vary much in the $(\lambda, \sigma^0)$ space. Under this assumption, it makes sense to have many restarts for a fixed budget (number of function evaluations). Specifically, NBIPOP-aCMA-ES implements a competition between the NIPOP-aCMA-ES strategy (increasing $\lambda$ and decreasing the initial $\sigma^0$ in each restart) and a uniform sampling of the $\sigma^0$ space, where $\lambda$ is set to $\lambda_{default}$ and $\sigma^0 = \sigma^0_{default} \times 10^{-2\,\mathcal{U}[0,1]}$. The selection between the two (NIPOP-aCMA-ES and the uniform sampling) depends on the allowed budget, as in BIPOP-CMA-ES. The difference is that NBIPOP-aCMA-ES adaptively sets the budget allowed to each restart strategy: the restart strategy that led to the overall best solution found so far is allowed twice ($\rho_{budget} = 2$) the budget of the other strategy.
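One possible reading of this adaptive budget rule is sketched below. It is an interpretation, not the authors' code: each regime's consumed evaluations are divided by its budget share (2 for the regime holding the best solution so far, 1 for the other), and the regime with the smaller weighted consumption is launched next. The names `choose_regime`, `evals_used` and `best_fitness` are hypothetical.

```python
def choose_regime(evals_used, best_fitness, rho_budget=2.0):
    """Pick the regime to launch next (minimisation).

    evals_used[name]   : function evaluations consumed so far by that regime
    best_fitness[name] : best objective value found so far by that regime
    """
    leader = min(best_fitness, key=best_fitness.get)

    def weighted_usage(name):
        share = rho_budget if name == leader else 1.0
        return evals_used[name] / share

    return min(evals_used, key=weighted_usage)

used = {"nipop": 3000, "uniform_sigma": 2000}
print(choose_regime(used, {"nipop": 0.1, "uniform_sigma": 0.5}))  # nipop
print(choose_regime(used, {"nipop": 0.5, "uniform_sigma": 0.1}))  # uniform_sigma
```

In the first call the increasing-population regime holds the best solution, so it may spend up to twice as much as the other regime and is launched again even though it has already consumed more evaluations.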

4 Experimental Validation

The experimental validation of NIPOP-aCMA-ES and NBIPOP-aCMA-ES compares their performance to that of IPOP-aCMA-ES and BIPOP-aCMA-ES on the BBOB noiseless problems and on one black-box real-world problem related to spacecraft trajectory optimization. The default parameters of CMA-ES [11, 5] are used. This section also presents the first experimental study of BIPOP-aCMA-ES, the active version of BIPOP-CMA-ES [5]. (Footnote 2: For the sake of reproducibility, the source code for NIPOP-aCMA-ES and NBIPOP-aCMA-ES is available online.)

4.1 Benchmarking with BBOB Framework

The BBOB framework [7] is made of 24 noiseless and 30 noisy functions [8]. Only the noiseless case has been considered here. Furthermore, only the 12 multi-modal functions among these 24 noiseless functions are of interest for this study, as CMA-ES can solve the 12 other functions without any restart.

With the same experimental methodology as in [7], the results obtained on these benchmark functions are presented in Fig. 4 and Table 1. The results are given for dimension $d = 40$, because the differences are larger in higher dimensions. The expected running time (ERT), used in the figures and table, depends on a given target function value, $f_t = f_{opt} + \Delta f$. It is computed over all relevant trials as the number of function evaluations required in order to reach $f_t$, summed over all 15 trials, and divided by the number of trials that actually reached $f_t$ [7].
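The ERT definition above translates directly into code. In this sketch, `evaluations[i]` is the number of function evaluations spent by trial `i` (until the target was reached, or until the trial stopped if it never reached it); the argument names and the infinite value returned when no trial succeeds are conventions of this example.

```python
def expected_running_time(evaluations, reached_target):
    """ERT: evaluations summed over all trials, divided by the number of successes."""
    n_success = sum(1 for ok in reached_target if ok)
    if n_success == 0:
        return float("inf")  # convention of this sketch
    return sum(evaluations) / n_success

print(expected_running_time([1000, 2000, 5000], [True, True, False]))  # 4000.0
```

Unsuccessful trials still contribute their evaluations to the numerator, so a method that often fails is penalized even if its successful runs are fast.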

NIPOP-aCMA-ES. On 6 out of 12 test functions, NIPOP-aCMA-ES obtains the best known results for the BBOB-2009 and BBOB-2010 workshops. On the Katsuura and Lunacek bi-Rastrigin functions, NIPOP-aCMA-ES shows a speedup by a factor of 2 to 3, as could have been expected. It performs unexpectedly well on the Weierstrass function, 7 times faster than IPOP-aCMA-ES and almost 3 times faster than BIPOP-aCMA-ES. Overall, according to Fig. 4, NIPOP-aCMA-ES performs as well as BIPOP-aCMA-ES, while being restricted to a single regime of increasing population size.

NBIPOP-aCMA-ES. Thanks to the first regime of increasing population size, NBIPOP-aCMA-ES inherits some results of NIPOP-aCMA-ES. Moreover, on functions where the population size does not play any important role, it performs significantly better than BIPOP-aCMA-ES. This is the case for the Gallagher 101 peaks and Gallagher 21 peaks functions, where NBIPOP-aCMA-ES shows a speedup by a factor of 6. It seems that the adaptive choice between the two regimes works efficiently on all functions except Weierstrass. In this last case, NBIPOP-aCMA-ES mistakenly prefers small populations, with a loss by a factor of 4 compared to NIPOP-aCMA-ES. According to Fig. 4, NBIPOP-aCMA-ES performs better than BIPOP-aCMA-ES on weakly structured multi-modal functions, showing the overall best results for the BBOB-2009 and BBOB-2010 workshops in dimensions 20 (results not shown here) and 40.

Due to space limitations, the interested reader is referred to [13] for a detailed presentation of the results.

4.2 Interplanetary Trajectory Optimization

The NIPOP-aCMA-ES and NBIPOP-aCMA-ES strategies, designed for the BBOB benchmark functions, may overfit this benchmark suite. In order to test their generality, a real-world black-box problem from a completely different domain is considered: the Advanced Concepts Team of the European Space Agency makes available several difficult spacecraft trajectory optimization problems as black-box functions, inviting the operational research community to compare different derivative-free solvers on these test problems [16].

The following results consider the 18-dimensional bound-constrained black-box function “TandEM-Atlas501”, which defines an interplanetary trajectory from the Earth to Saturn with multiple fly-bys, launched by the Atlas 501 rocket. The final goal is to maximize the mass $m$ that can be delivered to Saturn using one of 24 possible fly-by sequences with possible maneuvers around Venus, Mars and Jupiter.

The first best result was found for the sequence Earth-Venus-Earth-Earth-Saturn in 2008 by B. Addis et al. [1]. The best result so far was found in 2011 by G. Stracquadanio et al. [15].

All versions of CMA-ES with restarts have been launched with a maximum budget of function evaluations. All variables are normalized to the range $[0, 1]$. When a point $x$ is sampled outside of the boundaries, the fitness is penalized: it is computed from the closest feasible point $x_{feasible}$ and a penalty term weighted by a penalty factor $\alpha$, whose value was set arbitrarily.
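A common way to implement this kind of boundary handling is sketched below. The quadratic form of the penalty, the value of `alpha` and the function names are assumptions of this example (written for a cost to be minimized) and are not taken from the experimental setup itself.

```python
def closest_feasible(x, lower=0.0, upper=1.0):
    """Project each coordinate of x onto the box [lower, upper]."""
    return [min(max(v, lower), upper) for v in x]

def penalized(cost, x, alpha):
    """Cost at the closest feasible point plus alpha times the squared distance to it."""
    x_feas = closest_feasible(x)
    dist_sq = sum((a - b) ** 2 for a, b in zip(x, x_feas))
    return cost(x_feas) + alpha * dist_sq

# Toy cost: sum of coordinates. The second coordinate violates the upper bound.
print(penalized(lambda p: sum(p), [0.5, 1.5], alpha=10.0))  # 4.0
```

A feasible point is returned unchanged by the projection, so its cost carries no penalty; the further a sample lies outside the box, the larger the added term.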

As shown in Fig. 3, the new restart strategies NIPOP-aCMA-ES and NBIPOP-aCMA-ES respectively improve on the former ones (IPOP-aCMA-ES and BIPOP-aCMA-ES); further, NIPOP-aCMA-ES reaches the same performance as BIPOP-aCMA-ES.

The best solution found by NBIPOP-aCMA-ES (footnote 3) improves on the best solution found in 2008, while it is worse than the current best solution. This gap is attributed to the lack of problem-specific heuristics [1, 15], to the possibly insufficient budget of fitness evaluations, and to the lack of appropriate constraint handling heuristics.

Footnote 3: x = [0.83521, 0.45092, 0.50284, 0.65291, 0.61389, 0.75773, 0.43376, 1, 0.89512, 0.77264, 0.11229, 0.20774, 0.018255, 6.2057e-09, 4.0371e-08, 0.2028, 0.36272, 0.32442]; fitness(x) = mass(x) = 1546.5

Figure 3: Comparison of all CMA-ES restart strategies on the Tandem fitness function (mass): median (left) and best (right) values out of 30 runs.

5 Conclusion and Perspectives

The contribution of this paper consists of two new restart strategies for CMA-ES. NIPOP-aCMA-ES is a deterministic strategy that simultaneously increases the population size and decreases the initial step-size of the Gaussian mutation. NBIPOP-aCMA-ES implements a competition between NIPOP-aCMA-ES and a random sampling of the initial mutation step-size, adaptively adjusting the computational budget of each one depending on their current best results. Besides the extensive validation of NIPOP-aCMA-ES and NBIPOP-aCMA-ES on the BBOB benchmark, the generality of these strategies has been tested on a new problem, related to interplanetary spacecraft trajectory planning.

The main limitation of the proposed restart strategies is that they essentially follow a deterministic trajectory in the $(\lambda, \sigma^0)$ hyper-parameter space. Further work will consider $h(\theta)$ as yet another expensive noisy black-box function, and the use of a CMA-ES in the hyper-parameter space will be studied. The critical issue is naturally to keep the overall number of fitness evaluations within reasonable limits. A surrogate-based approach will be investigated [14], learning and exploiting an estimate of the (noisy and stochastic) $h(\theta)$ function.


  • [1] B. Addis, A. Cassioli, M. Locatelli, and F. Schoen. Global optimization for the design of space trajectories. Optimization On Line, 11, 2008.
  • [2] A. Auger and N. Hansen. A restart CMA evolution strategy with increasing population size. In Proceedings of the IEEE Congress on Evolutionary Computation (CEC 2005), pages 1769–1776. IEEE Press, 2005.
  • [3] S. García, D. Molina, M. Lozano, and F. Herrera. A study on the use of non-parametric tests for analyzing the evolutionary algorithms’ behaviour: a case study on the CEC’2005 special session on real parameter optimization. Journal of Heuristics, 15:617–644, 2009.
  • [4] N. Hansen. Compilation of results on the 2005 CEC benchmark function set. Online, May 2006.
  • [5] N. Hansen. Benchmarking a BI-population CMA-ES on the BBOB-2009 function testbed. In F. Rothlauf, editor, GECCO Companion, pages 2389–2396. ACM, 2009.
  • [6] N. Hansen. References to CMA-ES applications. hansen/cmaapplications.pdf, 2009.
  • [7] N. Hansen, A. Auger, S. Finck, and R. Ros. Real-parameter black-box optimization benchmarking 2012: Experimental setup. Technical report, INRIA, 2012.
  • [8] N. Hansen, S. Finck, R. Ros, and A. Auger. Real-parameter black-box optimization benchmarking 2009: Noiseless functions definitions. Technical Report RR-6829, INRIA, 2009. Updated February 2010.
  • [9] N. Hansen and S. Kern. Evaluating the CMA evolution strategy on multimodal test functions. In PPSN’04, pages 282–291, 2004.
  • [10] N. Hansen and A. Ostermeier. Completely derandomized self-adaptation in evolution strategies. Evolutionary Computation, 9(2):159–195, 2001.
  • [11] N. Hansen and R. Ros. Benchmarking a weighted negative covariance matrix update on the BBOB-2010 noiseless testbed. In GECCO ’10: Proceedings of the 12th annual conference companion on Genetic and evolutionary computation, pages 1673–1680, New York, NY, USA, 2010. ACM.
  • [12] G. A. Jastrebski and D. V. Arnold. Improving evolution strategies through active covariance matrix adaptation. In IEEE Congress on Evolutionary Computation – CEC 2006, pages 2814–2821, 2006.
  • [13] I. Loshchilov, M. Schoenauer, and M. Sebag. Black-box Optimization Benchmarking of NIPOP-aCMA-ES and NBIPOP-aCMA-ES on the BBOB-2012 Noiseless Testbed. In GECCO ’2012: Proceedings of the 14th annual conference on Genetic and evolutionary computation, page to appear. ACM, 2012.
  • [14] I. Loshchilov, M. Schoenauer, and M. Sebag. Self-Adaptive Surrogate-Assisted Covariance Matrix Adaptation Evolution Strategy. In GECCO ’2012 Proceedings, page to appear. ACM, 2012.
  • [15] G. Stracquadanio, A. La Ferla, M. De Felice, and G. Nicosia. Design of robust space trajectories. In Research and Development in Intelligent Systems XXVIII, pages 341–354. Springer Verlag, 2011.
  • [16] T. Vinko and D. Izzo. Global Optimisation Heuristics and Test Problems for Preliminary Spacecraft Trajectory Design. Technical Report GOHTPPSTD, European Space Agency, 2008.