Many important real-world problems involve optimizing a submodular function. Such problems include maximum coverage, maximum cut (Goemans and Williamson, 1995), maximum influence (Kempe et al., 2003), sensor placement (Krause et al., 2008; Krause and Guestrin, 2011), as well as many problems in the machine learning domain (Liu et al., 2013; Wei et al., 2015; Lin and Bilmes, 2010, 2011; Stobbe and Krause, 2010). Much work has been done in the area of submodular optimization under static constraints. A particularly well-studied class of algorithms in this line of research is greedy algorithms, which have been shown to be efficient in exploiting submodularity (Cornuejols et al., 1977; Călinescu et al., 2011; Friedrich et al., 2019; Bian et al., 2017). Important recent results on the use of evolutionary algorithms for submodular optimization are summarized in Zhou et al. (2019).
Real-world problems are seldom solved once, but rather many times over some period of time, during which they change. Such changes demand adapting the solutions, which would otherwise become poor or infeasible. The dynamic nature of these problems presents many interesting optimization challenges, which have long been embraced by many researchers. A lot of research in the evolutionary computation literature has addressed these types of problems from an applied perspective. Theoretical investigations have been carried out for evolutionary algorithms on some example functions and on classical combinatorial optimization problems such as shortest paths, but in general the theoretical understanding of complex dynamic problems is rather limited (Roostapour et al., 2018b). In this paper, we follow the approach of analyzing evolutionary algorithms theoretically with respect to their runtime and approximation behavior. This well-established area of research has significantly increased the theoretical understanding of evolutionary computation methods (Neumann and Witt, 2010; Doerr and Neumann, 2020).
Many recent studies on submodular and near-submodular optimization have investigated Pareto optimization approaches based on evolutionary computation. Qian et al. (2017) derive an approximation guarantee for the POMC algorithm for maximizing a monotone function under a monotone cost constraint. They show that POMC achieves its approximation guarantee within at most cubic expected run time. The recent study of Qian et al. (2019) extends these results, via a variant of the GSEMO algorithm (which inspired POMC), to maximizing general submodular functions under a cardinality constraint. Its results reveal that non-monotonicity in objective functions worsens approximation guarantees.
In our work, we extend the existing results for POMC (Roostapour et al., 2019) to partition matroid constraints with dynamic thresholds. We show that the proven adaptation efficiency, facilitated by maintaining dominating populations, extends to multiple constraints with appropriately defined dominance relations. In particular, we prove that POMC can quickly achieve new approximation guarantees whether the constraint thresholds are tightened or relaxed. Additionally, we study POMC experimentally on the dynamic max cut problem and compare its results against those of greedy algorithms run on the underlying static problems. Our study evaluates efficiency in change adaptation, and thus assumes perfect change detection. Our results show that POMC is competitive with GREEDY during unfavorable changes, and outperforms it otherwise.
In the next section, we formulate the problem and introduce the Pareto optimization approach that is subject to our investigations. Then we analyze the algorithm in terms of runtime and approximation behaviour when dealing with dynamic changes. Finally, we present the results of our experimental investigations and finish with some conclusions.
Problem Formulation and Algorithm
In this study, we consider optimization problems where the objective functions are either submodular or monotone. We use the following definition of submodularity Nemhauser et al. (1978).
Given a finite set $V$, a function $f: 2^V \rightarrow \mathbb{R}$ is submodular if it satisfies, for all $A \subseteq B \subseteq V$ and $v \in V \setminus B$,
$$f(A \cup \{v\}) - f(A) \geq f(B \cup \{v\}) - f(B).$$
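To make the definition concrete, the following sketch brute-forces the diminishing-returns condition on a small coverage function (coverage being one of the submodular problem classes mentioned in the introduction). The instance is a made-up example, not one from this paper:

```python
from itertools import combinations

# Hypothetical coverage instance: element v covers the set SUBSETS[v];
# f(X) is the size of the union of the covered sets. Coverage functions
# are a classic example of submodular functions.
SUBSETS = {0: {1, 2}, 1: {2, 3}, 2: {3, 4, 5}, 3: {1, 5}}
V = set(SUBSETS)

def f(X):
    covered = set()
    for v in X:
        covered |= SUBSETS[v]
    return len(covered)

def is_submodular(f, V):
    """Check f(A ∪ {v}) - f(A) >= f(B ∪ {v}) - f(B) for all A ⊆ B ⊆ V, v ∉ B."""
    for r in range(len(V) + 1):
        for B_t in combinations(sorted(V), r):
            B = set(B_t)
            for s in range(len(B) + 1):
                for A_t in combinations(sorted(B), s):
                    A = set(A_t)
                    for v in V - B:
                        if f(A | {v}) - f(A) < f(B | {v}) - f(B):
                            return False
    return True

print(is_submodular(f, V))  # True: coverage satisfies diminishing returns
```

The exhaustive check is exponential in $|V|$ and only meant to illustrate the definition on toy instances.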
As in many relevant works, we are interested in the submodularity ratio, which quantifies how close a function is to being submodular. In particular, we use a simplified version of the definition in Das and Kempe (2011).
For a monotone function $f$, its submodularity ratio with respect to two parameters $U \subseteq V$ and $k \geq 1$ is
$$\gamma_{U,k}(f) = \min_{L, S} \frac{\sum_{x \in S} \left[f(L \cup \{x\}) - f(L)\right]}{f(L \cup S) - f(L)}$$
for $L \subseteq U$ and $S \subseteq V \setminus L$ with $1 \leq |S| \leq k$.
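A brute-force evaluation of this ratio on a small instance can help build intuition. The coverage instance below is hypothetical, and the helper names are ours:

```python
from itertools import combinations

# Small monotone (coverage) example, purely illustrative.
SUBSETS = {0: {1, 2}, 1: {2, 3}, 2: {3, 4, 5}}
V = set(SUBSETS)

def f(X):
    covered = set()
    for v in X:
        covered |= SUBSETS[v]
    return len(covered)

def submodularity_ratio(f, V, U, k):
    """Brute-force gamma_{U,k}(f): minimum over L ⊆ U and non-empty S disjoint
    from L with |S| <= k of the ratio between the sum of singleton marginal
    gains and the joint marginal gain of S on top of L."""
    ratios = []
    for r in range(len(U) + 1):
        for L_t in combinations(sorted(U), r):
            L = set(L_t)
            rest = sorted(set(V) - L)
            for s in range(1, k + 1):
                for S in combinations(rest, s):
                    joint = f(L | set(S)) - f(L)
                    if joint > 0:  # the ratio is only constrained where the joint gain is positive
                        ratios.append(sum(f(L | {x}) - f(L) for x in S) / joint)
    return min(ratios) if ratios else 1.0

print(submodularity_ratio(f, V, V, 2))  # 1.0, as expected for a submodular function
```

For a submodular function the singleton gains always cover the joint gain, and singleton sets $S$ attain the ratio exactly, so the minimum is 1.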
It can be seen that $\gamma_{U,k}(f)$ is non-negative, non-increasing with increasing $U$ and $k$, and that $f$ is submodular iff $\gamma_{U,k}(f) = 1$ for all $U$ and $k$. This ratio also indicates the intensity of the function’s diminishing-return effect. Additionally, non-monotonicity is also known to affect the worst-case performance of algorithms (Friedrich et al., 2019; Qian et al., 2019). As such, we also use the objective function’s monotonicity approximation term, defined similarly to Krause et al. (2008) but only for subsets of a certain size.
For a function $f$, its monotonicity approximation term with respect to a parameter $j$ is
$$\epsilon_j = \max\left\{0,\; \max_{X, v} \left[f(X) - f(X \cup \{v\})\right]\right\}$$
for $v \in V$ and $X \subseteq V$ with $|X| < j$.
It is the case that $\epsilon_j$ is non-negative, non-decreasing with increasing $j$, and that $f$ is monotone iff $\epsilon_j = 0$ for all $j$. We find that adding the size parameter can provide extra insight into the analysis results.
Consider the static optimization problem with partition matroid constraints.
Given a set function $f: 2^V \rightarrow \mathbb{R}_{\geq 0}$, a partitioning of $V$ into disjoint blocks $B_1, \ldots, B_k$, and a set of integer thresholds $d_1, \ldots, d_k$, the problem is
$$\max_{X \subseteq V} f(X) \quad \text{subject to} \quad |X \cap B_i| \leq d_i \text{ for all } i = 1, \ldots, k.$$
We write $n = |V|$ and $d = \sum_{i=1}^k d_i$, and let $X^*$ denote the feasible optimal solution. A solution is feasible iff it satisfies all $k$ constraints. It can be shown that any solution $X$ with $|X| \leq \min_i d_i$ is feasible. Each instance is then uniquely defined by the triplet $(f, \{B_i\}, \{d_i\})$. Without loss of generality, we assume $d_i \leq |B_i|$ for all $i$.
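As an illustration of feasibility under partition matroid constraints, a minimal sketch (the instance and names are hypothetical):

```python
# Hypothetical instance: V = {0,...,5} partitioned into two blocks with thresholds (1, 2).
partition = [{0, 1, 2}, {3, 4, 5}]
thresholds = (1, 2)

def is_feasible(X, partition, thresholds):
    """X is feasible iff it contains at most d_i elements from each block B_i."""
    return all(len(X & B) <= d for B, d in zip(partition, thresholds))

print(is_feasible({0, 3, 4}, partition, thresholds))  # True: one from B1, two from B2
print(is_feasible({0, 1, 3}, partition, thresholds))  # False: two from B1 exceeds d1 = 1
```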
We study the dynamic version of the problem in Definition 4. This dynamic problem demands adapting the solutions to changing constraints whenever such changes occur.
Given the problem in Definition 4, a dynamic problem instance is defined by a sequence of changes where the current in each change is replaced by such that for . The problem is to generate a solution that maximizes for each newly given such that
Such problems involve changing constraint thresholds over time. Using the oracle model, we assume time progresses whenever a solution is evaluated. We define the corresponding notations and the new optimal solution analogously, and again assume the new thresholds do not exceed the block sizes. Lastly, while restarting from scratch for each new set of thresholds is a viable tactic for any static problem solver, we focus on the capability of the algorithm to adapt to such changes.
The POMC algorithm (Qian et al., 2017) is a Pareto optimization approach for constrained optimization. It is also known as the GSEMO algorithm in the evolutionary computation literature (Laumanns et al., 2004; Friedrich et al., 2010; Friedrich and Neumann, 2015). As with many other evolutionary algorithms, the binary representation of set solutions is used. For this algorithm, we reformulate the problem as a bi-objective optimization problem given as
$$\text{maximize } \left(f_1(x), f_2(x)\right), \quad f_1(x) = \begin{cases} f(x), & x \text{ is feasible} \\ -\infty, & \text{otherwise} \end{cases}, \quad f_2(x) = \sum_{i=1}^k \left(d_i - |x \cap B_i|\right).$$
POMC optimizes the two objectives simultaneously, using the dominance relation between solutions, which is common in Pareto optimization approaches. Recall that a solution $x$ weakly dominates $y$ ($x \succeq y$) iff it is at least as good as $y$ in both objectives. The dominance relation is strict ($x \succ y$) iff, additionally, $x$ is strictly better in at least one objective. Intuitively, the dominance relation formalizes the notion of a “better” solution in multi-objective contexts. Solutions that are not dominated by any other represent trade-offs between the objectives to be optimized.
The second objective in POMC is typically formulated to promote solutions that are “further” from being infeasible. The intuition is that such solutions leave more room for feasible modification, and thus have more potential to become very good solutions. For the problem of interest, one way of measuring the “distance to infeasibility” of a solution $X$ is to count the number of elements that can still be added to $X$ before it becomes infeasible; this value is $\sum_{i=1}^k (d_i - |X \cap B_i|)$, which is the same as $d - |X|$ in practice. Another way is to count the minimum number of elements that need to be added to $X$ before it becomes infeasible. The former approach is chosen for simplicity and for viability under weaker assumptions about the considered problem.
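The two candidate measures can be sketched as follows, assuming the partition matroid setting of the problem formulation; the instance and helper names are hypothetical:

```python
# Hypothetical instance: two blocks with thresholds (1, 2).
partition = [{0, 1, 2}, {3, 4, 5}]
thresholds = (1, 2)

def slack(X, partition, thresholds):
    """Former measure: how many elements can still be added to a feasible X
    before some constraint is violated, i.e. sum_i (d_i - |X ∩ B_i|).
    For feasible X this equals d - |X|."""
    return sum(d - len(X & B) for B, d in zip(partition, thresholds))

def min_slack(X, partition, thresholds):
    """Remaining capacity of the tightest block: min_i (d_i - |X ∩ B_i|).
    The latter measure in the text (minimum additions until infeasibility)
    is this value plus one."""
    return min(d - len(X & B) for B, d in zip(partition, thresholds))

X = {0, 3}
print(slack(X, partition, thresholds))      # 1 = (1 - 1) + (2 - 1)
print(min_slack(X, partition, thresholds))  # 0: the first block is already full
```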
On the other hand, the first objective presents the canonical evolutionary pressure based on objective values. Additionally, it discourages all infeasible solutions, which differs from the formulation in Qian et al. (2017) that allows some degree of infeasibility. The reason is that there can be infeasible solutions with very high objective values; such a solution can dominate many good feasible solutions, and may prevent acceptance of globally optimal solutions into the population. Furthermore, restricting the population to feasible solutions decreases the maximum population size, which can improve convergence performance. As a consequence, the population size of POMC under our formulation is at most $d + 1$. Our formulation of the two objective functions is identical to the one in Friedrich and Neumann (2015) when $k = 1$.
POMC (see Algorithm 1) starts with an initial population consisting of the search point $0^n$, which represents the empty set. In each iteration, a new solution is generated by random parent selection and bit-flip mutation. Then the elitist survivor selection mechanism removes dominated solutions from the population, effectively maintaining a set of trade-off solutions for the given objectives. The algorithm terminates when the number of iterations reaches some predetermined limit. We choose the empty set as the initial solution, similar to Qian et al. (2017) and different from Friedrich and Neumann (2015), to simplify the analysis and stabilize theoretical performance. Note that POMC calls the oracle once per iteration to evaluate the new solution, so its run time is identical to the number of iterations.
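The described loop can be sketched as follows. This is a minimal illustrative implementation, not the experimental code; the toy instance and the exact tie-breaking in survivor selection are our assumptions:

```python
import random

def pomc(f, n, feasible, slack, iterations, seed=1):
    """Minimal POMC/GSEMO sketch for the bi-objective formulation in the text:
    the first objective is f for feasible solutions and -infinity otherwise,
    the second objective is the remaining constraint slack. The run starts
    from the all-zeros bit string (the empty set)."""
    rng = random.Random(seed)

    def evaluate(x):
        s = {i for i in range(n) if x[i]}
        return (f(s) if feasible(s) else float("-inf"), slack(s))

    pop = {(0,) * n: evaluate((0,) * n)}
    for _ in range(iterations):
        parent = rng.choice(list(pop))
        child = tuple(b ^ (rng.random() < 1.0 / n) for b in parent)  # flip each bit w.p. 1/n
        g = evaluate(child)
        # Elitist survivor selection: accept the child iff nothing in the
        # population weakly dominates it, then drop everything it weakly dominates.
        if g[0] > float("-inf") and not any(q[0] >= g[0] and q[1] >= g[1] for q in pop.values()):
            pop = {x: q for x, q in pop.items() if not (g[0] >= q[0] and g[1] >= q[1])}
            pop[child] = g
    return pop

# Toy run (hypothetical instance): f(X) = |X| under blocks {0,1}, {2,3} with thresholds (1, 1).
blocks, d = [{0, 1}, {2, 3}], (1, 1)
feasible = lambda s: all(len(s & B) <= t for B, t in zip(blocks, d))
slack_fn = lambda s: sum(t - len(s & B) for B, t in zip(blocks, d))
pop = pomc(len, 4, feasible, slack_fn, iterations=3000)
print(max(g[0] for g in pop.values()))  # best feasible value found
```

The returned population holds one trade-off solution per attained second-objective value, matching the population bound discussed above.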
We assume that changes are made known to the algorithm as they occur, and that feasibility can be checked efficiently. The reason for this is that infeasibility induced by changes in multiple thresholds has a nontrivial impact on the algorithm’s behaviour. For single-constraint problems, this impact is limited to increases in population size (Roostapour et al., 2019), since a solution’s degree of feasibility correlates entirely with its second objective value. However, this is no longer the case for multiple constraints, as the second objective aggregates all constraints. While this aggregation reduces the population size, it also allows for situations where solutions of small size (high in the second objective) become infeasible after a change and thus, without updated evaluations, dominate other feasible solutions of greater cardinality. This can be circumvented by assuming that the changes’ directions are the same every time, i.e., all thresholds move in the same direction in each change. Instead of imposing such assumptions on the problems, we only assume scenarios where POMC successfully detects and responds to changes. This allows us to focus our analysis entirely on the algorithm’s adaptation efficiency under arbitrary constraint threshold change scenarios.
For the runtime analysis of POMC for static problems, we refer to the results by Do and Neumann (2020). In short, its worst-case approximation ratios on static problems are comparable to those of the classical GREEDY algorithm, assuming submodularity and weak monotonicity in the objective function Friedrich et al. (2019). On the other hand, a direct comparison in other cases is not straightforward as the bounds involve different sets of parameters.
The strength of POMC in dynamic constraint handling lies in the fact that it stores a good solution for each cardinality level up to the constraint bound. In this way, when the thresholds change, the population will contain good solutions for re-optimization. We use the concept of the greedy addition to a solution, i.e., the feasible element insertion with the largest marginal gain.
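A greedy addition can be sketched as the feasible element with the largest marginal gain; the helper below and its coverage instance are illustrative assumptions, and repeating the step until no element fits yields a GREEDY-style algorithm:

```python
def greedy_addition(f, X, V, feasible):
    """Greedy addition to X: among elements whose inclusion keeps X feasible,
    return the one with maximum marginal gain (None if no element fits)."""
    candidates = [v for v in sorted(V - X) if feasible(X | {v})]
    if not candidates:
        return None
    return max(candidates, key=lambda v: f(X | {v}) - f(X))

# Hypothetical coverage example under a simple cardinality bound.
SUBSETS = {0: {1, 2}, 1: {2, 3}, 2: {3, 4, 5}, 3: {1, 5}}
V = set(SUBSETS)
cover = lambda X: len(set().union(*(SUBSETS[v] for v in X))) if X else 0
print(greedy_addition(cover, set(), V, lambda X: len(X) <= 2))  # 2: largest gain (3 new elements)
```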
It can be shown that for any where , the corresponding greedy addition w.r.t. is the same as the one w.r.t. if , since is still feasible for all . Thus, we can derive the following result from Lemma 2 in Qian et al. (2019), and Lemma 1 in Do and Neumann (2020).
Let be a submodular function, , and be defined in Definition 3, for all such that , we have both
The first part is from Lemma 1 in Do and Neumann (2020), while the second follows since if , then the element contributing the greedy marginal gain to is unchanged. ∎
The result carries over since the required assumption is satisfied. Using these inequalities, we can construct a proof following a similar strategy to the one for Theorem 5 in Roostapour et al. (2019), which leads to the following theorem.
For the problem of maximizing a submodular function under partition matroid constraints, assuming , POMC generates a population in expected run time such that
Let be the expression
and be the expression . We have that holds. For each , assume holds and does not hold at some iteration . Let be the solution such that holds, and holds at iteration for any solution such that . This means once holds, it must hold in all subsequent iterations of POMC. Let , Lemma 1 implies
The second inequality uses . This means that holds. Therefore, if is generated, then holds, regardless of whether is dominated afterwards. According to the bit flip procedure, is generated with probability at least which implies
Using the additive drift theorem Lengler (2017), the expected number of iterations until holds is at most . This completes the proof. ∎
The statement (1) implies the following results.
We did not put this more elegant form directly in Theorem 1 since it cannot be used in the subsequent proof; only Expression (1) is applicable. Note that the result also holds for if we change the quantifier to . It implies that when a change such that occurs after cubic run time, POMC is likely to instantly satisfy the new approximation ratio bound, which would have taken it extra cubic run time to achieve if restarted. Therefore, it adapts well in such cases, assuming sufficient run time is allowed between changes. On the other hand, if , the magnitude of the increase affects the difficulty with which the ratio can be maintained. The result also states a ratio bound w.r.t. the new optimum corresponding to the new constraint thresholds. As we will show using this statement, by keeping the current population (while discarding infeasible solutions), POMC can adapt to the new optimum quicker than it can with the restart strategy.
Assuming POMC achieves a population satisfying (1), after the change where , POMC generates in expected time a solution such that
Let be the expression
Hence, holds. It is shown that such a solution is generated by POMC with probability at least . Also, for any solution satisfying , another solution must also satisfy . Therefore, the Additive Drift Theorem implies that given holds for some in the population, POMC generates a solution satisfying in expected time at most . Such a solution satisfies the inequality in the theorem. ∎
The degree to which the thresholds increase contributes only linearly to the run time bound, which shows efficient adaptation to the expanded search space.
For the cases of monotone objective functions, we assume without loss of generality that $f$ is normalized ($f(\emptyset) = 0$). We make use of the following inequalities, derived from Lemma 1 in Qian et al. (2016) and Lemma 2 in Do and Neumann (2020), using the same insight as before.
Let be a monotone function, , and be defined in Definition 2, for all such that , we have both
The first part is from Lemma 2 in Do and Neumann (2020), while the second follows since if , then the element contributing the greedy marginal gain to is unchanged. ∎
This leads to the following result, which shows POMC’s capability of adapting to threshold changes.
For the problem of maximizing a monotone function under partition matroid constraints, assuming , POMC generates a population in expected run time such that
Just like Theorem 1, this result also holds under the changed quantifier. It implies that once such a population is obtained, the new approximation ratio bound is immediately satisfied when the thresholds decrease. Once again, we can derive the result on POMC’s adaptation performance by using Theorem 3.
Assuming POMC achieves a population satisfying (2), after the change where , POMC generates in expected time a solution such that
Let be the expression
So holds. Following the same reasoning as in the proof of Theorem 2, we get that POMC, given a solution in the population satisfying , generates another solution satisfying in expected run time at most . ∎
Theorem 2 and Theorem 4 state that, after POMC is run for at least some expected number of iterations, there is at least one solution in the population that satisfies the respective approximation ratio bound in each case. However, the proofs of Theorem 1 and Theorem 3 also imply that the same process generates, for each cardinality level up to the constraint bound, a solution that satisfies the respective approximation ratio bound adjusted to the appropriate cardinality term. These results imply that, instead of restarting, maintaining the non-dominated population provides a better starting point for recovering relative approximation quality in any case. This suggests that the inherent diversity in Pareto-optimal fronts is a suitable preparation for changes in constraint thresholds.
We compare the POMC algorithm against GREEDY (Friedrich et al., 2019) with the restart approach on undirected max cut problems with random graphs. Given a graph $G = (V, E)$ with a non-negative edge weight function $w$, the goal is to find, subject to changing partition matroid constraints, a vertex subset $X$ maximizing the cut value
$$f(X) = \sum_{\{u, v\} \in E :\, |\{u, v\} \cap X| = 1} w(\{u, v\}).$$
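The cut objective can be computed directly from the weighted edge list; the small triangle instance below is a made-up illustration:

```python
def cut_value(S, weights):
    """f(S): total weight of edges with exactly one endpoint in S. The cut
    function is symmetric (f(S) = f(V \\ S)) and submodular, but not monotone."""
    return sum(w for (u, v), w in weights.items() if (u in S) != (v in S))

# Hypothetical triangle graph with weighted edges.
weights = {(0, 1): 1.0, (1, 2): 2.0, (0, 2): 1.5}
print(cut_value({1}, weights))     # 3.0: edges (0,1) and (1,2) cross the cut
print(cut_value({0, 2}, weights))  # 3.0: symmetry, f(S) = f(V \ S)
```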
Weighted graphs are generated for the experiments based on two parameters: number of vertices () and edge density. Here, we consider , and 5 density values: 0.01, 0.02, 0.05, 0.1, 0.2. For each - pair, a different weighted graph is randomly generated with the following procedure:
Randomly sample from without replacement, until .
Assign to a uniformly random value in for each .
Assign for all .
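The generation steps above can be sketched as follows; where the text elides specifics, the sketch makes labeled assumptions (the uniform weight range $[0, 1]$ in particular), and non-sampled pairs are treated as weight-zero non-edges:

```python
import random

def random_weighted_graph(n, density, seed=0):
    """Hedged sketch of the described generator: sample vertex pairs uniformly
    without replacement until the target edge count density * n * (n - 1) / 2
    is reached, then give each sampled edge an independent uniform weight.
    The weight range [0, 1) is an assumption; all other pairs get weight 0."""
    rng = random.Random(seed)
    target = round(density * n * (n - 1) / 2)
    pairs = [(u, v) for u in range(n) for v in range(u + 1, n)]
    return {e: rng.random() for e in rng.sample(pairs, target)}

g = random_weighted_graph(200, 0.05)
print(len(g))  # 0.05 * 200 * 199 / 2 = 995 edges
```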
To limit the variable dimensions so that the results are easier to interpret, we only use one randomly chosen graph for each -density pair. We also consider different numbers of partitions : 1, 2, 5, 10. For , each element is assigned to a partition randomly. Also, the sizes of the partitions are equal, as with the corresponding constraint thresholds.
The dynamic component is the thresholds. We use the approach outlined in Roostapour et al. (2018a) to generate threshold changes in the form of a sequence of values. In particular, these values are applied to each instance at each change, rounded to the nearest integer. The generating formulas are as follows:
The values used in the experiments are displayed in Figure 1 where the number of changes is .
For each change, the output of GREEDY is obtained by running it without restriction on evaluations. These values are used as baselines against which POMC’s outputs are compared. POMC is run with three different settings for the number of evaluations between changes. Smaller numbers imply a higher change frequency, leading to a higher difficulty in adapting to changes. Furthermore, POMC is run 30 times for each setting, and the means and standard deviations of the results are reported for the three change frequency settings.
A statistical test with a 95% confidence interval is used to determine statistical significance at each change. The results show that the outputs from both algorithms are very closely matched most of the time, with the greatest differences observed at low graph densities. Furthermore, we expect GREEDY to fare better against POMC when the search space is small (low threshold levels). While this is observable, the opposite phenomenon at high threshold levels can be seen more easily.
We see that during consecutive periods of high constraint thresholds, POMC’s outputs initially fall behind GREEDY’s, only to overtake them in later changes. This suggests that POMC rarely compromises its best solutions during those periods, as a consequence of the symmetric submodular objective function. It also implies that restarting POMC from scratch upon each change would have resulted in significantly poorer results. On the other hand, POMC’s best solutions follow GREEDY’s closely during periods of low constraint thresholds. This indicates that by maintaining feasible solutions upon changes, POMC keeps up with GREEDY’s best objective values well within quadratic run time.
Comparing outputs from POMC with different interval settings, we see that those from runs with more evaluations between changes are always better. However, the differences are minimal during the low-threshold periods. This aligns with our theoretical results, in the sense that the expected number of evaluations needed to guarantee good approximations depends on the constraint thresholds. As such, additional evaluations do not yield significant improvements within such small feasible spaces.
Comparing different numbers of partitions, POMC seems to be at a disadvantage against GREEDY’s best as the number of partitions increases. This is expected since more partitions lead to more restrictive feasible search spaces, everything else being unchanged, and small feasible spaces amplify the benefit of each greedy step. Nevertheless, POMC does not seem to fall significantly behind GREEDY for any long period, even when given few resources.
In this study, we have considered combinatorial problems with dynamic constraint thresholds, particularly the important classes of problems where the objective functions are submodular or monotone. We have contributed to the theoretical run time analysis of a Pareto optimization approach on such problems. Our results indicate POMC’s capability of maintaining populations to efficiently adapt to changes and preserve good approximations. In our experiments, we have shown that POMC is able to maintain at least the greedy level of quality and often even obtains better solutions.
This work has been supported by the Australian Research Council through grants DP160102401 and DP190103894.
- Guarantees for greedy maximization of non-submodular functions with applications. In Proceedings of the 34th International Conference on Machine Learning, ICML'17, pp. 498–507.
- Maximizing a monotone submodular function subject to a matroid constraint. SIAM Journal on Computing 40 (6), pp. 1740–1766.
- Nonparametric statistics for non-statisticians: a step-by-step approach. Wiley.
- Location of bank accounts to optimize float: an analytic study of exact and approximate algorithms. Management Science 23 (8), pp. 789–810.
- Submodular meets spectral: greedy algorithms for subset selection, sparse approximation and dictionary selection. In Proceedings of the 28th International Conference on Machine Learning, ICML'11, pp. 1057–1064.
- Maximizing submodular or monotone functions under partition matroid constraints by multi-objective evolutionary algorithms. In Parallel Problem Solving from Nature – PPSN XVI, pp. 588–603.
- Theory of evolutionary computation – recent developments in discrete optimization. Natural Computing Series, Springer.
- Greedy maximization of functions with bounded curvature under partition matroid constraints. In Proceedings of the AAAI Conference on Artificial Intelligence 33, pp. 2272–2279.
- Approximating covering problems by randomized search heuristics using multi-objective models. Evolutionary Computation 18 (4), pp. 617–633.
- Maximizing submodular functions under matroid constraints by evolutionary algorithms. Evolutionary Computation 23 (4), pp. 543–558.
- Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming. Journal of the ACM 42 (6), pp. 1115–1145.
- Maximizing the spread of influence through a social network. In Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '03, pp. 137–146.
- Submodularity and its applications in optimized information gathering. ACM Transactions on Intelligent Systems and Technology 2 (4), pp. 32:1–32:20.
- Near-optimal sensor placements in Gaussian processes: theory, efficient algorithms and empirical studies. Journal of Machine Learning Research 9, pp. 235–284.
- Running time analysis of multiobjective evolutionary algorithms on pseudo-Boolean functions. IEEE Transactions on Evolutionary Computation 8 (2), pp. 170–182.
- Drift analysis. CoRR abs/1712.00964.
- Multi-document summarization via budgeted maximization of submodular functions. In Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, HLT '10, pp. 912–920.
- A class of submodular functions for document summarization. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, HLT '11, pp. 510–520.
- Submodular feature selection for high-dimensional acoustic score spaces. In 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 7184–7188.
- An analysis of approximations for maximizing submodular set functions—I. Mathematical Programming 14 (1), pp. 265–294.
- Bioinspired computation in combinatorial optimization. Natural Computing Series, Springer.
- Parallel Pareto optimization for subset selection. In Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, IJCAI'16, pp. 1939–1945.
- On subset selection with general cost constraints. In Proceedings of the 26th International Joint Conference on Artificial Intelligence, IJCAI'17, pp. 2613–2619.
- Maximizing submodular or monotone approximately submodular functions by multi-objective evolutionary algorithms. Artificial Intelligence 275, pp. 279–294.
- Pareto optimization for subset selection with dynamic cost constraints. In Proceedings of the AAAI Conference on Artificial Intelligence 33, pp. 2354–2361.
- On the performance of baseline evolutionary algorithms on the dynamic knapsack problem. In Parallel Problem Solving from Nature – PPSN XV, pp. 158–169.
- Analysis of evolutionary algorithms in dynamic and stochastic environments. CoRR abs/1806.08547.
- Efficient minimization of decomposable submodular functions. In Proceedings of the 23rd International Conference on Neural Information Processing Systems, NIPS '10, pp. 2208–2216.
- Submodularity in data subset selection and active learning. In Proceedings of the 32nd International Conference on Machine Learning, ICML '15, pp. 1954–1963.
- Evolutionary learning: advances in theories and algorithms. Springer.