Important results have been obtained for a wide range of benchmark functions as well as for important combinatorial optimization problems. This includes a wide range of evolutionary computing methods in deterministic, stochastic and dynamic settings. We refer the reader to for a presentation of important recent research results.
Many important real-world problems can be stated in terms of optimizing a submodular function, and the analysis of evolutionary algorithms using multi-objective formulations has shown that in many cases they obtain the best possible performance guarantee (unless P=NP). Important recent results on the use of evolutionary algorithms for submodular optimization are summarized in . The goal of this paper is to expand the investigation of evolutionary multi-objective optimization for submodular optimization. While previous investigations mainly concentrated on monotone submodular functions with a single constraint, we consider non-monotone submodular functions with a set of constraints.
1.1 Related work
Submodular functions are considered the discrete counterparts of convex functions. Submodularity captures the notion of diminishing marginal return, and is present in many important problems. While minimizing submodular functions can be done using a polynomial time combinatorial algorithm, submodular maximization encompasses many NP-hard combinatorial problems such as maximum coverage, maximum cut, maximum influence, and the sensor placement problem [24, 23]. It is also applied to many problems in the machine learning domain [30, 41, 29, 28, 38]. Considering the role of evolutionary algorithms in difficult optimization problems, we focus on submodular maximization.
Realistic optimization problems often impose constraints on the solutions. In applications of submodular maximization, matroid and knapsack constraints are among the most common. In this work, we consider submodular maximization under partition matroid constraints, which are a generalization of cardinality constraints. This type of constraint has been considered in a variety of applications [21, 7, 14].
A greedy algorithm has been shown to achieve a $1/2$-approximation ratio in maximizing monotone submodular functions under partition matroid constraints. It was later proven that $(1-1/e)$ is the best approximation ratio a polynomial time algorithm can guarantee. A more recent study proposed a randomized algorithm that achieves this ratio. Another study analyzes derandomizing search heuristics, leading to a deterministic $0.5008$-approximation ratio.
Additionally, more nuanced results have been reached when limiting objective functions to those with a finite rate of marginal change, quantified by the curvature $\alpha$ as defined in . The results in [8, 40] indicate that a $\frac{1}{\alpha}(1-e^{-\alpha})$-approximation ratio is achievable by the continuous greedy algorithm in maximizing monotone submodular functions under a matroid constraint. A more recent study proved a $\frac{1}{\alpha}(1-e^{-\alpha\gamma})$-approximation ratio for the deterministic greedy algorithm in maximizing functions with submodularity ratio $\gamma$, under a cardinality constraint.
These results rely on the assumption of monotonicity of the objective functions, $f(X) \leq f(Y)$ for all $X \subseteq Y \subseteq V$, which does not hold in many applications of submodular maximization. A study derives approximation guarantees for the GSEMO algorithm in maximizing monotone and symmetric submodular functions under a matroid constraint, which suggest that non-monotone functions are harder to maximize. This is supported by another result for a greedy algorithm in maximizing general submodular functions under partition matroid constraints. A recent study extends the results for a GSEMO variant to the problem of maximizing general submodular functions, but under a cardinality constraint.
1.2 Our contribution
In this work, we contribute to the theoretical analysis of GSEMO by generalizing previous results to partition matroid constraints. First, we provide an approximation guarantee for GSEMO in maximizing general submodular functions under partition matroid constraints (Theorem 1). Second, we derive another result for monotone and not necessarily submodular functions, under the same constraints (Theorem 2), to account for other important types of functions such as subadditive functions. Subadditivity encompasses submodularity, and is defined by the property that the whole is no greater than the sum of its parts. Subadditive functions are commonly used to model item evaluations and social welfare in combinatorial auctions [2, 3, 39, 15]. Our results extend the existing ones with more refined bounds.
We investigate GSEMO’s performance against GREEDY’s in maximizing undirected cuts in random graphs under varying cardinality constraints and partition matroid constraints. Graph cut functions with respect to vertex sets are known to be submodular and non-monotone. In particular, they are also symmetric for undirected graphs. Our results suggest that GSEMO typically requires more evaluations to reach GREEDY’s output quality. Nonetheless, GSEMO surpasses GREEDY shortly after the latter stops improving, indicating the former’s capacity for exploring the search space. Predictably, GSEMO outperforms GREEDY more reliably in larger search spaces.
2 Preliminaries
In this section, we provide a formal definition of the problem and some of its parameters relevant to our analyses. We also describe the simple GREEDY algorithm and the GSEMO algorithm considered in this work.
2.1 Problem definition
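We consider the following problem. Let $f: 2^V \to \mathbb{R}_{\geq 0}$ be a non-negative function over a set $V$ of size $n$, $B = \{B_i\}_{i=1,\dots,k}$ be a partition of $V$ for some $k \leq n$, and $D = \{d_i\}_{i=1,\dots,k}$ be integers such that $1 \leq d_i \leq |B_i|$ for all $i$. The problem is finding $X \subseteq V$ maximizing $f(X)$, subject to
$$|X \cap B_i| \leq d_i, \quad \forall i = 1, \dots, k.$$
These constraints are referred to as partition matroid constraints, which are equivalent to a cardinality constraint if $k = 1$. The objective function of interest is submodular, meaning it satisfies the following property.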
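Definition 1. A function $f: 2^V \to \mathbb{R}$ is submodular if $f(X \cup \{v\}) - f(X) \geq f(Y \cup \{v\}) - f(Y)$ for all $X \subseteq Y \subseteq V$ and $v \in V \setminus Y$.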
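The diminishing-returns property can be checked by brute force on very small ground sets. The following is a minimal sketch for illustration only; the helper names (`powerset`, `is_submodular`) and the two toy functions are hypothetical and not part of the paper.

```python
from itertools import combinations


def powerset(items):
    """Yield every subset of `items` as a frozenset."""
    items = list(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def is_submodular(f, ground_set, tol=1e-12):
    """Brute-force check of f(X+v) - f(X) >= f(Y+v) - f(Y) for all X <= Y, v not in Y.

    Exponential in the size of the ground set, so only usable for tiny examples.
    """
    ground = frozenset(ground_set)
    subsets = list(powerset(ground))
    for X in subsets:
        for Y in subsets:
            if not X <= Y:
                continue
            for v in ground - Y:
                gain_x = f(X | {v}) - f(X)
                gain_y = f(Y | {v}) - f(Y)
                if gain_x < gain_y - tol:
                    return False
    return True


# Hypothetical toy data: element i covers the items in COVER[i].
COVER = {0: {"a", "b"}, 1: {"b", "c"}, 2: {"c", "d"}, 3: {"a", "d"}}


def coverage(X):
    """Number of items covered by the chosen elements (a submodular function)."""
    covered = set()
    for v in X:
        covered |= COVER[v]
    return len(covered)


def squared_size(X):
    """|X|^2 has increasing marginal gains, so it is not submodular."""
    return len(X) ** 2


print(is_submodular(coverage, COVER))          # coverage functions pass the check
print(is_submodular(squared_size, range(3)))   # squared size fails it
```

The check is exhaustive, which keeps it obviously correct at the cost of being impractical beyond a handful of elements.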
We can assume that $f$ is not non-increasing, and for monotone $f$, we can assume $f(\emptyset) = 0$. To perform analysis, we define the function’s monotonicity approximation term similar to , but only for subsets of a certain size
Definition 2. For a function , its monotonicity approximation term with respect to a parameter is
for and .
It is clear that is non-negative, non-decreasing with increasing , and is monotone iff . Additionally, for monotone non-submodular , we use submodularity ratio which quantifies how close is to being modular. In particular, we simplify the definition  which measures the severity of the diminishing return effect.
Definition 3. For a monotone function , its submodularity ratio with respect to two parameters , is
for and .
It can be seen that is non-negative, non-increasing with increasing and , and is submodular iff for all .
For the purpose of analysis, we denote , , and the optimal solution; we have and . We evaluate the algorithm’s performance via where is the algorithm’s output. Furthermore, we use the black-box oracle model to evaluate run time, hence our results are based on numbers of oracle calls.
2.2 Algorithms descriptions
A popular baseline method for solving hard problems is the greedy heuristic. A simple deterministic GREEDY variant has been studied for this problem . It starts with the empty solution, and in each iteration adds the feasible remaining element that increases the objective value the most. It terminates when there are no remaining feasible elements that yield positive gains. This algorithm extends the GREEDY algorithms in to partition matroid constraints. Note that at iteration , GREEDY calls the oracle times, so its run time is . According to , it achieves approximation ratio when $f$ is submodular, and approximation ratio when $f$ is non-decreasing subadditive, with being the curvature of $f$.
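The GREEDY procedure described above can be sketched as follows. This is an illustrative reading of the prose, not the authors' implementation: the representation of the partition as a `block_of` mapping, the `limits` dictionary, the tie-breaking by element order, and the small path-graph example are all assumptions.

```python
def is_feasible(X, block_of, limits):
    """Check the partition matroid constraints: at most limits[b] elements per block b."""
    counts = {}
    for v in X:
        b = block_of[v]
        counts[b] = counts.get(b, 0) + 1
        if counts[b] > limits[b]:
            return False
    return True


def greedy(f, ground_set, block_of, limits):
    """Add the feasible element with the largest positive gain until none remains."""
    X = frozenset()
    value = f(X)
    while True:
        best_v, best_gain = None, 0
        # Sorted iteration makes tie-breaking deterministic (an assumption).
        for v in sorted(ground_set - X):
            candidate = X | {v}
            if not is_feasible(candidate, block_of, limits):
                continue
            gain = f(candidate) - value  # one oracle call per candidate
            if gain > best_gain:
                best_v, best_gain = v, gain
        if best_v is None:  # no feasible element yields a positive gain
            return X, value
        X = X | {best_v}
        value += best_gain


# Hypothetical example: unweighted cut function on the path 0-1-2-3.
EDGES = [(0, 1), (1, 2), (2, 3)]


def cut(X):
    """Number of edges with exactly one endpoint in X."""
    return sum(1 for a, b in EDGES if (a in X) != (b in X))


block_of = {0: 0, 1: 0, 2: 1, 3: 1}  # two blocks: {0, 1} and {2, 3}
limits = {0: 1, 1: 1}                # at most one element from each block
solution, value = greedy(cut, set(range(4)), block_of, limits)
print(sorted(solution), value)
```

On this toy instance GREEDY first picks vertex 1 (gain 2), then vertex 3 (gain 1), and stops because both blocks are full.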
On the other hand, GSEMO [25, 16, 17], also known as POMC , is a well-known simple Pareto optimization approach for constrained single-objective optimization problems. It has been shown to outperform the generalized greedy algorithm in overcoming local optima . To use GSEMO with partition matroid constraints, the problem is reformulated as a bi-objective problem
where , .
GSEMO optimizes two objectives simultaneously, using the dominance relation between solutions, which is common in Pareto optimization approaches. Writing $f_1$ and $f_2$ for the two objectives, solution $X_1$ dominates $X_2$ ($X_1 \succeq X_2$) iff $f_1(X_1) \geq f_1(X_2)$ and $f_2(X_1) \geq f_2(X_2)$. The dominance relation is strict ($X_1 \succ X_2$) iff, in addition, $f_1(X_1) > f_1(X_2)$ or $f_2(X_1) > f_2(X_2)$. Intuitively, the dominance relation formalizes the notion of a “better” solution in multi-objective contexts. Solutions that do not dominate one another present a trade-off between the objectives to be optimized.
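As a small illustration of the dominance relation, the sketch below compares objective vectors that are all to be maximized. The function names and the sample points are hypothetical; strict dominance is taken here in the usual sense of weak dominance plus a strict improvement in at least one objective.

```python
def weakly_dominates(a, b):
    """a is at least as good as b in every objective (all maximized)."""
    return all(x >= y for x, y in zip(a, b))


def strictly_dominates(a, b):
    """a weakly dominates b and is strictly better in at least one objective."""
    return weakly_dominates(a, b) and any(x > y for x, y in zip(a, b))


def non_dominated(points):
    """Keep the points that no other point strictly dominates."""
    return [p for p in points
            if not any(strictly_dominates(q, p) for q in points)]


# Hypothetical objective vectors (objective value, negated solution size).
points = [(0, 0), (2, -1), (1, -1), (3, -2), (3, -3)]
front = non_dominated(points)
print(front)  # [(0, 0), (2, -1), (3, -2)]
```

The three surviving points are mutually incomparable: each trades a higher first objective for a lower second one.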
The second objective in GSEMO is typically formulated to promote solutions that are “further” from being infeasible. The intuition is that for those solutions, there is more room for feasible modification, thus having more potential of becoming very good solutions. For the problem of interest, one way of measuring “distance to infeasibility” for some solution is counting the number of elements in that can be added to before it is infeasible. The value then would be , which is the same as in practice. Another way is counting the minimum number of elements in that need to be added to before it is infeasible. The value would then be . The former approach is chosen for simplicity and viability under weaker assumption about the oracle.
On the other hand, the first objective aims to present the canonical evolutionary pressure based on objective values. Additionally, also discourages all infeasible solutions, which is different from the formulation in that allows some degree of infeasibility. This is because for , there can be some infeasible solution where . If is very high, it can dominate many good feasible solutions, and may prevent acceptance of globally optimal solutions into the population. Furthermore, restricting to only feasible solutions decreases the maximum population size, which can improve convergence performance. It is clear that the population size is at most . These formulations of the two objective functions are identical to the ones in when .
In practice, set solutions are represented in GSEMO as binary sequences $x \in \{0,1\}^n$, where, with $V = \{v_1, \dots, v_n\}$, the following bijective mapping is implicitly assumed: $v_i \in X \iff x_i = 1$.
This representation of sets is useful in evolutionary algorithms since genetic bit operators are compatible. GSEMO operates on the bit sequences, and the fitness function is effectively a pseudo-Boolean function. It starts with an initial population of a single empty solution. In each iteration, a new solution is generated by random parent selection and bit flip mutation. Then the elitist survival selection mechanism removes dominated solutions from the population, effectively maintaining a set of known Pareto-optimal solutions. The algorithm terminates when the number of iterations reaches some predetermined limit. The procedure is described in Algorithm 1. We choose the empty set as the initial solution, similar to and different from , to simplify the analysis and stabilize theoretical performance. Note that GSEMO calls the oracle once per iteration to evaluate a new solution, so its run time is identical to the number of iterations.
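Since Algorithm 1 is not reproduced here, the following is only a sketch of how the loop described above might look. It assumes standard bit-wise mutation with rate 1/n, rejects infeasible offspring outright, uses the negated solution size as the second objective, and accepts an offspring unless some population member strictly dominates it; all names and the toy instance are hypothetical.

```python
import random


def gsemo(f, n, block_of, limits, iterations, rng=None):
    """Sketch of GSEMO on bit tuples; returns {bit_tuple: (f_value, -size)}."""
    rng = rng or random.Random()

    def evaluate(x):
        counts, chosen = {}, []
        for i, bit in enumerate(x):
            if bit:
                b = block_of[i]
                counts[b] = counts.get(b, 0) + 1
                if counts[b] > limits[b]:
                    return None  # infeasible: never enters the population
                chosen.append(i)
        return (f(frozenset(chosen)), -len(chosen))

    def weakly_dominates(a, b):
        return a[0] >= b[0] and a[1] >= b[1]

    empty = (0,) * n
    population = {empty: evaluate(empty)}
    for _ in range(iterations):
        parent = rng.choice(list(population))  # uniform parent selection
        # Flip each bit independently with probability 1/n.
        child = tuple(bit ^ (rng.random() < 1.0 / n) for bit in parent)
        obj = evaluate(child)
        if obj is None:
            continue
        # Reject the child if an existing solution strictly dominates it.
        if any(weakly_dominates(o, obj) and o != obj
               for o in population.values()):
            continue
        # Remove everything the child weakly dominates, then add the child.
        population = {x: o for x, o in population.items()
                      if not weakly_dominates(obj, o)}
        population[child] = obj
    return population


# Hypothetical example: unweighted cut function on the path 0-1-2-3.
EDGES = [(0, 1), (1, 2), (2, 3)]


def cut(X):
    return sum(1 for a, b in EDGES if (a in X) != (b in X))


pop = gsemo(cut, 4, {0: 0, 1: 0, 2: 1, 3: 1}, {0: 1, 1: 1},
            iterations=2000, rng=random.Random(0))
best_x, (best_value, _) = max(pop.items(), key=lambda item: item[1][0])
print(best_x, best_value)
```

Under these acceptance rules the population keeps at most one solution per solution size, and the empty solution is never removed.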
3 Approximation guarantees
We derive an approximation guarantee for GSEMO on maximizing a general submodular function under partition matroid constraints. According to the analysis for GREEDY , we can assume there are “dummy” elements with zero marginal contribution. For all feasible solutions where , let be the feasible greedy addition to , we can derive the following result from Lemma 2 in , using .
Lemma 1.
Let be a submodular function and be defined in Definition 2, for all feasible solutions such that
With this lemma, we can prove the following result where denotes the population at iteration .
Theorem 1. For the problem of maximizing a submodular function under partition matroid constraints, GSEMO generates in expected run time a solution such that
Proof. Let be a statement , and , it is clear that holds, so and is well-defined for all since the empty solution is never dominated.
Assuming at some , let such that holds. If is not dominated and removed from , then . Otherwise, there must be some such that and . This implies , so . Therefore, is never decreased as progresses. Let , Lemma 1 implies
The second inequality uses . The probability that GSEMO selects is at least , and the probability of generating by mutating is at least . Furthermore, holds as shown, so if . Since and are chosen arbitrarily, this means
Therefore, the Additive Drift Theorem  implies the expected number of iterations for to reach from is at most . When , must contain a feasible solution such that
Therefore, GSEMO generates such a solution in expected run time at most . ∎
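For reference, a common textbook form of the additive drift upper bound used in the proof above is sketched below. The symbols $X_t$, $S$, $\delta$ and $T$ are generic notation chosen for this statement and are not the ones used in the proof.

```latex
% Additive drift, upper bound (generic notation, stated as a reference sketch).
% (X_t)_{t \ge 0}: non-negative random variables over a finite state space S containing 0.
% T = \min\{ t \ge 0 : X_t = 0 \} is the first hitting time of 0.
\[
  \text{If } \exists\, \delta > 0 \text{ such that }
  \mathbb{E}\left[ X_t - X_{t+1} \mid X_t = s \right] \ge \delta
  \quad \forall s \in S \setminus \{0\},\ \forall t \ge 0,
  \qquad \text{then} \qquad
  \mathbb{E}\left[ T \mid X_0 \right] \le \frac{X_0}{\delta}.
\]
```

In words: if the distance to the target shrinks by at least $\delta$ in expectation at every step, the target is reached within the initial distance divided by $\delta$ steps in expectation.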
In the case of a single cardinality constraint ($k = 1$), this approximation guarantee is at least as tight as the one for GSEMO-C in . If monotonicity of $f$ is further assumed, the result is equivalent to the one for GSEMO in . Additionally, the presence of suggests that the non-monotonicity of $f$ does not necessarily worsen the approximation guarantee when negative marginal gains are absent from all of GSEMO’s possible solutions (i.e., objective values cannot be decreased by adding an element).
As an extension beyond submodularity instead of monotonicity, we provide another proof of the approximation guarantee for GSEMO on the problems of maximizing monotone functions under the same constraints. Without loss of generality, we can assume that $f$ is normalized, meaning $f(\emptyset) = 0$. We make use of the following inequality, derived from Lemma 1 in .
Lemma 2. Let be a monotone function and be defined in Definition 3, for all feasible solutions such that
Using this lemma, we similarly prove the following result.
Theorem 2. For the problem of maximizing a monotone function under partition matroid constraints, GSEMO with expected run time generates a solution such that
Proof. Let be a statement , and , it is clear that holds, so and is well-defined for all since the empty solution is never dominated.
Assuming at some , there must be such that holds. If is not dominated and removed from , then . Otherwise, there must be some such that and . This implies , so . Therefore, is never decreased as progresses. Let , Lemma 2 implies
The second inequality uses . The probability that GSEMO selects is at least , and the probability of generating by mutating is at least . Furthermore, holds as shown, so . Since and are chosen arbitrarily, this means
Therefore, according to the Additive Drift Theorem , the expected number of iterations for to reach from is at most . When , must contain a feasible solution such that
Therefore, GSEMO generates such a solution in expected run time at most . ∎
Compared to the results in , it is reasonable to assume that restricting GSEMO’s population to only allow feasible solutions improves worst-case guarantees. However, it also eliminates the possibility of efficient improvement by modifying infeasible solutions that are very close to very good feasible ones. This might reduce its capacity to overcome local optima.
4 Experimental investigations
We compare GSEMO and GREEDY on symmetric submodular cut maximization problems with randomly generated graphs under varying settings. The experiments are separated into two groups: cardinality constraints ($k = 1$) and general partition matroid constraints ($k > 1$).
4.1 Max Cut problems setup
Weighted graphs are generated for the experiments based on two parameters: the number of vertices (which is $n$) and the density. There are 3 values for $n$: 50, 100, 200. There are 5 density values: 0.01, 0.02, 0.05, 0.1, 0.2. For each $n$-density pair, 30 different weighted graphs – each denoted as – are generated with the following procedure:
Randomly sample from without replacement, until .
Assign to a uniformly random value in for each .
Assign for all .
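The generation procedure and the resulting cut objective can be sketched as follows. Several details are assumptions made only for this illustration, since they are not recoverable from the text above: edges are treated as unordered vertex pairs, the edge count is taken as the density times the number of possible pairs (rounded), and weights are drawn uniformly from [0, 1). The function names are hypothetical.

```python
import random
from itertools import combinations


def random_weighted_graph(n, density, rng):
    """Sample edges without replacement and give each a uniform random weight.

    Returns {(a, b): weight} with a < b; absent pairs implicitly have weight 0.
    """
    pairs = list(combinations(range(n), 2))
    m = round(density * len(pairs))  # assumed reading of "density"
    edges = rng.sample(pairs, m)     # sampling without replacement
    return {edge: rng.random() for edge in edges}


def cut_value(weights, X):
    """Total weight of edges with exactly one endpoint in X."""
    return sum(w for (a, b), w in weights.items() if (a in X) != (b in X))


rng = random.Random(0)
n = 20
graph = random_weighted_graph(n, 0.2, rng)
V = set(range(n))
X = set(range(0, n, 2))

print(len(graph))                                    # number of sampled edges
print(cut_value(graph, X) == cut_value(graph, V - X))  # symmetry of undirected cuts
print(cut_value(graph, V))                           # the full vertex set cuts nothing
```

The last two lines illustrate why the objective is symmetric and non-monotone: a set and its complement cut the same edges, and adding every vertex brings the value back to zero.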
Each graph is then paired with different sets of constraints, and each pairing constitutes a problem instance. This enables observations of changes in outputs on the same graphs under varying constraints. For cardinality constraints, , rounded to the nearest integer. Thus, there are 30 problem instances per -density- triplet. For partition matroid constraints, the numbers of partitions are . The partitions are of the same size, and each element is randomly assigned to a partition. The thresholds are all set to since the objective functions are symmetric. Likewise, there are 30 problem instances per -density- triplet.
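One way to obtain equal-sized blocks with random membership, as described above, is to shuffle the elements and cut the shuffled order into consecutive chunks. This is a sketch under the assumption that the number of blocks divides the number of elements; the function name and the example values of `n` and `k` are hypothetical.

```python
import random


def random_equal_partition(n, k, rng):
    """Randomly split elements 0..n-1 into k blocks of equal size.

    Returns a dict mapping each element to its block index.
    """
    if n % k != 0:
        raise ValueError("k must divide n for equal-sized blocks")
    order = list(range(n))
    rng.shuffle(order)
    size = n // k
    return {v: position // size for position, v in enumerate(order)}


block_of = random_equal_partition(50, 5, random.Random(1))
block_sizes = [list(block_of.values()).count(b) for b in range(5)]
print(block_sizes)  # [10, 10, 10, 10, 10]
```

The returned mapping has the same shape as the block assignment a feasibility check would need, so it can be paired with any per-block thresholds.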
GSEMO is run on each instance 30 times, and the minimum, mean and maximum results are denoted by GSEMO, GSEMO and GSEMO, respectively. The GREEDY algorithm is run until satisfying its stopping condition, while GSEMO is run for iterations. Their final achieved objective values are then recorded and analyzed. Note that the run time budget for GSEMO in every setting is smaller than the theoretical bound on average run time in Theorem 1, except for where it is only slightly larger.
4.2 Cut maximization under a cardinality constraint
The experimental results for cardinality constraint cases are shown in Table 1. Signed-rank U-tests  are applied to the outputs, with pairing based on instances. Furthermore, we count the numbers of instances where GSEMO outperforms, ties with, and is outperformed by GREEDY via separate U-tests on individual instances.
Overall, GSEMO on average outperforms GREEDY with statistical significance in most cases. The significance seems to increase, with some noise, with increasing , density, and . This indicates that GSEMO more reliably produces better solutions than GREEDY as the graph’s size and density increase. Moreover, in a few cases with large , GSEMO is higher than GREEDY’s with statistical significance.
Additionally, the mean GSEMO results tend to be closer to the maximum than to the minimum. This suggests a skewed distribution of outputs in each instance toward high values. The implication is that GSEMO is more likely to produce outputs greater than average than otherwise. It might be an indication that these results are close to the global optima for these instances.
Per-instance analyses show a high number of ties between GSEMO and GREEDY for small , density, and to a lesser extent . As these increase, the number of GSEMO’s wins increases and ends up dominating at . This trend coincides with earlier observations, and suggests the difficulty GSEMO faces in making improvements in sparse graphs, where the GREEDY heuristic seems more suited. On the other hand, large graph sizes seem to favour GSEMO over GREEDY despite high sparsity, likely due to more local optima being present in larger search spaces.
4.3 Cut maximization under partition matroid constraints
Overall, the main trend in cardinality constraint cases is present: GSEMO on average outperforms GREEDY, with increased reliability at larger and density. This can be observed in both the average performances and the frequency at which GSEMO beats GREEDY. It seems the effect of this phenomenon is less influenced by variations in than it is by variations in in cardinality constraint cases. Note that the analysis in Theorem 1 only considers bottom-up improvements up to . Experimental results likely suggest GSEMO can make similar improvements beyond that point up to .
Additionally, the outputs of both algorithms generally decrease with increased , due to restricted feasible solution spaces. There are a few exceptions, which could be attributed to noise from random partitioning since they occur in both GREEDY’s and GSEMO’s simultaneously. Furthermore, the variation in seems to slightly affect the gaps between GSEMO’s and GREEDY’s, which are somewhat smaller at higher . This seems to support the notion that GREEDY performs well in small search spaces while GSEMO excels in large search spaces. This coincides with the observations in the cardinality constraint cases, and explains the main trend.
5 Conclusion
In this work, we consider the problem of maximizing a set function under partition matroid constraints, and analyze GSEMO’s approximation performance on such problems. Theoretical performance guarantees are derived for GSEMO in the cases of submodular objective functions and monotone objective functions. We show that GSEMO guarantees good approximation quality within polynomial expected run time in both cases. Additionally, experiments with Max Cut instances generated from varying settings have been conducted to gain insight into its empirical performance, based on a comparison against the simple GREEDY algorithm. The results show that GSEMO generally outperforms GREEDY within quadratic run time, particularly when the feasible solution space is large.
The experiments were run using the HPC service provided by the University of Adelaide.
References
- (2011) Theory of randomized search heuristics: foundations and recent developments. World Scientific Publishing Co., Inc.
- (2008) Item pricing for revenue maximization. In Proceedings of the 9th ACM Conference on Electronic Commerce, EC ’08, New York, NY, USA, pp. 50–59.
- (2011-10) Welfare guarantees for combinatorial auctions with item bidding. In Proceedings of the 22nd Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’11, Philadelphia, PA, USA, pp. 700–709.
- (2017) Guarantees for greedy maximization of non-submodular functions with applications. In Proceedings of the 34th International Conference on Machine Learning - Volume 70, ICML’17, pp. 498–507.
- (2019) Deterministic $(1/2+\epsilon)$-approximation for submodular maximization over a matroid. In Proceedings of the 30th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’19, Philadelphia, PA, USA, pp. 241–254.
- (2011-12) Maximizing a monotone submodular function subject to a matroid constraint. SIAM Journal on Computing 40 (6), pp. 1740–1766.
- (2004) Maximum coverage problem with group budget constraints and applications. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, K. Jansen, S. Khanna, J. D. P. Rolim, and D. Ron (Eds.), Berlin, Heidelberg, pp. 72–83.
- (1984) Submodular set functions, matroids and the greedy algorithm: tight worst-case bounds and some generalizations of the Rado-Edmonds theorem. Discrete Applied Mathematics 7 (3), pp. 251–274.
- (2009) Nonparametric Statistics for Non-Statisticians: A Step-by-Step Approach. Wiley.
- (1977-04) Location of bank accounts to optimize float: an analytic study of exact and approximate algorithms. Management Science 23 (8), pp. 789–810.
- (2011) Submodular meets spectral: greedy algorithms for subset selection, sparse approximation and dictionary selection. In Proceedings of the 28th International Conference on International Conference on Machine Learning, ICML’11, Madison, WI, USA, pp. 1057–1064.
- (2020) Theory of evolutionary computation – recent developments in discrete optimization. Natural Computing Series, Springer.
- (2011-07) Maximizing non-monotone submodular functions. SIAM Journal on Computing 40 (4), pp. 1133–1153.
- (2006-01) Tight approximation algorithms for maximum general assignment problems. In Proceedings of the 17th Annual ACM-SIAM Symposium on Discrete Algorithm, SODA ’06, Philadelphia, PA, USA, pp. 611–620.
- Greedy maximization of functions with bounded curvature under partition matroid constraints. Proceedings of the AAAI Conference on Artificial Intelligence 33, pp. 2272–2279.
- (2010-12) Approximating covering problems by randomized search heuristics using multi-objective models. Evolutionary Computation 18 (4), pp. 617–633.
- (2015-12) Maximizing submodular functions under matroid constraints by evolutionary algorithms. Evolutionary Computation 23 (4), pp. 543–558.
- (1995-11) Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming. Journal of the ACM 42 (6), pp. 1115–1145.
- (2001-07) A combinatorial strongly polynomial algorithm for minimizing submodular functions. Journal of the ACM 48 (4), pp. 761–777.
- (2013) Analyzing evolutionary algorithms - the computer science perspective. Natural Computing Series, Springer.
- (2011-06) Submodularity beyond submodular energies: coupling edges in graph cuts. In CVPR 2011, CVPR ’11, pp. 1897–1904.
- (2003) Maximizing the spread of influence through a social network. In Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’03, New York, NY, USA, pp. 137–146.
- (2011-07) Submodularity and its applications in optimized information gathering. ACM Transactions on Intelligent Systems and Technology 2 (4), pp. 32:1–32:20.
- (2008-02) Near-optimal sensor placements in Gaussian processes: theory, efficient algorithms and empirical studies. Journal of Machine Learning Research 9, pp. 235–284.
- (2004-04) Running time analysis of multiobjective evolutionary algorithms on pseudo-Boolean functions. IEEE Transactions on Evolutionary Computation 8 (2), pp. 170–182.
- Non-monotone submodular maximization under matroid and knapsack constraints. Proceedings of the 41st Annual ACM Symposium on Theory of Computing, STOC ’09, New York, NY, USA, pp. 323–332.
- (2017) Drift analysis. CoRR abs/1712.00964.
- Multi-document summarization via budgeted maximization of submodular functions. In Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, HLT ’10, USA, pp. 912–920.
- (2011-06) A class of submodular functions for document summarization. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, HLT ’11, Portland, Oregon, USA, pp. 510–520.
- Submodular feature selection for high-dimensional acoustic score spaces. In 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 7184–7188.
- (1983) Submodular functions and convexity. In Mathematical Programming The State of the Art: Bonn 1982, A. Bachem, B. Korte, and M. Grötschel (Eds.), pp. 235–257.
- (1978-08) Best algorithms for approximating the maximum of a submodular set function. Mathematics of Operations Research 3 (3), pp. 177–188.
- (2010) Bioinspired computation in combinatorial optimization. Natural Computing Series, Springer.
- (2016) Parallel Pareto optimization for subset selection. In Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, IJCAI’16, pp. 1939–1945.
- (2017-08) On subset selection with general cost constraints. In Proceedings of the 26th International Joint Conference on Artificial Intelligence, IJCAI-17, pp. 2613–2619.
- (2019) Maximizing submodular or monotone approximately submodular functions by multi-objective evolutionary algorithms. Artificial Intelligence 275, pp. 279–294.
- (1995) A combinatorial algorithm for minimizing symmetric submodular functions. In Proceedings of the 6th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’95, USA, pp. 98–101.
- (2010-10) Efficient minimization of decomposable submodular functions. In Proceedings of the 23rd International Conference on Neural Information Processing Systems - Volume 2, NIPS ’10, USA, pp. 2208–2216.
- (2013) Composable and efficient mechanisms. In Proceedings of the 45th Annual ACM Symposium on Theory of Computing, STOC ’13, New York, NY, USA, pp. 211–220.
- (2010-01) Submodularity and curvature: the optimal algorithm. RIMS Kôkyûroku Bessatsu B23, pp. 253–266.
- Submodularity in data subset selection and active learning. In Proceedings of the 32nd International Conference on International Conference on Machine Learning - Volume 37, ICML ’15, pp. 1954–1963.
- (2019) Evolutionary learning: advances in theories and algorithms. Springer.